CN100466725C - Multimedia communication method and terminal thereof - Google Patents

Info

Publication number
CN100466725C
Authority
CN
China
Prior art keywords
error
data
real time
information
transport protocol
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2006100690163A
Other languages
Chinese (zh)
Other versions
CN1863302A (en)
Inventor
罗忠
宋彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wang Miaomiao
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CNB2006100690163A
Priority to PCT/CN2006/002961 (WO2007051425A1)
Publication of CN1863302A
Application granted
Publication of CN100466725C
Expired - Fee Related
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/65 Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0001 Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0023 Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the signalling
    • H04L1/0025 Transmission of mode-switching indication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0001 Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0023 Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the signalling
    • H04L1/0028 Formatting
    • H04L1/0029 Reduction of the amount of signalling, e.g. retention of useful signalling or differential signalling
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0078 Avoidance of errors by organising the transmitted data in a format specifically designed to deal with errors, e.g. location
    • H04L1/0079 Formats for control data

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to multimedia communication technology and discloses a multimedia communication method and the design and implementation of a terminal for it, which achieve unequal protection and make QoS guarantees easier to implement. Building on the existing RTP protocol, the invention uses ERRTP to provide a transport-layer encapsulation format that can carry information about the error-resilience coding scheme, so that multimedia data transmitted over ERRTP is marked with its corresponding error-resilience coding scheme information, thereby integrating the error-resilience mechanism into the transport layer. For the H.264 NALU structure, a dedicated ERRTP encapsulation method and a modification of the protocol header are given, which merge the header bytes of all NALUs in the same ERRTP packet into the ERRTP header, so that the ERRTP header reflects the important information of the NALUs and transmission efficiency improves.

Figure 200510110013

Description

Multimedia Communication Method and Terminal

Technical Field

The invention relates to multimedia communication technology, and in particular to multimedia communication technology that supports error resilience.

Background

With the rapid development of the Internet and mobile communication networks, streaming media technology is used ever more widely, from online broadcasting and movie playback to distance learning and online news sites. Video and audio are currently delivered over the network in two main ways: download and streaming. Streaming transmits the video/audio signal continuously; while the media plays on the client, the remainder continues to download in the background. Streaming comes in two forms: progressive streaming and real-time streaming. Real-time streaming delivers in real time and is especially suited to live events. It must match the connection bandwidth, which means that image quality degrades when network speed drops, so as to reduce the demand on transmission bandwidth. "Real time" means that, within an application, the delivery of data must keep a precise timing relationship with the generation of that data.

In particular, with the emergence of third-generation mobile communication systems (3rd Generation, "3G") and the rapid development of networks generally based on the Internet Protocol ("IP"), video communication is gradually becoming one of the main communication services. Two-party and multi-party video communication services, such as videophone, videoconferencing, and mobile terminal multimedia services, place strict demands on the transmission of multimedia data streams and on quality of service. They require not only better real-time network transmission but, equivalently, more efficient video compression coding.

Given these demands on media communication, the International Telecommunication Union Telecommunication Standardization Sector ("ITU-T"), after formulating video compression standards such as H.261, H.263, and H.263+, formally released the H.264 standard in 2003. H.264 is an efficient compression coding standard developed jointly by ITU-T and the Moving Picture Experts Group ("MPEG") of the International Organization for Standardization ("ISO") to meet the needs of network media transmission and communication in the new stage. It is also the main content of Part 10 of the MPEG-4 standard.

The purpose of the H.264 standard is to improve video coding efficiency and its adaptability to networks more effectively. Because of its advantages, the H.264 video compression coding standard has quickly become the mainstream standard in multimedia communication. A large number of real-time multimedia communication products that use H.264 (such as videoconferencing systems, videophones, and 3G mobile terminals) and network streaming products have appeared one after another, and H.264 support has become a key factor in product competitiveness in this market. It can be expected that, with the formal publication and wide adoption of H.264, multimedia communication over IP networks and over 3G and post-3G wireless networks will enter a new stage of rapid development.

The message structure and sending mechanism of H.264 are briefly introduced below. H.264 adopts a layered design that defines a Video Coding Layer ("VCL") and a Network Abstraction Layer ("NAL"). The latter is designed specifically for network transmission, adapts to video transmission over different networks, and further improves network "friendliness". H.264 introduces a packet-oriented coding mechanism for IP, which facilitates packet transmission and supports streaming of video over the network. It has strong error resilience and is particularly suited to wireless video transmission with high packet loss and severe interference. All H.264 data to be transmitted, including image data and other messages, is encapsulated in packets of a uniform format, namely NAL units ("NALU"). Each NALU is a variable-length byte string with a defined syntax, consisting of a one-byte header, which indicates the data type, and an integer number of payload bytes. A NAL unit can carry a coded slice, a data partition of one of the defined types, or a sequence or picture parameter set. To improve data reliability, each picture is divided into several slices, each slice is carried by one NALU, and each slice consists of several smaller macroblocks, the macroblock being the smallest processing unit. In general, slices at corresponding positions in successive frames are related to each other, while slices at different positions are independent, which prevents bit errors from propagating between slices.
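The one-byte NALU header mentioned above can be unpacked with a few bit operations. The following is a minimal sketch, assuming the H.264 layout of 1 bit forbidden_zero_bit, 2 bits nal_ref_idc, and 5 bits nal_unit_type; the names `NalHeader` and `parse_nal_header` are illustrative and do not come from the patent.

```python
from typing import NamedTuple


class NalHeader(NamedTuple):
    """Fields of the one-byte H.264 NAL unit header (illustrative container)."""
    forbidden_zero_bit: int  # 1 bit, expected to be 0 in a valid stream
    nal_ref_idc: int         # 2 bits, 0 means "not used for reference"
    nal_unit_type: int       # 5 bits, identifies the kind of data carried


def parse_nal_header(first_byte: int) -> NalHeader:
    """Split the first byte of a NALU into its three header fields."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("NAL header must be a single byte value")
    return NalHeader(
        forbidden_zero_bit=(first_byte >> 7) & 0x01,
        nal_ref_idc=(first_byte >> 5) & 0x03,
        nal_unit_type=first_byte & 0x1F,
    )


# 0x67 is a typical first byte of a sequence parameter set NALU.
header = parse_nal_header(0x67)
```

Here `header.nal_unit_type` is 7 (sequence parameter set) and `header.nal_ref_idc` is 3, the highest importance level.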

H.264 data includes texture data of non-reference frames, sequence parameters, picture parameters, Supplemental Enhancement Information ("SEI"), texture data of reference frames, and so on. "SEI message" is the collective name for messages that assist in decoding, display, and other aspects of H.264 video. The prior art defines various SEI messages and also keeps reserved SEI messages, leaving room for future applications. According to H.264, SEI messages are not required to reconstruct the luma and chroma pictures in the decoding process, and a decoder conforming to H.264 need not process SEI at all. In other words, not every terminal that meets the basic H.264 requirements can process SEI messages, but sending SEI to such a terminal has no effect: it simply ignores the SEI messages it cannot process. Following the SEI syntax rules, users can use reserved messages to carry custom messages and so extend functionality.

The structure and syntax of SEI messages are introduced first. H.264 provides several mechanisms for message extension, one of which is SEI. The SEI data area defined in H.264 is independent of the coded video data, and its use is given in the description of the NAL in the H.264 specification. The basic unit of an H.264 bitstream is the NALU, which can carry the various H.264 data types, such as sequence parameters, picture parameters, slice data (the actual picture data), and SEI message data. SEI is used to convey various messages and supports message extension. The SEI field can therefore carry messages customized for a particular purpose without affecting the compatibility of H.264-based video communication systems. A NALU carrying SEI messages is called an SEI NALU. An SEI NALU contains one or more SEI messages. Each SEI message contains some variables, mainly the payload type (payloadType) and payload size (payloadSize), which indicate the type and size of the message payload. The syntax and semantics of some commonly used H.264 SEI messages are defined in H.264 Annex D.8 and D.9. The payload contained in a NALU is called the Raw Byte Sequence Payload ("RBSP"), and SEI is one type of RBSP.

The SEI data area is called the SEI field for short. Each SEI field contains one or more SEI messages, and each SEI message consists of an SEI header and an SEI payload. The SEI header has two fields: one gives the type of the payload in the SEI message and the other gives its size. A payload type from 0 to 254 is encoded in one byte, 0x00 to 0xFE; a type from 255 to 509 is encoded in two bytes, 0xFF00 to 0xFFFE; larger types follow the same pattern, so users can define arbitrarily many payload types. Types 0 to 18 are already defined in the standard for specific information such as buffering period and picture timing. The SEI field defined in H.264 can thus hold as much user-defined information as needed. An H.264 decoder that does not support parsing this user-defined information automatically discards the data in the SEI field. Recording useful custom information in the SEI field therefore does not affect the compatibility of H.264-based video communication systems.
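The variable-length coding of the SEI header fields can be sketched as follows. In the H.264 syntax, each leading 0xFF byte adds 255 to the value and the final byte (0x00 to 0xFE) adds the remainder; the same rule applies to payloadType and payloadSize. The function names below are illustrative, and the sketch ignores emulation prevention bytes and RBSP trailing bits, which a real bitstream writer would also have to handle.

```python
from typing import Tuple


def encode_sei_value(value: int) -> bytes:
    """Encode payloadType or payloadSize as 0xFF bytes plus one final byte."""
    if value < 0:
        raise ValueError("value must be non-negative")
    ff_count, last = divmod(value, 255)  # last is always in 0..254
    return b"\xff" * ff_count + bytes([last])


def decode_sei_value(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one value starting at pos; return (value, next position)."""
    value = 0
    while pos < len(data) and data[pos] == 0xFF:
        value += 255
        pos += 1
    if pos >= len(data):
        raise ValueError("truncated SEI header")
    return value + data[pos], pos + 1


def build_sei_message(payload_type: int, payload: bytes) -> bytes:
    """Concatenate the type field, the size field, and the payload."""
    return encode_sei_value(payload_type) + encode_sei_value(len(payload)) + payload


# A hypothetical user-defined type 300 needs two bytes: 0xFF then 0x2D (45).
message = build_sei_message(300, b"abc")
```

Because the final byte can never be 0xFF, a decoder always knows where each field ends without a separate length prefix.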

As noted above, multimedia communication requires not only efficient media compression coding but also real-time network transmission. Multimedia streams are at present carried almost entirely by the Real-time Transport Protocol ("RTP") and its control protocol, the RTP Control Protocol ("RTCP"). RTP is a transport protocol for multimedia data streams on the Internet, published by the Internet Engineering Task Force ("IETF"). RTP is defined to work in one-to-one or one-to-many transmission, and its purpose is to provide timing information and stream synchronization. RTP typically runs over the User Datagram Protocol ("UDP"), but it can also work over other protocols such as the Transmission Control Protocol ("TCP") or Asynchronous Transfer Mode ("ATM").

RTP itself only carries real-time data. It does not provide a reliable mechanism for in-order packet delivery, nor flow control or congestion control; it relies on RTCP for these services. RTCP manages transmission quality and exchanges control information between the participating application processes. During an RTP session, each participant periodically sends RTCP packets containing statistics such as the number of packets sent and the number of packets lost, so a server can use this information to change the transmission rate dynamically, or even change the payload type. Used together, RTP and RTCP optimize transmission efficiency with effective feedback and minimal overhead, and are therefore well suited to carrying real-time data over the network.
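How a sender might react to the loss statistics reported through RTCP can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the 2% and 10% thresholds, the scaling factors, and the bitrate bounds are hypothetical and are not specified by the patent or by RTCP itself.

```python
def loss_fraction(packets_expected: int, packets_received: int) -> float:
    """Fraction of expected packets that did not arrive (0.0 if none expected)."""
    if packets_expected <= 0:
        return 0.0
    lost = max(packets_expected - packets_received, 0)
    return lost / packets_expected


def adapt_bitrate(current_kbps: float, fraction_lost: float,
                  low: float = 0.02, high: float = 0.10,
                  floor_kbps: float = 64.0, ceiling_kbps: float = 2000.0) -> float:
    """Pick a new sending rate from the reported loss (hypothetical policy)."""
    if fraction_lost > high:
        target = current_kbps * 0.75   # heavy loss: back off
    elif fraction_lost < low:
        target = current_kbps * 1.05   # little loss: probe upward
    else:
        target = current_kbps          # moderate loss: hold
    return min(max(target, floor_kbps), ceiling_kbps)


# 100 packets expected and 80 received gives 20% loss, so the rate is cut.
new_rate = adapt_bitrate(1000.0, loss_fraction(100, 80))
```

The sketch only shows the feedback loop in principle; a deployed sender would smooth the measurements over several reports before changing its rate.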

H.264 multimedia data carried over IP networks is no exception: it too is based on UDP and, above it, RTP. Structurally, RTP applies to different media data types, but for each higher-layer protocol or media compression coding standard used in multimedia communication (such as H.261, H.263, MPEG-1/-2/-4, MP3), the IETF publishes a specification of the RTP payload format for that protocol, which defines in detail an RTP encapsulation optimized for it. For H.264 the corresponding IETF standard is RFC 3984, "RTP Payload Format for H.264 Video". It is currently the main standard for carrying H.264 video streams over IP networks and is widely used. In video communication, the products of all mainstream vendors are based on RFC 3984, which is at present the only H.264/RTP transport method.

In fact, the key difference between H.264 and earlier video compression coding standards is that H.264 defines a new layer, the Network Abstraction Layer ("NAL"). To separate its Video Coding Layer ("VCL") from the specific network transport layer below it and make the two independent, which gives greater application flexibility, H.264 defines the NAL, a layer absent from earlier ITU-T video compression coding standards such as H.261 and H.263/H.263+/H.263++. How to design a more efficient scheme for carrying NAL data over RTP that exploits the strengths of H.264, so that RTP carries H.264 better, is a problem well worth studying and of considerable practical value.

The method proposed in RFC 3984 for carrying H.264 NAL data over RTP is the current mainstream transport method. Building on RTP (RFC 3550), it encapsulates NAL data in the RTP payload. The NAL sits between the VCL and RTP and specifies that the video bitstream be divided, according to defined rules and structure, into a series of NAL units ("NALU"). RFC 3984 defines the format for encapsulating NALUs in the RTP payload. The RTP frame format and the prior-art NALU encapsulation method are briefly introduced in turn below.

RTP was designed mainly for real-time multimedia conferencing, continuous data storage, interactive distributed simulation, and control and measurement applications. RTP is usually carried over UDP to make use of its multiplexing and checksum functions. If the underlying network provides multicast distribution, RTP supports transmission to multiple destinations. The functions RTP provides include payload type identification, sequence numbering, timestamping, and delivery monitoring.

The RTP packet format is as follows. The basic RTP header occupies 12 bytes (the minimum), while the IP and UDP headers occupy 20 bytes and 8 bytes respectively. When an RTP packet is encapsulated in a UDP packet, which is in turn encapsulated in an IP packet, the headers therefore total 12 + 8 + 20 = 40 bytes. The detailed structure of the RTP header is shown in Figure 1.
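The header arithmetic above can be turned into a quick overhead estimate. This is a minimal sketch: the three header sizes come from the text, while the function name, the 4 bytes per CSRC entry taken from the 32-bit CSRC size, and the assumption of an IPv4 header without options are added for illustration.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options (assumption)
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12   # fixed RTP header, no CSRC list or extension


def header_overhead(payload_bytes: int, csrc_count: int = 0):
    """Return (total header bytes, header share of the whole packet)."""
    if payload_bytes < 0 or not 0 <= csrc_count <= 15:
        raise ValueError("invalid payload size or CSRC count")
    headers = (IPV4_HEADER_BYTES + UDP_HEADER_BYTES
               + RTP_HEADER_BYTES + 4 * csrc_count)
    return headers, headers / (headers + payload_bytes)


# A 160-byte payload carries 40 bytes of headers, one fifth of the packet.
total_headers, share = header_overhead(160)
```

The smaller the payload, the larger the share of each packet spent on headers, which is why packing several NALUs into one packet can improve transmission efficiency.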

From front to back, the RTP header shown in Figure 1 consists of: the 1st byte (byte 0), which holds fields describing the header structure itself; the 2nd byte (byte 1), which holds the marker bit and the payload type; the 3rd and 4th bytes (bytes 2 and 3), the sequence number; the 5th to 8th bytes, the timestamp; the 9th to 12th bytes, the synchronization source identifier ("SSRC ID"); and finally the list of contributing source identifiers ("CSRC IDs"), whose number varies. Note that in this description the 1st byte is labeled byte 0, and so on.

The first 12 bytes appear in every RTP packet, whereas other header data, such as the contributing source identifiers, is present only when inserted by a mixer. The CSRC is therefore generally used when media is mixed: in a multi-party conference, for example, audio needs to be mixed, and video can use the same method to provide a multi-picture display. The synchronization source identifier, SSRC, is in effect the identifier of the media stream being carried.

The meaning and full name of each field are as follows:

The V field is the version (Version), 2 bits. The version currently in use is 2, so V = 2. V = 1 denotes an earlier version of RTP, and V = 0 denotes the original predecessor of RTP, used in the voice-over-IP (VoIP) communication system of the early Mbone network, which later evolved into RTP. V = 3 is not yet defined.

The P field is the padding flag (Padding), 1 bit. If P is set, the packet ends with one or more padding bytes that are not part of the payload.

The X field is the extension flag (Extension), 1 bit. If X is set, the RTP header must be followed by one variable-length header extension (after the CSRC list, if there is one). The extension is reserved mainly for application environments in which the standard header fields are insufficient. It contains a 16-bit length field that counts the number of 32-bit words in the extension. The first 16 bits of the extension are left open for distinguishing identifiers or parameters, and their format is defined by the specific profile. The header extension format is described in detail in Section 5.3.1 of RFC 3550.

The CC field is the CSRC count (CSRC Count), 4 bits. It gives the number of CSRC identifiers at the end of the header, from which the receiver can determine the length of the CSRC ID list.

The M field is the marker bit (Marker), 1 bit. Its interpretation is defined by a profile (Profile) and allows significant events in the packet stream to be marked. A profile may define additional marker bits or specify that there is no marker bit. A profile here means the settings of a specific application environment, agreed between the communicating parties and not fixed by the protocol.

The PT field is the payload type (Payload Type, "PT"), 7 bits. It identifies the format of the RTP payload and determines its interpretation by the application. The marker bit and the payload type together form one byte that carries profile-specific information, and a profile may redefine this byte to suit different needs. A specific application can define a profile, which is in effect a set of static correspondences (agreed in advance by the communicating parties) that map PT values to media formats. The relationship between PT values and media formats can also be negotiated dynamically through signaling outside RTP. The RTP source may change the PT during an RTP session.

The next field is the sequence number, 16 bits. It increases by one for each RTP packet sent, so the receiver can use it to detect packet loss and restore packet order. The initial value for a session can be chosen at random without affecting communication.

The timestamp is 32 bits. It reflects the sampling instant of the first byte in the RTP packet. The sampling instant must be derived from a clock that increases monotonically and linearly, and the receiver uses it to adjust media playout time or to synchronize.

The synchronization source identifier (SSRC ID) is 32 bits. Its value can be chosen at random but must be unique within an RTP session, so that it uniquely identifies a media source. If a source changes its source transport address, it must choose a new SSRC identifier.

The CSRC list contains 0 to 15 items as needed, each 32 bits. The length of the list, that is, the number of CSRC IDs, is given by the 4 bits of the CC field. The CSRC identifier that identifies a media source is the same as the SSRC identifier of that contributing source; it is simply placed as SSRC or CSRC depending on its role at the receiver. In multi-party communication, the CSRC IDs are inserted by the mixer.
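The field layout described above can be summarized as a small parser. This is a minimal sketch assuming the RFC 3550 bit layout (V, P, X, CC in byte 0; M and PT in byte 1; all multi-byte fields in network byte order); the names `RtpHeader` and `parse_rtp_header` are illustrative, and the optional header extension is not parsed.

```python
import struct
from typing import List, NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int
    csrc_list: List[int]


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the fixed 12-byte RTP header and the CSRC list that follows it."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = byte0 & 0x0F
    end = 12 + 4 * csrc_count
    if len(packet) < end:
        raise ValueError("packet shorter than its CSRC list")
    csrc_list = list(struct.unpack("!%dI" % csrc_count, packet[12:end]))
    return RtpHeader(
        version=byte0 >> 6,
        padding=bool(byte0 & 0x20),
        extension=bool(byte0 & 0x10),
        csrc_count=csrc_count,
        marker=bool(byte1 & 0x80),
        payload_type=byte1 & 0x7F,
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
        csrc_list=csrc_list,
    )


# Build a sample header: V=2, CC=1, M=1, PT=96, one CSRC entry.
sample = struct.pack("!BBHII", 0x81, 0xE0, 1000, 123456, 0xDEADBEEF)
sample += struct.pack("!I", 42)
parsed = parse_rtp_header(sample)
```

Payload type 96 in the sample is taken from the dynamic range commonly used for formats negotiated by out-of-band signaling.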

When carrying H.264 video, RTP packs H.264 NALUs into a stream of RTP packets. RFC 3984 chiefly describes the NALU and, on that basis, gives the format for packing H.264 NAL data into RTP. This RTP encapsulation of NALUs is shown in Figure 2.

FIG. 2 shows how a NALU is encapsulated in the RTP payload. The first byte is the NALU header, followed by the NALU data. Multiple NALUs are placed end to end in the payload of the RTP packet. Optional RTP padding may follow at the end; this padding is defined by the RTP packet format and lets the packet length meet a specific requirement (for example, a fixed length). The optional padding data is generally filled with zeros.

The NALU header is the first byte, also called an octet. It has three fields, whose meanings and full names are as follows:

The F field is the forbidden bit (forbidden_zero_bit) and occupies 1 bit. It flags syntax errors and similar conditions and is set to 1 if there is a syntax violation. When the network detects a bit error in the unit, it may set this bit to 1 so that the receiver discards the unit. This is mainly intended for heterogeneous network environments (for example, combined wired and wireless environments);

The NRI field is the NAL reference identifier (nal_ref_idc) and occupies 2 bits. It indicates the importance of the NALU data. A value of 00 means that the content of the NALU is not used to reconstruct reference pictures for inter prediction. A non-zero value means that the current NALU carries important data, such as a slice of a reference frame, a Sequence Parameter Set (SPS), or a Picture Parameter Set (PPS). The larger the value, the more important the current NALU;

The Type field is the NALU type (nal_unit_type) and occupies 5 bits, allowing 32 NALU types. The correspondence between its values and the specific types is given in Table 1.

Table 1. Correspondence between Type field values and NALU content types in the NALU header

Type value    NALU content type
0             Unspecified
1             Coded slice of a non-IDR picture
2             Coded slice data partition A
3             Coded slice data partition B
4             Coded slice data partition C
5             Coded slice of an IDR picture
6             SEI (Supplemental Enhancement Information)
7             SPS (Sequence Parameter Set)
8             PPS (Picture Parameter Set)
9             Access unit delimiter
10            End of sequence
11            End of stream
12            Filler data
13-23         Reserved
24-31         Unspecified

The one-byte NALU header therefore mainly conveys the validity and the importance level of the NALU, and this information can be used to determine the importance of the data carried by RTP.
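The three header fields can be extracted with simple bit operations. The sketch below assumes the bit layout described above (F in the most significant bit, then NRI, then Type); the function name and the lookup table name are illustrative only.

```python
# Names for the Type values listed in Table 1 (ranges are handled in the function).
NALU_TYPE_NAMES = {
    0: "Unspecified",
    1: "Coded slice of a non-IDR picture",
    2: "Coded slice data partition A",
    3: "Coded slice data partition B",
    4: "Coded slice data partition C",
    5: "Coded slice of an IDR picture",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "Access unit delimiter",
    10: "End of sequence",
    11: "End of stream",
    12: "Filler data",
}

def parse_nalu_header(first_byte: int) -> dict:
    """Split the one-byte NALU header into F (1 bit), NRI (2 bits), Type (5 bits)."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("NALU header must be a single byte")
    nal_type = first_byte & 0x1F
    if nal_type in NALU_TYPE_NAMES:
        name = NALU_TYPE_NAMES[nal_type]
    elif nal_type <= 23:
        name = "Reserved"
    else:
        name = "Unspecified"
    return {
        "f": (first_byte >> 7) & 0x01,    # forbidden_zero_bit
        "nri": (first_byte >> 5) & 0x03,  # nal_ref_idc, 0 = not used for reference
        "type": nal_type,                 # nal_unit_type
        "type_name": name,
    }
```

For instance, a header byte of `0x67` decodes to F = 0, NRI = 3, Type = 7, that is, a sequence parameter set marked with the highest importance.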

In the current H.264/RTP multimedia communication framework, Quality of Service (QoS) monitoring, and the congestion control and flow control built on it, are performed mainly through the companion control protocol RTCP. RTCP is mainly used for control and reporting on RTP, and the reports mainly carry QoS-related information. RTCP reports periodically: control packets are sent periodically to all participants of a two-party or multiparty session, using the same distribution mechanism as the RTP data packets. The underlying protocol provides multiplexing of data and control packets (for example, by giving each its own UDP port number). RFC 3550 recommends that the session bandwidth added for RTCP be 5% of the media bandwidth.

The types and structure of RTCP packets are introduced below. RTCP defines the following packet types to carry different kinds of control information: Sender Report (SR), carrying transmission and reception statistics from active senders; Receiver Report (RR), carrying reception statistics from participants that are not active senders; Source Description (SDES) items, including the CNAME; BYE, indicating that a participant is ending (leaving) the session; and APP, for application-specific functions.

The packet structure of RTCP sender and receiver reports is shown in FIG. 3. By content type it consists of three sections: the header first, then the sender information, then the report blocks. These may be followed by a profile-specific extension (a profile being a set of specific rules defined for the needs of a particular application scenario). The meaning and full name of each field shown in FIG. 3 are described below:

The V field is the version (Version, "V") and occupies 2 bits; the current RTCP version is V=2;

The P field is the padding flag (Padding) and occupies 1 bit. If P is set, the RTCP packet contains additional padding bytes at the end that are not part of the control information; these bytes are counted in the length field;

The RC field is the reception report count (Reception Report Count, "RC") and occupies 5 bits. It gives the number of reception report blocks contained in the packet and may be 0;

The PT field is the packet type (Packet Type, "PT") and occupies 8 bits; the value 200 identifies an SR packet;

The Length field occupies 16 bits and equals the length of the RTCP packet in 32-bit words minus 1, including the header and any padding;

The sender SSRC occupies 32 bits and is the synchronization source identifier (Synchronization Source Identifier, "SSRC") of the originator of this SR packet. A synchronization source uniquely identifies one media data source, such as the source of one video stream;

The NTP timestamp field is the Network Time Protocol ("NTP") timestamp and occupies 64 bits. It indicates the wallclock time (absolute date and time) at which the report was sent and is used together with the RTP timestamp;

The RTP timestamp field occupies 32 bits and is the timestamp generated by RTP;

The sender's packet count field occupies 32 bits and indicates the total number of RTP data packets transmitted by the sender from the start of transmission until this SR packet was generated;

The sender's byte count field occupies 32 bits and indicates the total number of payload bytes (excluding header and padding) transmitted in RTP data packets by the sender from the start of transmission until this SR packet was generated. It can be used to estimate the average payload data rate;

The following fields contain zero or more reception report blocks. The number of blocks depends on the number of other sources heard by this sender since the last report. Each reception report block conveys statistics on the RTP packets received from a single synchronization source. These statistics include:

Fraction lost occupies 8 bits and indicates the fraction of RTP packets from that source lost since the previous report was sent; the cumulative number of packets lost occupies 24 bits and indicates the total number of packets lost since the start of reception;

Next come the extended highest sequence number received and the interarrival jitter, both of which reflect network transmission conditions;

Last SR ("LSR") occupies 32 bits and is the timestamp of the most recent SR report from that source, taken as the middle 32 bits of the NTP timestamp of that SR;

Delay since Last SR ("DLSR") occupies 32 bits and is the delay between receiving the last SR from that source and sending this reception report block. It is a key parameter for computing the QoS report.
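One way LSR and DLSR are used, following RFC 3550 rather than anything stated in this document, is to let the original SR sender estimate the round-trip time when a reception report arrives. The sketch below assumes that all three inputs are expressed in the same 32-bit format (units of 1/65536 second, the middle 32 bits of an NTP timestamp); the function name is illustrative.

```python
def round_trip_seconds(arrival: int, lsr: int, dlsr: int) -> float:
    """Estimate round-trip time from a reception report block.

    arrival: time the report block was received, middle 32 bits of NTP time
    lsr:     LSR field copied from the report block
    dlsr:    DLSR field copied from the report block
    All values are in units of 1/65536 second (assumption taken from RFC 3550).
    """
    # Modulo 2**32 arithmetic keeps the result correct across timestamp wrap-around.
    rtt_units = (arrival - lsr - dlsr) & 0xFFFFFFFF
    return rtt_units / 65536.0
```

With the figures used in the RFC 3550 example (arrival `0xB7108000`, LSR `0xB7052000`, DLSR `0x00054000`), the function returns 6.125 seconds.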

The receiver report (RR) packet format differs from the sender report (SR) in two ways: the packet type field has the value 201, and there is no sender information section.
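The SR and RR layouts described above can be sketched as a small parser. This is a minimal illustration that assumes the field order of RFC 3550 (8-byte header, 20-byte sender information for SR only, then 24-byte report blocks) and ignores any profile-specific extension; the function name and dictionary keys are not from the patent.

```python
import struct

def parse_rtcp_report(data: bytes) -> dict:
    """Parse an RTCP SR (PT=200) or RR (PT=201) packet into a dictionary."""
    b0, pt, length_words, ssrc = struct.unpack("!BBHI", data[:8])
    report = {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "rc": b0 & 0x1F,                      # number of report blocks
        "pt": pt,
        "length_bytes": (length_words + 1) * 4,
        "ssrc": ssrc,
        "blocks": [],
    }
    offset = 8
    if pt == 200:                             # SR carries the sender information
        ntp_sec, ntp_frac, rtp_ts, packets, octets = struct.unpack(
            "!IIIII", data[offset:offset + 20])
        report["sender_info"] = {
            "ntp_seconds": ntp_sec, "ntp_fraction": ntp_frac,
            "rtp_timestamp": rtp_ts,
            "packet_count": packets, "octet_count": octets,
        }
        offset += 20
    elif pt != 201:
        raise ValueError("not an SR or RR packet")
    for _ in range(report["rc"]):
        src, lost, ext_seq, jitter, lsr, dlsr = struct.unpack(
            "!IIIIII", data[offset:offset + 24])
        report["blocks"].append({
            "ssrc": src,
            "fraction_lost": lost >> 24,          # 8 bits
            "cumulative_lost": lost & 0xFFFFFF,   # 24 bits, kept as a raw value
            "extended_highest_seq": ext_seq,
            "jitter": jitter,
            "lsr": lsr,
            "dlsr": dlsr,
        })
        offset += 24
    return report
```

The parser raises `struct.error` if the buffer is shorter than the report count implies, which is usually the desired behaviour for malformed input.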

According to the RTP/RTCP standard, RTCP performs the following four functions:

The primary function is to provide a feedback mechanism on the quality of real-time multimedia data delivery. This is an integral part of RTP's role as a transport protocol, and the feedback is carried in the RTCP sender reports (SR) and receiver reports (RR);

RTCP carries a persistent transport-level identifier for each RTP source, called the canonical name (Canonical Name, "CNAME"). Because the SSRC identifier may change when a conflict is discovered or a program is restarted, receivers need the CNAME to keep track of each participant;

The first two functions require all participants to send RTCP packets, so the rate of RTCP packets must be controlled for RTP to scale up to a large number of participants;

The fourth function is optional: to convey minimal session control information.

In summary, QoS reports are transmitted with RTCP and the QoS information is reported with the content defined by RTCP, which is the basis for QoS monitoring of carried media such as H.264.

It should be pointed out, however, that while RTCP provides a QoS reporting mechanism, its periodic reporting adds network bandwidth overhead of up to 5%. If the network becomes congested and transmission QoS drops, the extra traffic generated by RTCP makes the problem worse.

Having described the H.264/RTP transmission structure and its RTCP-based QoS monitoring, the following briefly introduces error resilience for video transmission over networks and the related technical background.

H.264 video is the main protocol for future multimedia communication, and the networks used by future multimedia communication applications are mainly packet-switched networks, typified by IP, and wireless networks. Neither kind of network can provide good Quality of Service (QoS) guarantees, so video sent over them is inevitably affected by packet loss caused by transmission errors, which lowers communication quality. The main problem is that IP networks offer best-effort delivery and cannot guarantee QoS for video data. The problem is especially pronounced for H.264 bitstreams, which are highly compressed. The lack of QoS for real-time video over best-effort IP networks shows up in three ways: packet loss, delay, and delay jitter. Of these, packet loss has the greatest impact on the quality of the reconstructed video. Because H.264 compression uses motion estimation and motion compensation, a lost packet affects not only the picture currently being decoded but also subsequently decoded pictures; this is error propagation. Error propagation has a very large impact on reconstructed video quality, and it can be fully avoided only when the encoder and the decoder cooperate in combating errors.

Error resilience means that a transmission mechanism can prevent errors from occurring or can correct them to some extent after they occur (errors within a certain severity can be corrected completely; beyond that, only partially). In the widespread, arguably ubiquitous, multimedia communication environment of the future, whether a video transmission mechanism is error resilient will be critical.

Many error resilience mechanisms exist, such as Forward Error Correction (FEC), Automatic Retransmission Request (ARQ), error concealment, Joint Source-Channel Coding (JSCC), interleaving, and elimination of error propagation. For H.264 video sent over packet networks, FEC is a practical and effective technique. It encodes the data to be protected with error correction codes, which in essence adds data redundancy and so increases the ability to withstand errors.

The main errors on packet networks are packet losses, which coding theory calls erasure errors. The error correction codes designed for erasures form a large family called erasure codes. An erasure code splits the data stream, segment by segment, into units of equal size, also called data nodes; for convenience, assume there are n data nodes. Check nodes (also called parity nodes) are then computed from these data nodes according to certain mathematical rules. To strengthen protection, a second layer of check nodes can be computed from the first layer of check nodes, using the same or different rules, and so on, producing a third layer, a fourth layer, up to an N-th layer of check nodes.

In general, when multiple layers of check nodes are used, the number of nodes in each layer decreases relative to the previous layer according to some rule (most commonly a geometric progression), giving a multi-layer node structure that shrinks layer by layer. It can be pictured as a pyramid turned 90 degrees to the right: the data node layer is on the far left, followed to the right by the first layer of check nodes, the second layer of check nodes, and so on, up to the N-th layer of check nodes.

One class of erasure codes has a very important property: the time complexity of processing is linear in the number of data nodes $n$. This is called the linear-time property. Many other erasure codes, such as the well-known Reed-Solomon codes, require much higher time complexity, on the order of $n \log^2 n \log\log n$. Linear-time erasure codes are therefore much better suited to real-time communication.

Tornado erasure codes (hereinafter Tornado codes) are a newer type of erasure code that appeared around 1998. Tornado codes have a simple structure, are efficient to compute because they are linear-time, and offer strong protection. They have performed well in practice and are now fairly widely used. According to recent ITU-T activity, SG16 is considering the possibility of standardizing error control coding techniques, mainly to protect video and audio transmission over networks. Tornado codes and their variants are likely to be among the important techniques.

In a Tornado code, multiple layers of check nodes are generated layer by layer from the data nodes. Both the check nodes and the data nodes are sent by the sender over the network to the receiver. If some nodes are lost in transit, their information can still be fully recovered from a sufficient number of nodes in the following layers, because each layer takes part in generating the next layer and its information is therefore contained in that layer and the layers after it. If each node is a packet, lost packets can be fully recovered from other packets that were received correctly. Let the number of data nodes be $n$ and the number of generated check nodes be $l$. The code rate and redundancy rate of the erasure code are then defined as $r = n/(n+l)$ and $1-r = l/(n+l)$. Other conditions being equal (protection capability, added delay, and so on), the higher the code rate (and necessarily the lower the redundancy rate), the more efficient the erasure code.

FIG. 4 shows the relationship between the data nodes and the check-node layers of a typical Tornado code. The lines between nodes in the figure are called edges; an edge means that its left node takes part in computing its right node. Two adjacent layers are thus related many-to-many. With $n$ data nodes and $m$ check nodes in total, the code rate is $r = n/(n+m)$ and the redundancy rate is $1-r = m/(n+m)$; other conditions being equal (protection capability, added delay, and so on), the higher the code rate and the lower the redundancy rate, the more efficient the erasure code. The structure and performance of a Tornado code are mainly determined by three factors: (a) the number of data nodes and the rule by which layers shrink, usually a constant ratio; (b) the computation used to generate the next layer of nodes; (c) the association between the nodes of two adjacent layers.

The following relationships hold among the parameters of a Tornado code. Let the number of data nodes be $n$, the number of check nodes $m$, the shrink ratio $p$, and the number of check-node layers $i$. The first $i-1$ check layers then contain $np, np^2, \ldots, np^{i-1}$ nodes, and the last layer (layer $i$) is assigned $np^i/(1-p)$ nodes. The total number of nodes is $n+m = n + np + np^2 + \cdots + np^{i-1} + np^i/(1-p) = n/(1-p)$, so $m = np/(1-p)$, which is the implicit relationship between the shrink ratio and the number of check nodes. Because the layer sizes $np, np^2, \ldots, np^{i-1}$ and $np^i/(1-p)$ must all be integers, feasible values of $n$ can be computed from the given $i$ and $p$; for example, with $i=4$ and $p=1/2$, choosing $n$ as a multiple of 16 satisfies this requirement.
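The layer sizes and the resulting code rate can be checked numerically. The following sketch applies the formulas above with exact rational arithmetic; the function names are illustrative, and the helper simply rejects parameter combinations for which some layer size is not an integer.

```python
from fractions import Fraction

def tornado_layer_sizes(n: int, p, layers: int) -> list:
    """Sizes of the check-node layers for n data nodes, shrink ratio p, and
    the given number of check layers: n*p, n*p^2, ..., n*p^(i-1), n*p^i/(1-p)."""
    p = Fraction(p)
    if not 0 < p < 1 or layers < 1:
        raise ValueError("need 0 < p < 1 and at least one check layer")
    sizes = [n * p**k for k in range(1, layers)]
    sizes.append(n * p**layers / (1 - p))    # last layer uses the special size
    if any(s.denominator != 1 for s in sizes):
        raise ValueError("n is not feasible: some layer size is not an integer")
    return [int(s) for s in sizes]

def code_rate(n: int, m: int) -> Fraction:
    """Code rate r = n / (n + m); the redundancy rate is 1 - r."""
    return Fraction(n, n + m)
```

For `n = 16`, `p = Fraction(1, 2)` and four layers, the layer sizes are 8, 4, 2 and 2, so `m = 16`, which agrees with $m = np/(1-p)$ and gives a code rate of 1/2, that is, $1-p$.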

The computation most commonly used to generate Tornado codes is exclusive OR (XOR), because XOR makes recovery very convenient. For two bit sequences of equal length, $A=[a_0,a_1,a_2,\ldots,a_L]$ and $B=[b_0,b_1,b_2,\ldots,b_L]$, a bitwise XOR yields a sequence $C$ of the same length with the following properties: $A$ XOR $C$ gives $B$, and $B$ XOR $C$ gives $A$. A corresponding recovery method exists for XOR over more than two sequences. After the XOR operation, the data nodes or check nodes involved are therefore linked to one another, and if any one node is lost, it can be recovered from all the remaining nodes. Because the last layer of check nodes uses a different shrink ratio, it is generally computed with a conventional error correction code, such as a Reed-Solomon code.
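The recovery property of XOR can be demonstrated directly on byte strings. This is a minimal sketch with an illustrative helper name; it treats each node as a `bytes` object of equal length.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Bitwise XOR of any number of equal-length byte strings."""
    if not blocks:
        raise ValueError("need at least one block")
    out = bytearray(len(blocks[0]))
    for block in blocks:
        if len(block) != len(out):
            raise ValueError("all blocks must have the same length")
        for k, value in enumerate(block):
            out[k] ^= value
    return bytes(out)

# A check node computed from three data nodes.
a, b, c = b"\x12\x34", b"\xab\xcd", b"\x0f\xf0"
check = xor_blocks(a, b, c)

# If b is lost, XOR-ing the check node with the surviving nodes restores it.
recovered_b = xor_blocks(check, a, c)
```

Here `recovered_b` equals `b`, because every surviving sequence cancels itself out in the XOR and only the lost one remains.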

Another important factor in a Tornado code is the association between adjacent layers, that is, which nodes of the previous layer are used to compute a given node of the next layer. In graph-theoretic terms, two adjacent layers form a bipartite graph: the two ends of every edge lie in the previous layer and the next layer respectively. Nodes of the previous layer are called left nodes, nodes of the next layer are called right nodes, and the number of edges incident to a node is called its degree. According to the random-graph proofs of Luby et al., the parameters that determine the protection capability of a Tornado code are the degree vectors of the nodes on the two sides of the bipartite graph formed by adjacent layers, and these degree vectors are generated randomly. In practice, the random distribution of the node degree vectors must be determined before Tornado encoding; the bipartite graph of each level is then generated by random matching according to that distribution, and the associations between left and right nodes in each bipartite graph define the associations between the nodes of adjacent layers.

In current Tornado coding schemes, the parameters n, m, i, p and so on are determined from the required protection capability and other requirements, such as a reasonable data node size and the maximum acceptable network delay; the random distribution of the node degree vectors is specified, and Tornado encoding can then be performed. During decoding at the receiver, for the bipartite graph of each level, if a right node was received correctly and exactly one of the left nodes associated with it is lost, the lost node can be recovered from that right node and all the left nodes that were not lost. This achieves the error correction.
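The decoding rule just described can be sketched for a single level of the bipartite graph. The code below assumes that every right node is the XOR of its associated left nodes and that lost nodes are represented by `None`; the function name, the edge-list representation, and the sample graph are illustrative and are not taken from the patent.

```python
def peel_decode(left, right, edges):
    """Recover lost left nodes of one bipartite level.

    left:  list of bytes, with None for lost nodes
    right: list of bytes, with None for lost check nodes
    edges: edges[j] lists the left-node indices XOR-ed into right node j
    Returns a new left list; nodes that cannot be recovered stay None.
    """
    left = list(left)
    progress = True
    while progress:
        progress = False
        for j, check in enumerate(right):
            if check is None:
                continue
            missing = [i for i in edges[j] if left[i] is None]
            if len(missing) != 1:
                continue                  # nothing to do, or too many losses
            acc = bytearray(check)
            for i in edges[j]:
                if left[i] is not None:
                    for k, value in enumerate(left[i]):
                        acc[k] ^= value
            left[missing[0]] = bytes(acc)  # the single lost neighbour
            progress = True
    return left

def encode_level(left, edges):
    """Compute the right nodes as XORs of their associated left nodes."""
    right = []
    for neighbours in edges:
        acc = bytearray(len(left[0]))
        for i in neighbours:
            for k, value in enumerate(left[i]):
                acc[k] ^= value
        right.append(bytes(acc))
    return right
```

Repeating the scan matters: recovering one left node can leave another right node with only a single missing neighbour, which then becomes recoverable on the next pass.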

Erasure codes form a broad family, of which Tornado codes are only one typical example; others include RS (Reed-Solomon) codes and Low Density Parity Check (LDPC) codes.

An important performance measure of an erasure code is its error correction capability (or protection capability). It is expressed directly as the maximum number of lost packets, for a given total number of packets, for which all losses can still be corrected, or as the percentage of packets that can be correctly recovered when losses exceed that maximum. In general, other conditions being equal, higher protection capability means a higher redundancy rate.

Protection capability applies not only to erasure codes; more broadly, every FEC code can be measured by it. In video data, some data is relatively important, such as the structural parameters of the video sequence, the structural parameters of a picture, and header information, while other data, such as picture content data, is relatively less important. When FEC is used for protection, relatively important data is encoded with stronger protection and relatively unimportant data with weaker protection, which balances protection capability against efficiency. Protection capability cannot be emphasized at any cost, because that leads to very high redundancy and sacrifices efficiency. Applying FEC with different protection capabilities according to the relative importance of the data is called Unequal Protection (UEP). UEP makes it easier to provide QoS guarantees for video communication services.
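As a purely illustrative sketch of unequal protection, the snippet below maps the importance of a NALU (its `nal_ref_idc` value) to a target redundancy rate and derives the number of check nodes from the definition of the redundancy rate, $m/(n+m)$. The mapping table is hypothetical: the document does not prescribe any particular redundancy values.

```python
from fractions import Fraction
from math import ceil

# Hypothetical target redundancy rates per nal_ref_idc value (assumption,
# chosen only to illustrate "more important data gets stronger protection").
HYPOTHETICAL_REDUNDANCY = {
    3: Fraction(1, 2),
    2: Fraction(1, 3),
    1: Fraction(1, 4),
    0: Fraction(0),
}

def check_nodes_needed(n_data: int, nal_ref_idc: int) -> int:
    """Number of check nodes m so that m / (n + m) reaches the target rate."""
    rho = HYPOTHETICAL_REDUNDANCY[nal_ref_idc]
    # Solving rho = m / (n + m) for m gives m = n * rho / (1 - rho).
    return ceil(n_data * rho / (1 - rho))
```

With 12 data nodes, this table yields 12, 6, 4 and 0 check nodes for `nal_ref_idc` values 3, 2, 1 and 0 respectively, so the redundancy is concentrated on the data that matters most.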

The idea of unequal protection is to protect data of different relative importance within the multimedia data using protection mechanisms of different protection capability or strength. The mechanisms may differ at the level of major classes or of subclasses: major classes differ in principle, whereas subclasses differ only in structure or parameters. Graded protection divides the protection mechanisms into several levels by protection capability; the levels may span major classes, or span subclasses within one major class, differing only in structure and parameters. During network transmission, network conditions change over time in complex ways that cannot be determined in advance. Graded protection is therefore an adaptive strategy that selects the best protection measure for the instantaneous network conditions according to a set of predefined rules. Unequal protection and graded protection can be combined into more sophisticated and more powerful protection strategies.

In actual H.264 video communication, erasure errors caused by packet loss and similar events degrade picture quality very severely and can even crash the decoder system. This is because H.264 is more capable, more efficient, and richer in features than other video coding standards, and is correspondingly less tolerant of erasure errors. In video communication based on the H.264 standard, therefore, in addition to error resilience protection strategies, effective techniques against erasure errors such as packet loss must be used, combined with several video error control methods, to guarantee the quality of the reconstructed pictures.

Existing techniques against packet loss fall broadly into two classes: (a) proactive error prevention, in which protective measures are taken in advance, for example by introducing redundancy, so that packets are not lost as far as possible or the receiving end can recover a small amount of lost data; and (b) error compensation, in which compensating measures are taken once errors have occurred. For example, when network conditions deteriorate badly and the packet loss rate is very high, proactive error prevention becomes ineffective, and the errors that have already occurred must be compensated.

Depending on emphasis, error-compensation methods are further divided into error concealment and error propagation elimination. Error concealment focuses on compensating for the immediate effect of an error: for example, if the current video frame or slice is lost at the receiving end, the image cannot be displayed correctly, so measures are taken to compensate and minimize the impact on the user. Error propagation elimination removes the subsequent effects caused by the error spreading in space and time. For example, if a frame is lost or partially lost at the receiving end, and that frame serves as the prediction reference for subsequent frames, its errors propagate to those frames in the time domain; likewise, the intra prediction and loop filtering that may be present in H.264 can spread the errors to other positions in the same frame through spatial prediction. Error propagation elimination takes measures to confine the effect of errors to a limited region in space and a limited period in time, so as to avoid failure of the video communication or, worse, malfunction and crash of the decoder system.

Thus, in an error-prone environment, error propagation not only degrades the restored image quality of the erroneous frame but may also cause unrecoverable loss in subsequent frames; even if the decoder applies error concealment, the degradation of restored image quality cannot be avoided. Moreover, because video communication has strict real-time requirements, erroneous data is usually not retransmitted using ARQ.

Error compensation can be handled by error concealment, that is, by masking the erroneous part using correct data from spatially and temporally adjacent parts, either through simple substitution or through more complex prediction and interpolation. This can be done at the receiving end alone, without involving the sending end. Error propagation is more complicated: the sending end and the receiving end must cooperate and adopt appropriate strategies to suppress and eliminate it.

Note that error concealment can itself cause error propagation. Error concealment makes the contents of the reconstructed-picture buffers at the encoder and the decoder mismatch, so errors propagate in the time domain. For example, when packets of frame n-1 are lost, the decoder conceals the error using the image data at the corresponding position in frame n-2. The encoder at the sending end, unaware of the loss, encodes frame n using the correct frame n-1, whereas the receiving end decodes frame n using frame n-2 in place of frame n-1, which causes error propagation.

The existing H.264/RTP transport architecture and the RTCP-based QoS reporting method encapsulate NALUs directly in RTP for transmission and monitor QoS information with RTCP SR/RR reports; the relevant technical details have been described above.

In addition, the Tornado code used in the prior art is a relatively complex scheme. A method that uses Tornado codes to protect the transmission of data compressed with H.261/H.263/H.263+/H.263++/H.264 video coding comprises the following steps:

Step 1: Design the structure of the Tornado code. Specifically, determine the data node size of L1 bits according to the given protection capability, other requirements, the specific data type (for example audio or video), the bit rate, and similar factors; determine the number of data nodes n, the total number of check nodes L, the number of intermediate check layers m, the shrinking ratio factor of the node count between layers (given as an inline image in the original, Figure C200510110013D0032151659QIETU), and so on, according to factors such as the maximum network delay acceptable in the specific application; and determine the left degree distribution vector and the right degree distribution vector of the bipartite graph between the nodes of any two adjacent layers.

Step 2: Using random graph matching, generate the random bipartite graph between the nodes of each pair of adjacent layers according to the left and right degree distribution vectors.

Step 3: Begin the data transmission process. The H.261/H.263/H.263+/H.263++/H.264 video encoder at the sending end produces a data stream, which is the stream requiring transmission protection.

Step 4: Divide the data stream to be protected into equal-sized data nodes of L1 bits each: D_0, D_1, D_2, D_3, ..., D_T.

Step 5: Let t = 0.

Step 6: Take the n data nodes D_t, D_(t+1), ..., D_(t+n-1).

Step 7: According to the random bipartite graphs between adjacent layers determined in step 2, generate, layer by layer, the intermediate check node layers MC(0), MC(1), ..., MC(m) and the final check node layer FC for the n data nodes D_t, D_(t+1), ..., D_(t+n-1).

Step 8: Packetize the data nodes D_t, D_(t+1), ..., D_(t+n-1) and all check nodes in the check node layers generated in step 7 using a network packetization method such as UDP/IP or TCP/IP, and send them to the receiving end.

Step 9: The sending end determines from the values of t and T whether all data nodes have been processed. If so, go to step 10; otherwise set t = t + n and go to step 6.

Step 10: The communication process at the sending end ends.

Because in real-time communication the sending end usually compresses and transmits at the same time, steps 3 to 8 above represent a logical order rather than a chronological one.

The above process describes the sending end. At the receiving end, after a batch of data nodes and check nodes has been received, the receiver first determines which data nodes should have arrived but were lost and are recoverable, and then decodes and recovers those nodes following the general Tornado decoding procedure. The receiving end repeats this receive-and-recover process until the communication ends.
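The sender-side loop of steps 2 and 4 to 9 can be sketched as follows. This is only an illustration under loud assumptions: every check node takes a fixed number of neighbours instead of following left and right degree distribution vectors, each check node is the XOR of its neighbours, the final layer FC is treated like any other layer, and all function names (`split_into_nodes`, `protect_stream`, and so on) are hypothetical rather than taken from the patent.

```python
import random

def split_into_nodes(stream, node_bytes):
    """Step 4: cut the stream into equal-sized data nodes, zero-padding the tail."""
    padded = stream + bytes(-len(stream) % node_bytes)
    return [padded[i:i + node_bytes] for i in range(0, len(padded), node_bytes)]

def random_bipartite_graph(left, right, degree, rng):
    """Step 2, simplified: each check node picks `degree` distinct left-hand neighbours."""
    return [rng.sample(range(left), min(degree, left)) for _ in range(right)]

def xor_nodes(nodes):
    """Byte-wise XOR of equal-length nodes."""
    out = bytearray(len(nodes[0]))
    for node in nodes:
        for i, byte in enumerate(node):
            out[i] ^= byte
    return bytes(out)

def encode_group(data_nodes, graphs):
    """Step 7: derive each check layer from the layer to its left."""
    layers, current = [], data_nodes
    for graph in graphs:
        current = [xor_nodes([current[j] for j in neighbours]) for neighbours in graph]
        layers.append(current)
    return layers

def protect_stream(stream, node_bytes, n, layer_sizes, degree=2, seed=0):
    """Steps 4 to 9: group the data nodes n at a time and attach check layers."""
    rng = random.Random(seed)
    sizes = [n] + list(layer_sizes)
    graphs = [random_bipartite_graph(sizes[k], sizes[k + 1], degree, rng)
              for k in range(len(layer_sizes))]
    nodes = split_into_nodes(stream, node_bytes)
    nodes += [bytes(node_bytes)] * (-len(nodes) % n)  # assumption: pad to whole groups
    groups = []
    t = 0                                   # step 5
    while t < len(nodes):                   # step 9: loop until all nodes are handled
        data = nodes[t:t + n]               # step 6
        groups.append((data, encode_group(data, graphs)))  # step 7; step 8 would packetize
        t += n
    return groups, graphs
```

Because the graphs are generated once and shared by every group, a receiver that knows the same seed and parameters could rebuild them; how the two ends agree on the graphs is outside this sketch.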

In addition, the existing error elimination methods are standalone error concealment methods or standalone error propagation elimination methods, each with many variants that differ in implementation detail. Error concealment methods include temporal concealment, spatial concealment, and joint spatio-temporal concealment. Error propagation elimination methods include intra coding, marking, and adaptive intra block refresh.

Temporal concealment estimates the lost data from temporally adjacent frames. The estimate may simply substitute the data at the same position in an adjacent frame for the lost data, or may take motion into account and perform motion prediction from the adjacent frame data. More elaborate concealment strategies exist, but they are computationally very expensive.
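The simplest variant, substituting the co-located data of the previous frame, might look like the sketch below. It assumes frames are plain 2-D lists of pixel values and that losses are reported as (block row, block column) pairs; the function name `conceal_temporal` and the block size parameter are illustrative choices, not part of the source.

```python
def conceal_temporal(current, previous, lost_blocks, block=16):
    """Replace each lost block of `current` with the co-located block of `previous`.

    `current` and `previous` are equally sized 2-D lists of pixel values;
    `lost_blocks` holds (block_row, block_col) indices of the damaged blocks.
    """
    repaired = [row[:] for row in current]  # leave the input frame untouched
    height, width = len(current), len(current[0])
    for block_row, block_col in lost_blocks:
        for y in range(block_row * block, min((block_row + 1) * block, height)):
            for x in range(block_col * block, min((block_col + 1) * block, width)):
                repaired[y][x] = previous[y][x]
    return repaired
```

A motion-compensated variant would offset the source coordinates by an estimated motion vector instead of copying from the same position.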

Spatial concealment uses the regions spatially adjacent to the lost data region. Similarly, the options are: simple substitution from the neighborhood; data-fusion estimation of the lost data from multiple spatially adjacent regions, such as spatial interpolation; and algebraic inversion, in which the packet loss process is modeled as a linear system whose input is the data before loss and whose output is the correctly received data, and an algebraic inversion method such as least squares recovers the input from the output, with the inversion result replacing the erroneous data. The last method is computationally expensive.

Joint spatio-temporal concealment combines spatial and temporal error concealment. For example, based on the characteristics of the lost data and the state of the temporally and spatially adjacent data, some strategy decides whether spatial or temporal concealment is better, and the better one is applied. Alternatively, spatial and temporal data are fused and used together for concealment.

Intra-coding-based error propagation elimination intra-codes the macroblocks affected by errors: the forward dependency of motion vectors is used to track errors accurately, and the affected macroblocks are intra coded, which effectively prevents error propagation. First, the inter-frame dependency caused by motion compensation is derived; then the "energy" of the error is computed from the motion vector forward dependency and the associated weighting factors, and the macroblocks with the largest "energy" are intra coded, thereby preventing error propagation.

Marking-based error propagation elimination marks the macroblocks affected by errors so that the encoder avoids using marked macroblocks as references, which prevents propagation directly. This method requires a feedback mechanism between the sending end and the receiving end: the receiving end feeds information about the lost data back to the sending end, and the encoder, based on this error information, marks all pixels following the erroneous macroblock in the same group of blocks with a specific value and does not reference the marked region when encoding the next several frames, thereby avoiding error propagation at the receiving end.

Error propagation elimination based on adaptive intra block refresh uses an encoder-side "error sensitivity metric" (ESM) to measure how vulnerable each coded macroblock is to channel errors, and then refreshes intra blocks adaptively. This method requires no feedback channel. The encoder first initializes the metric: the farther a macroblock is from the synchronization marker, the more sensitive it is to errors; the more bits a coded macroblock uses, the more easily it is damaged by errors. During encoding, the metric is updated by accumulating each macroblock's error sensitivity value, and macroblocks are then selected for intra coding according to the ESM.

Furthermore, because the prior art offers no convenient scheme for monitoring network conditions and no description of the relative importance of data, neither multi-level protection nor unequal protection has been realized.

In practice, the above schemes have the following problems. The prior-art Tornado code scheme is overly complex and inefficient; when applied to protecting video data it introduces large delay and cannot meet the performance requirements of real-time communication.

A mechanism for reporting network conditions is also lacking. The communicating parties therefore cannot choose a suitable protection mechanism according to network conditions, so neither unequal protection nor adaptive hierarchical protection can be used effectively, and the reliability of multimedia communication falls short of requirements.

Moreover, error concealment and error propagation elimination are not well unified; they sometimes conflict, and their effects cancel each other out.

The prior art lacks a joint working mechanism for the H.264 NAL and the RTP protocol; how to provide protection based on the H.264 NAL and the corresponding RTP encapsulation is undefined. As a result, good methods such as efficient Tornado coding and other protection measures, unequal protection, and adaptive hierarchical protection cannot yet be applied to H.264 video data.

The prior art does not use the message extension mechanism of H.264 to report network conditions and QoS information. Without such a mechanism, many good techniques lack the precondition they need in order to be applied.

The existing techniques are fragmented and poorly integrated, and they do not reinforce one another. Many are still at the stage of academic discussion and have not been defined and developed at the communication protocol level, which limits practical use. Any integration of these techniques must respect the constraints of real-time communication: the chosen techniques must perform well without being computationally too complex.

The main causes are as follows: the prior art protects the video stream with a fixed erasure-code strategy, which cannot adapt to changes in network conditions; the substitution mechanism used by error concealment causes error propagation; and error propagation elimination methods all require complex mechanisms or an additional feedback channel, consuming system processing resources and network bandwidth.

In the prior-art scheme, the NALU header is encapsulated entirely within the payload, so the RTP protocol cannot directly learn the attributes, level, or importance of the payload and cannot implement a QoS mechanism based on them. This encapsulation format also makes NALU headers consume payload capacity: every NALU carries its own header, so in the many cases where several NALUs of the same type in one RTP packet have identical headers, RTP transmission bandwidth is wasted.

The H.264/RTP multimedia communication framework uses RTCP, a general-purpose companion control protocol, to carry QoS reports for QoS monitoring. RTCP is not necessarily the best fit for a specific video application such as H.264, however: it opens a separate out-of-band logical channel to carry the QoS reports, which itself affects the network conditions being measured and so creates a contradiction.

The key point is that the prior art does not implement an error-resilience protection strategy at the transport layer and therefore cannot provide reliability and communication quality for multimedia transmission.

Summary of the Invention

In view of this, the main object of the present invention is to provide a multimedia communication method and a terminal thereof that realize unequal protection and facilitate QoS guarantees.

To achieve the above object, the present invention provides a multimedia communication method whose communication process comprises the following steps:

A. The sending end protects the multimedia data according to an error-resilience protection strategy based on the Real-time Transport Protocol (RTP) and sends the data to the receiving end, information about the error-resilience protection strategy being carried in the error-resilient RTP packets that encapsulate the multimedia data;

B. The receiving end receives the multimedia data and, if a transmission error occurs, recovers or partially recovers the multimedia data according to the error-resilience protection strategy carried in the error-resilient RTP packets that encapsulate the multimedia data.

The communication process further comprises the following step:

C. The receiving end collects statistics on communication quality, generates a quality-of-service report, and sends it back to the sending end;

In step A, the sending end adjusts the error-resilience protection strategy according to the quality-of-service report.

Further, in the method, the communication process also comprises the following steps:

D. The receiving end collects transmission error information and applies an error concealment strategy;

In step C, the receiving end also feeds the transmission error information back to the sending end;

E. The sending end applies an error propagation elimination strategy according to the transmission error information.

Further, in the method, step A comprises the following sub-steps:

A1. The sending end selects an error-resilience coding scheme and applies forward error correction (FEC) coding to the multimedia data;

A2. The sending end encapsulates the coded multimedia data in error-resilient RTP, carries information about the error-resilience coding scheme in the error-resilient RTP header, and sends the packets to the receiving end;

Step B comprises the following sub-steps:

B1. The receiving end decapsulates the received error-resilient RTP packets and extracts the information about the FEC coding scheme from the error-resilient RTP header;

B2. If error-resilient RTP packets corresponding to data nodes were lost in transmission, the receiving end selects the FEC decoding scheme according to the information about the error-resilience coding scheme, performs FEC decoding, and recovers or partially recovers the lost multimedia data; if no data node was lost, FEC decoding is not needed.

Further, in the method, the FEC-coded multimedia data consists of two kinds of nodes: data nodes and check nodes.

Further, in the method, in step A1 the sending end selects the FEC coding scheme according to at least one of the following factors:

the current network transmission conditions, and the quality-of-service level of the multimedia data to be sent, where that level depends on the relative importance of the different data.

Further, in the method, the error-resilient RTP header contains:

an error-resilient RTP identification field, which distinguishes the packet from an RTP packet;

an FEC coding type field, which indicates the type of FEC code used by the FEC coding scheme;

an FEC coding subtype field, which indicates the parameter settings of the FEC coding scheme;

a packet length field, which indicates the length of the nodes obtained by applying the FEC coding scheme to the multimedia data;

a packet count field, which indicates the number of such nodes carried by the error-resilient RTP packet.

Further, in the method, in step A1 the sending end divides at least one H.264 network abstraction layer (NAL) unit into at least one data node of equal length and applies FEC coding to the data nodes to obtain at least one check node;

in step A2, the sending end groups the data nodes and check nodes and encapsulates them in at least one error-resilient RTP packet for transmission;

in step B1, after receiving the error-resilient RTP packets, the receiving end decapsulates them to obtain the data nodes and check nodes;

in step B2, if data nodes were lost in transmission, the receiving end recovers or partially recovers the lost data nodes by FEC decoding using the check nodes, and splits the result to obtain the H.264 NAL units.
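As a rough illustration of steps A1 and B2, the sketch below splits NAL units into equal-length data nodes, derives one check node, and rebuilds a single lost data node. Two things here are assumptions and not the patent's scheme: a single XOR parity node stands in for the FEC code, and each NAL unit is prefixed with a 2-byte length so that the receiver can split the recovered bytes back into units. All function names are hypothetical.

```python
import struct

def nalus_to_nodes(nalus, node_len):
    """A1: concatenate length-prefixed NAL units and cut them into equal-length nodes."""
    blob = b"".join(struct.pack("!H", len(unit)) + unit for unit in nalus)
    blob += bytes(-len(blob) % node_len)  # zero padding fills the last node
    return [blob[i:i + node_len] for i in range(0, len(blob), node_len)]

def xor_parity(nodes):
    """Stand-in FEC: one check node equal to the XOR of all given nodes."""
    parity = bytearray(len(nodes[0]))
    for node in nodes:
        for i, byte in enumerate(node):
            parity[i] ^= byte
    return bytes(parity)

def recover(nodes, parity):
    """B2: rebuild at most one lost data node (marked None) from the check node."""
    missing = [i for i, node in enumerate(nodes) if node is None]
    if not missing:
        return list(nodes)  # nothing lost, so no FEC decoding is needed
    if len(missing) > 1:
        raise ValueError("a single parity node can repair only one loss")
    present = [node for node in nodes if node is not None]
    repaired = list(nodes)
    repaired[missing[0]] = xor_parity(present + [parity])
    return repaired

def nodes_to_nalus(nodes):
    """B2: split the recovered byte stream back into NAL units."""
    blob, nalus, pos = b"".join(nodes), [], 0
    while pos + 2 <= len(blob):
        (length,) = struct.unpack_from("!H", blob, pos)
        if length == 0:  # reached the zero padding
            break
        nalus.append(blob[pos + 2:pos + 2 + length])
        pos += 2 + length
    return nalus
```

A stronger code would tolerate several losses per group; the parity node is used here only because its recovery step is easy to follow.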

Further, in the method, before transmission begins, the method also comprises the following step:

the sending end and the receiving end negotiate and agree, for each FEC code type, on the correspondence between the values of the FEC coding subtype field and the parameter settings of that FEC code which they indicate.

Further, in the method, the sending end and the receiving end each build a correspondence table from the correspondence indicated by the FEC coding subtype field, the table being used to look up the FEC encoding or decoding processing module corresponding to the FEC coding type field and the FEC coding subtype field;

in step A1, the sending end invokes the corresponding FEC encoding module to perform FEC encoding;

in step B2, the receiving end invokes the corresponding FEC decoding module to perform FEC decoding.

Further, in the method, in step A1 the sending end evaluates the relative importance of the data from either or both of the NAL reference identification field and the NAL unit type field in the header of the H.264 NAL unit, together with any other rules that may be defined in advance; it thereby determines the quality-of-service level, selects the FEC coding scheme, and determines the FEC coding type field and the FEC coding subtype field.
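One way such a rule could be written is sketched below. The bit layout of the NAL unit header (1 forbidden bit, 2 bits of nal_ref_idc, 5 bits of nal_unit_type) and the type codes 5 (IDR slice), 7 (sequence parameter set), and 8 (picture parameter set) follow H.264. Everything else is an assumption made for illustration: the rule that promotes those three types to the top level, the four-level scale, and the `FEC_SCHEMES` table of (type, subtype) values.

```python
# Hypothetical mapping from quality-of-service level to (FEC type, FEC subtype).
# The numeric values are placeholders; real ones would come from the negotiated table.
FEC_SCHEMES = {
    0: (0, 0),  # disposable data: no protection
    1: (1, 0),  # low importance: light redundancy
    2: (1, 1),  # medium importance: heavier redundancy
    3: (2, 0),  # critical data: strongest scheme
}

CRITICAL_TYPES = {5, 7, 8}  # IDR slice, SPS, PPS

def qos_level(nalu_header):
    """Derive a 0..3 importance level from the one-byte NAL unit header."""
    nal_ref_idc = (nalu_header >> 5) & 0x3
    nal_unit_type = nalu_header & 0x1F
    if nal_unit_type in CRITICAL_TYPES:
        return 3  # assumed rule: these units are always top priority
    return nal_ref_idc

def select_fec(nalu_header):
    """Return the (FEC type, FEC subtype) pair for this NAL unit."""
    return FEC_SCHEMES[qos_level(nalu_header)]
```

A fuller version would also take the reported network conditions into account, for example by shifting the level up when the loss rate rises.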

Further, in the method, in step A the sending end evaluates the network transmission conditions from the transmission report fed back by the receiving end, then selects the FEC coding scheme and determines the FEC coding type field and the FEC coding subtype field.

Further, in the method, the version field in the error-resilient RTP header takes the binary value "11" (decimal 3) to distinguish it from RTP;

the FEC coding type field follows the contributing source (CSRC) identifier list and occupies 4 bits;

the FEC coding subtype field follows the FEC coding type field and occupies 9 bits;

the packet length field follows the FEC coding subtype field and occupies 11 bits;

the packet count field follows the packet length field and occupies 8 bits.
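The four fields add up to 4 + 9 + 11 + 8 = 32 bits, so they fit in one 32-bit word after the CSRC list. The sketch below packs and unpacks that word, assuming the fields are laid out contiguously from the most significant bit in network byte order; the source gives the order and widths but not the byte order or the unit of the length field, and the function names are hypothetical.

```python
import struct

def pack_fec_fields(fec_type, fec_subtype, node_len, node_count):
    """Pack the 4/9/11/8-bit fields into one big-endian 32-bit word."""
    if not (0 <= fec_type < 1 << 4 and 0 <= fec_subtype < 1 << 9
            and 0 <= node_len < 1 << 11 and 0 <= node_count < 1 << 8):
        raise ValueError("field value out of range")
    word = (fec_type << 28) | (fec_subtype << 19) | (node_len << 8) | node_count
    return struct.pack("!I", word)

def unpack_fec_fields(data):
    """Inverse of pack_fec_fields; reads the first 4 bytes of `data`."""
    (word,) = struct.unpack("!I", data[:4])
    return (word >> 28) & 0xF, (word >> 19) & 0x1FF, (word >> 8) & 0x7FF, word & 0xFF
```

With these widths the header can name 16 code types, 512 parameter sets per type, node lengths up to 2047, and up to 255 nodes per packet.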

Further, in the method, in step A the sending end strips the headers from at least one NAL unit having identical header information, then divides, encodes, and encapsulates the stripped units together into the error-resilient RTP packet, and merges the shared header information into the header of that error-resilient RTP packet;

in step B, the receiving end obtains the carried header information from the header of the received error-resilient RTP packet and prepends it to each header-stripped NAL unit extracted from the packet, obtaining complete NAL units; if there is a transmission error, it first recovers or partially recovers the data nodes by FEC decoding according to a preset strategy and then extracts the NAL units from them.

Further, in the method, in the error-resilient RTP header, the NAL reference identification field and the type field from the NAL unit header are placed in the payload type field of the error-resilient RTP header, which occupies the last 7 bits of the second byte of that header.

Further, in the method, the error-resilient RTP identification field is the version field of the error-resilient RTP header, which occupies the first 2 bits of the first byte of that header.

Further, in the method, in the error-resilient RTP encapsulation format, the forbidden bit field from the NAL unit header is placed in the marker field of the error-resilient RTP header, which occupies the first bit of the second byte of that header;

and in step B, the receiving end determines from the marker field of the error-resilient RTP packet whether the NAL unit it carries is in error.
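Taken together, the marker bit and the 7-bit payload type field have the same widths as the one-byte H.264 NAL unit header (1 forbidden bit, 2 bits of nal_ref_idc, 5 bits of nal_unit_type). The sketch below builds the second header byte on the assumption that nal_ref_idc precedes nal_unit_type inside the payload type field; the source does not state that ordering, and the function names are hypothetical.

```python
def second_header_byte(nalu_header):
    """Map a NAL unit header byte onto the marker + payload type byte."""
    forbidden_bit = (nalu_header >> 7) & 0x1
    nal_ref_idc = (nalu_header >> 5) & 0x3
    nal_unit_type = nalu_header & 0x1F
    marker = forbidden_bit                             # first bit of byte 2
    payload_type = (nal_ref_idc << 5) | nal_unit_type  # last 7 bits of byte 2
    return (marker << 7) | payload_type

def carried_nalu_in_error(second_byte):
    """Receiver side: the marker bit tells whether the carried NAL unit is in error."""
    return bool(second_byte >> 7)
```

Under that ordering assumption the second byte is bit-for-bit identical to the NAL unit header, so the receiver can rebuild the stripped header by copying the byte back.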

Furthermore, in the method, in steps A and B, the error resilience real-time transport protocol identifier is the value of the marker field of the error resilience real-time transport protocol packet header, which occupies the first bit of the second byte of that header.

Furthermore, in the method, step A comprises the following sub-steps:

the sending end first determines whether the forbidden bit field in the header information of at least one network abstraction layer unit is set, and accordingly classifies the units as normal network abstraction layer units or erroneous network abstraction layer units;

it then encapsulates the normal network abstraction layer units into error resilience real-time transport protocol packets according to the error resilience real-time transport protocol encapsulation format, and sets the error resilience real-time transport protocol identifier;

and it encapsulates the erroneous network abstraction layer units into real-time transport protocol packets according to the real-time transport protocol encapsulation format;

step B comprises the following sub-steps:

the receiving end first determines whether the header of a received packet carries the error resilience real-time transport protocol identifier, and accordingly classifies received packets as error resilience real-time transport protocol packets or real-time transport protocol packets;

it then processes the error resilience real-time transport protocol packets according to the error resilience real-time transport protocol encapsulation format, and the real-time transport protocol packets according to the real-time transport protocol encapsulation format.
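The first sub-step of step A, sorting network abstraction layer units by their forbidden bit, can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and each unit is assumed to be a byte string whose first byte is the H.264 NAL unit header, with the forbidden bit as its most significant bit.

```python
def classify_nalus(nalus):
    """Split NAL units into (normal, erroneous) by the forbidden bit.

    Each element of `nalus` is assumed to be a non-empty bytes object
    whose first byte is the NAL unit header.
    """
    normal, erroneous = [], []
    for nalu in nalus:
        if nalu[0] & 0x80:          # forbidden bit set: unit is marked as in error
            erroneous.append(nalu)
        else:
            normal.append(nalu)
    return normal, erroneous


if __name__ == "__main__":
    ok, bad = classify_nalus([b"\x65\x01", b"\xe5\x02", b"\x41\x03"])
    print(len(ok), "normal,", len(bad), "erroneous")
```

Under the sub-steps above, the first list would then go to the error resilience encapsulation path and the second to the plain real-time transport protocol path.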

Furthermore, in the method, step C comprises the following sub-steps:

C1: the receiving end compiles statistics and generates the quality of service report;

C2: the receiving end carries the quality of service report in an H.264 extension message and sends it to the sending end.

Furthermore, in the method, the H.264 extension message is supplemental enhancement information;

the quality of service report is encapsulated in the supplemental enhancement information as follows:

the first byte is the payload type field, which indicates that the payload is a quality of service report;

the second and third bytes are the payload length field, which indicates the length of the quality of service report;

the fourth and subsequent bytes are the payload, which holds the quality of service report.
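The three-part layout above (one type byte, a two-byte length, then the report) can be sketched with Python's `struct` module. The function names are hypothetical, the length is assumed to be stored in network (big-endian) byte order, and the payload type value used in the example is an arbitrary placeholder, since the text does not assign concrete values.

```python
import struct


def wrap_qos_report(payload_type, report):
    """Prefix a QoS report with a 1-byte type and a 2-byte length."""
    if not 0 <= payload_type <= 0xFF:
        raise ValueError("payload type must fit in one byte")
    if len(report) > 0xFFFF:
        raise ValueError("report too long for a 2-byte length field")
    return struct.pack("!BH", payload_type, len(report)) + report


def unwrap_qos_report(data):
    """Return (payload_type, report) from a wrapped message."""
    payload_type, length = struct.unpack("!BH", data[:3])
    report = data[3:3 + length]
    if len(report) != length:
        raise ValueError("message is truncated")
    return payload_type, report


if __name__ == "__main__":
    message = wrap_qos_report(200, b"abc")   # 200 is a placeholder type value
    print(message.hex(), unwrap_qos_report(message))
```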

Furthermore, in the method, the quality of service report is either a sender report or a receiver report, distinguished by the payload type field;

when the quality of service report is placed in the payload of the supplemental enhancement information, that payload comprises:

a version field, occupying 2 bits;

a padding field, occupying 1 bit, which indicates whether padding is present;

a reception report count field, occupying 5 bits, which indicates the number of reception report blocks contained in the quality of service report;

a sender synchronization source identifier field, occupying 32 bits, which identifies the sender of the quality of service report;

when the quality of service report is a sender report, a sender information block, which describes the sender of the quality of service report;

at least one reception report block, which describes multimedia statistics from different sources;

a profile-specific extension, reserved for profile-specific functional extensions.
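The fixed fields at the start of this payload can be sketched as one byte holding the version, padding and reception report count, followed by the 32-bit sender synchronization source identifier. The text gives only the field widths, so the contiguous layout, the most-significant-bit-first ordering and the function names below are assumptions made for illustration.

```python
import struct


def pack_report_header(version, padding, report_count, sender_ssrc):
    """Pack version (2 bits), padding (1 bit), count (5 bits), SSRC (32 bits)."""
    if not (0 <= version <= 3 and 0 <= padding <= 1 and 0 <= report_count <= 31):
        raise ValueError("field value out of range")
    first_byte = (version << 6) | (padding << 5) | report_count
    return struct.pack("!BI", first_byte, sender_ssrc)


def unpack_report_header(data):
    """Return (version, padding, report_count, sender_ssrc)."""
    first_byte, sender_ssrc = struct.unpack("!BI", data[:5])
    return first_byte >> 6, (first_byte >> 5) & 0x01, first_byte & 0x1F, sender_ssrc


if __name__ == "__main__":
    header = pack_report_header(2, 0, 1, 0x12345678)
    print(header.hex(), unpack_report_header(header))
```

The sender information block, the reception report blocks and the profile-specific extension would follow these five bytes; they are omitted because their internal layout is not given in this passage.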

Furthermore, in the method, the supplemental enhancement information carrying the quality of service report is itself carried by a network abstraction layer unit;

the communication terminal sets the network abstraction layer reference identifier of that network abstraction layer unit according to the reliability required for transmitting the quality of service report.

Furthermore, in the method, the communication terminal dynamically adjusts the period at which quality of service reports are compiled and sent, according to the current network state and the requirements of higher-layer applications.

Furthermore, in the method, when the communication terminal uses the supplemental enhancement message to carry the quality of service reports of at least one media stream together,

the quality of service report contains the reception report block corresponding to each of the media streams carried.

Furthermore, in the method, in step C, the receiving end counts the number of lost network abstraction layer units from the network abstraction layer unit sequence numbers of the received video stream data, generates the quality of service report, and sends it back to the sending end;

in step A, the sending end calculates the cumulative packet loss rate from the sequence numbers of the lost network abstraction layer units and adjusts the error resilience protection policy accordingly.
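Counting losses from sequence-number gaps, and turning the count into a cumulative loss rate, can be sketched as follows. The function names are hypothetical, and the sketch assumes plain increasing integer sequence numbers with no wraparound, which the text does not specify.

```python
def find_lost_sequence_numbers(received):
    """Return the sequence numbers missing between the received ones."""
    ordered = sorted(set(received))
    lost = []
    for previous, current in zip(ordered, ordered[1:]):
        lost.extend(range(previous + 1, current))   # every number skipped in a gap
    return lost


def cumulative_loss_rate(lost_count, received_count):
    """Fraction of units lost out of all units sent so far."""
    total = lost_count + received_count
    return lost_count / total if total else 0.0


if __name__ == "__main__":
    received = [1, 2, 5, 6, 9]
    lost = find_lost_sequence_numbers(received)
    print(lost, cumulative_loss_rate(len(lost), len(received)))
```

Losses before the first or after the last received number cannot be seen from gaps alone; a real receiver would need the expected range from another source.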

Furthermore, in the method, the receiving end analyzes the received quality of service report and calculates network condition parameters; the parameters include the end-to-end instantaneous bandwidth, delay and jitter.

Furthermore, in the method, the sending end defines a series of error resilience protection policies of different levels, and in step A selects the policy of the appropriate level according to the quality of service report.
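One simple way to map a reported loss rate onto a graded series of policies is a threshold table. The sketch below is purely illustrative: the level names and threshold values are invented for the example and do not come from the patent, which leaves the concrete policies and selection rule open.

```python
import bisect

# Hypothetical policy levels, weakest to strongest, and the loss-rate
# thresholds at which the sender moves up to the next level.
POLICY_LEVELS = ["none", "light", "medium", "strong"]
LOSS_THRESHOLDS = [0.01, 0.05, 0.15]


def select_policy(loss_rate):
    """Pick the protection level whose loss-rate band contains `loss_rate`."""
    if loss_rate < 0:
        raise ValueError("loss rate cannot be negative")
    return POLICY_LEVELS[bisect.bisect_right(LOSS_THRESHOLDS, loss_rate)]


if __name__ == "__main__":
    for rate in (0.0, 0.02, 0.10, 0.30):
        print(rate, select_policy(rate))
```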

Furthermore, in the method, in step C, the receiving end derives location information for the lost video stream data from the network abstraction layer unit sequence numbers of the received video stream data and sends it back to the sending end;

in step A, the sending end retransmits the lost video stream data to the receiving end according to that location information.

Furthermore, in the method, in step E, the sending end obtains the location information of the lost slice from the transmission error information and implements the error propagation elimination policy by applying segmented successive intra-frame coding to the lost slice.

Furthermore, in the method, the segmented successive intra-frame coding comprises the following steps:

E1: split a group of consecutive macroblocks off the lost slice to form a new slice, the remaining macroblocks still belonging to the lost slice, and proceed to step E2;

E2: intra-code the new slice and send it with the next frame, after which the new slice is coded normally, and proceed to step E3;

E3: when coding the next frame, determine whether the lost slice still contains unprocessed macroblocks; if so, return to step E1; otherwise end the intra-frame coding.
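The E1 to E3 loop can be sketched as a scheduler that hands out one group of consecutive macroblocks per frame until the lost slice is exhausted. The function name and the fixed `group_size` parameter are assumptions made for illustration; the sketch only plans which macroblocks are intra-coded in which frame and does not perform any actual encoding.

```python
def schedule_intra_refresh(lost_macroblocks, group_size):
    """Plan the per-frame intra-coded groups for one lost slice.

    Returns a list in which element k holds the macroblock indices that
    form the new intra-coded slice sent with the k-th following frame.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    remaining = list(lost_macroblocks)
    schedule = []
    while remaining:                                   # E3: macroblocks still unprocessed?
        new_slice = remaining[:group_size]             # E1: split off a consecutive group
        remaining = remaining[group_size:]
        schedule.append(new_slice)                     # E2: intra-code it for the next frame
    return schedule


if __name__ == "__main__":
    print(schedule_intra_refresh(range(10, 17), 3))
```

Spreading the refresh over several frames is what keeps each frame's bit cost bounded; a single large intra-coded slice would produce a rate spike.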

Furthermore, in the method, in step E1, the size of the group of consecutive macroblocks split off each time is chosen so that, after the group is intra-coded, the data rate of the current frame remains within the H.264 rate control range.

Furthermore, in the method, step D comprises the following sub-steps:

D1: the receiving end detects transmission errors and compiles transmission error information;

D2: after a transmission error occurs, the receiving end resynchronizes the video information;

D3: the receiving end applies the error concealment policy according to the transmission error information.

Furthermore, in the method, in step D1, the receiving end detects transmission errors and compiles the transmission error information from discontinuities in the network abstraction layer unit sequence numbers.

Furthermore, in the method, in step D1, the receiving end obtains the location information of a lost slice from gaps in the network abstraction layer unit sequence numbers; the location information comprises the number of the frame containing the lost slice and the position of the lost slice within that frame.

Furthermore, in the method, the error concealment policy comprises the step of the receiving end replacing the lost slice with the corresponding slice of the frame preceding the frame that contains the lost slice.
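This concealment rule amounts to copying the co-located slice from the previous frame. A minimal sketch, assuming each frame is represented as a list of slices indexed by position and that the function name and data representation are illustrative only:

```python
def conceal_lost_slices(current_frame, previous_frame, lost_positions):
    """Return a copy of the current frame with lost slices taken from the previous frame."""
    if len(current_frame) != len(previous_frame):
        raise ValueError("frames must contain the same number of slices")
    concealed = list(current_frame)
    for position in lost_positions:
        concealed[position] = previous_frame[position]   # reuse the co-located slice
    return concealed


if __name__ == "__main__":
    previous = ["a0", "b0", "c0"]
    current = ["a1", None, "c1"]          # the slice at position 1 was lost
    print(conceal_lost_slices(current, previous, [1]))
```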

Furthermore, in the method, the error resilient coding scheme comprises an improved "Tornado" erasure code;

for a group of data nodes, the improved "Tornado" erasure code generates only one layer of check nodes.
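A single layer of check nodes can be illustrated with a small XOR-based sketch: each check node is the XOR of the data nodes it is connected to, and an erased data node is recovered whenever some check node has exactly one missing neighbour. The graph used here, the function names and the assumption that all check nodes arrive intact are simplifications for illustration; this is not the patent's specific construction.

```python
def encode_checks(data, graph):
    """Compute one layer of check nodes; graph[j] lists the data indices feeding check j."""
    checks = []
    for neighbours in graph:
        value = 0
        for index in neighbours:
            value ^= data[index]
        checks.append(value)
    return checks


def recover(data, checks, graph):
    """Fill in erased data nodes (marked None) by repeated single-unknown solving."""
    data = list(data)
    progress = True
    while progress:
        progress = False
        for check, neighbours in zip(checks, graph):
            missing = [i for i in neighbours if data[i] is None]
            if len(missing) == 1:                 # exactly one unknown: solvable by XOR
                value = check
                for i in neighbours:
                    if data[i] is not None:
                        value ^= data[i]
                data[missing[0]] = value
                progress = True
    return data


if __name__ == "__main__":
    graph = [(0, 1), (1, 2), (2, 3)]              # illustrative sparse graph
    original = [5, 9, 12, 7]
    checks = encode_checks(original, graph)
    print(recover([5, None, None, 7], checks, graph))
```

With only one check layer there is no cascade of further check-on-check layers to compute, which is the source of the reduced encoding cost the text describes; the price is that the check nodes themselves are unprotected.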

Furthermore, in the method, the transmission errors in step B include packet loss or random bit errors.

The present invention also provides a multimedia communication terminal comprising basic functional modules for multimedia communication, including at least a codec module for multimedia encoding and decoding, and further comprising the following module:

an error resilience real-time transport control protocol module, configured to apply error resilience protection to the multimedia data encoded by the codec module before it is transmitted on the network side, and to apply error correction to multimedia data from the network side before passing it to the codec module for decoding, wherein the information relating to the error resilience protection is carried in the error resilience real-time transport protocol packets that encapsulate the multimedia data.

The terminal further comprises the following modules:

a protection method and policy negotiation module, responsible for negotiating error resilience protection policies between the two communicating parties and determining a set of protection policies from which the error resilience real-time transport control protocol module selects;

a forward error correction module, configured to implement at least one forward error correction protection method and to maintain the parameters of that method, wherein the protection method and policy negotiation module controls the forward error correction module to provide unequal protection and adaptive graded protection, and the error resilience real-time transport control protocol module provides error resilience protection and error correction by invoking the forward error correction module.

In addition, the terminal further comprises:

an error concealment module, configured to perform error concealment;

the codec module implements the H.264 codec standard and also performs error propagation elimination;

and a network condition analysis and calculation module, configured to analyze and calculate network conditions and to provide information to the error concealment module and the codec module.

In addition, the terminal further comprises:

a supplemental enhancement message extension processing module, configured to provide quality of service reporting and network condition reporting and to send the reports to the network condition analysis and calculation module.

In addition, its transport layer is based on the error resilience real-time transport protocol and the real-time transport control protocol, and provides multimedia transport with error resilience support;

its application protocol layer comprises a protection mechanism and policy negotiation sublayer, which provides graded protection and unequal protection;

its H.264 video coding layer comprises a supplemental enhancement message extension reporting layer, which provides reporting based on the supplemental enhancement message extension;

its H.264 network abstraction layer comprises a forward error correction coding layer, which provides forward error correction coding.

In addition, the basic functional modules for multimedia communication comprise one of the following or any combination thereof:

a main control module, responsible for controlling the entire terminal;

a user interface module, responsible for user input/output interaction and the display of information;

a network communication module, responsible for communicating with the network and providing the lower-layer transport channel;

an input/output and low-level driver module, responsible for driving hardware devices;

a service module, which implements higher-layer services;

a communication process control module, which controls the communication process;

an application protocol module, which implements application protocol functions;

a real-time transport control protocol module, which implements real-time transport control protocol functions;

an H.264 network abstraction layer module, which implements network abstraction layer functions;

an audio codec module, which implements audio encoding and decoding.

A comparison shows that the main differences between the technical solution of the present invention and the prior art are as follows. The Error Resilience Real-time Transport Protocol (ERRTP) is adopted, which builds on existing RTP to provide a transport-layer encapsulation format that can carry information about the error resilient coding scheme, so that multimedia data transmitted over ERRTP is labeled with its corresponding error resilient coding scheme information and the error resilience mechanism is integrated into the transport layer;

a dedicated ERRTP encapsulation method and a modified protocol header are provided for the H.264 NALU structure: the header bytes of all NALUs in the same ERRTP packet are merged into the packet header in a way that does not affect the operation of the existing ERRTP protocol and equipment and that reflects the attributes of the NALU payload directly in the ERRTP header, which on the one hand greatly improves carrying efficiency and on the other hand provides a basis for implementing QoS mechanisms;

based on the H.264 message extension mechanism, the receiving end compiles communication quality statistics and feeds them back to the sending end, using the extension message mechanism of the higher-layer media protocol H.264 itself to carry the QoS report information; this avoids the need for an additional channel and provides an "in-band" QoS reporting mechanism;

the sending end can also select among various alternative error resilient coding schemes according to factors such as the current network condition and the importance level of the multimedia data, thereby achieving unequal protection and balancing protection capability against transmission efficiency;

building on the feedback mechanism from the receiving end to the sending end, unequal protection and the alternating, mixed use of multiple error resilience schemes are achieved: the sending end selects protection policies of different levels according to the QoS reports and related network transmission status messages fed back by the receiving end, and can also select a suitable protection policy for data of each level based on the data importance level reflected in the ERRTP header;

for the H.264 NALU data stream, a scheme combining error concealment with error propagation elimination is provided, which combines the advantages of both techniques; error propagation elimination is achieved through an error information feedback mechanism and segmented successive intra-frame coding;

an efficient Tornado code scheme is also provided: by using an erasure code with only one layer of check nodes, it reduces the computation needed to generate check node layers and shortens the data transmission delay without significantly reducing data transmission protection capability, improving the ratio of protection performance to cost;

finally, the above enhancement techniques for multimedia communication are integrated into a multimedia communication system, with the techniques and the protocol architecture implemented in modular form; the techniques work in coordination and together further enhance the reliability of multimedia communication.

These differences in the technical solution bring evident benefits. The ERRTP transport architecture implements the error resilience mechanism at the transport layer, which greatly simplifies the error resilient transport structure and saves network bandwidth; unequal protection balances protection capability against transmission efficiency, facilitates QoS guarantees for multimedia transmission, further improves quality of service, reduces redundancy, improves transmission efficiency and maintains compatibility with existing technology, all of which improve the robustness of the new ERRTP method;

QoS reporting based on the H.264 message extension mechanism provides in-band QoS monitoring, reduces bandwidth overhead and system implementation complexity, and improves the effectiveness and efficiency of the mechanism for reporting H.264 video network transmission quality, thereby improving that transmission quality;

unequal protection and multi-level protection policies adapt to network transmission requirements more flexibly, accurately and promptly, improve protection capability, system efficiency and reliability, keep the statistics accurate, and save system resources;

combining error concealment with error propagation elimination avoids the error propagation caused by error concealment, achieves a good error elimination result at low complexity, improves video transmission quality, saves overhead, simplifies the mechanism, and preserves system compatibility;

the improved Tornado erasure code scheme improves the cost-effectiveness of data transmission protection, improves data transmission efficiency, and promotes the application of new technologies such as H.264;

integrating multiple enhancement techniques in a multimedia communication system jointly improves multimedia communication quality and can greatly improve the performance and user experience of H.264-based multimedia communication products, such as video conferencing and videophones, over IP networks, increasing product competitiveness and delivering significant economic benefit and considerable practical value.

Description of the Drawings

Fig. 1 is a schematic diagram of the header structure of an RTP packet;

Fig. 2 is a schematic diagram of the encapsulation format of NALU data in the RTP packet payload;

Fig. 3 is a schematic diagram of the format of an RTCP-based QoS report packet;

Fig. 4 is a schematic diagram of the principle of the Tornado erasure code;

Fig. 5 is a schematic diagram of the module structure of a multimedia communication terminal supporting error resilience according to the first embodiment of the present invention;

Fig. 6 is a schematic diagram of the structure of the multimedia communication protocol stack according to the first embodiment of the present invention;

Fig. 7 is a schematic diagram of the header structure of an ERRTP packet according to the second and third embodiments of the present invention;

Fig. 8 is a schematic diagram of the SEI encapsulation format carrying a QoS report according to the fourth embodiment of the present invention;

Fig. 9 is a schematic diagram of the principle of error propagation elimination based on segmented successive intra-frame coding according to the sixth embodiment of the present invention;

Fig. 10 is a schematic diagram of the structure of the erasure code of the present invention.

Detailed Description of the Embodiments

To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings.

The present invention implements a variety of enhancement techniques together in one multimedia communication system, combining their respective advantages to improve system performance, transmission reliability and communication quality. These techniques comprise: the Error Resilience Real-time Transport Protocol (ERRTP), which integrates FEC into RTP; merging NALU header information into the RTP packet header; feedback of QoS reports and network conditions carried in SEI extension messages; feedback-based multi-level protection and unequal protection mechanisms; combined error concealment and error propagation elimination; and an improved Tornado coding scheme.

The present invention modularizes these enhancement techniques and combines them in one multimedia communication system to provide error-resilient H.264 video communication. The system includes the usual main control module, user interface, network communication module, I/O and low-level driver module, service modules, communication process control module, application protocol module and so on, together with the modules that implement the enhancement techniques: the protection method and policy negotiation module, FEC module, ERRTP module, RTCP module, H.264 NAL module, H.264 encoder module, H.264 decoder module, audio codec module, error concealment module, SEI message extension processing module, and network condition analysis and calculation module.

In the first embodiment of the present invention, multiple enhancement techniques are implemented, modularized and integrated in a multimedia communication system, principally a multimedia communication terminal. The implementation is described below in terms of the terminal's functional modules; Fig. 5 shows the complete internal module structure of a terminal. The functional modules referred to here are defined by function; each may be implemented in software, hardware, firmware, or a mixture of software and hardware. A complete multimedia communication terminal must first include the following modules:

Main control module: responsible for controlling the entire terminal system;

User interface module: responsible for user input/output interaction; the user operates the terminal through interface controls such as menus and buttons, and the module displays feedback such as the current system state, parameters and network conditions;

Network communication module: responsible for communicating with the network; provides TCP, UDP, IP and lower-layer communication protocol stacks such as Ethernet, PPP and ATM;

I/O and low-level driver module: responsible for driving hardware devices, such as video and audio capture devices and display/playback devices, and for the input and output of video and audio data;

Service modules: implement specific services such as videophone, multi-party conferencing, video mail, instant messaging and video chat;

Communication process control module: controls a specific communication process, for example requesting the chair, releasing the chair, requesting the floor, controlling the broadcast of a particular site, and browsing sites in a multi-party conference;

Application protocol module: this may be a specific application protocol such as the H.323 family (including H.225.0, RAS, H.245, H.235, H.460 and others under it) or SIP. In general, such a protocol is a collective name for a series of protocols, known as a "protocol umbrella";

此外对应于各种增强技术,分别在以下模块中实现:In addition, corresponding to various enhancement techniques, they are implemented in the following modules:

保护方法和策略协商模块:该模块负责在通信双方之间进行保护方法协商,确定允许集合,然后根据允许集合,来协商一组保护方法混合和交替使用的策略。协商通过“应用协议模块”来进行通信完成。该模块控制FEC模块,后者实现不同的FEC保护方式,不等保护和自适应分级保护等功能;Protection method and strategy negotiation module: This module is responsible for carrying out protection method negotiation between communication parties, determining the allowed set, and then negotiating a set of strategies for mixed and alternate use of protection methods according to the allowed set. Negotiation is done through "application protocol module" to communicate. This module controls the FEC module, which implements different FEC protection methods, unequal protection and adaptive hierarchical protection functions;

FEC模块:该模块支持多种FEC保护方法,它们作为子类可以属于多个大类,假设共支持T种不同的方法。根据协商的结果(来自“保护方法和策略协商模块”),对于H.264视频数据和音频数据(不在本专利范围内)进行保护。该模块内部保存了各种FEC子类对应的生成规则和参数,因此含有一个内部的数据库,用于存储这些数据。该模块可以实现不同保护方法的混合和交替应用;FEC module: This module supports a variety of FEC protection methods, and they can belong to multiple categories as subclasses, assuming that T different methods are supported in total. According to the negotiation result (from the "protection method and policy negotiation module"), the H.264 video data and audio data (not within the scope of this patent) are protected. This module internally saves the generation rules and parameters corresponding to various FEC subclasses, so it contains an internal database for storing these data. This module can realize the mixed and alternate application of different protection methods;

ERRTP模块:实现ERRTP协议,关于ERRTP的协议封装格式以及对应H.264的相关封装去封装步骤在下面的实施例中还会详细描述;ERRTP module: realize the ERRTP protocol, the protocol encapsulation format of ERRTP and the relevant encapsulation and decapsulation steps corresponding to H.264 will also be described in detail in the following embodiments;

RTCP模块:实现正常的RTCP功能,虽然本发明提供了基于H.264 SEI消息扩展的报告机制,可以实现主要RTCP信息的报告,但是并不排除RTCP的使用,两种报告机制可以并存,这样做主要是考虑兼容性和互通性,因此对方终端可能不支持采用SEI消息扩展报告机制;RTCP module: realize normal RTCP function, although the present invention provides the report mechanism based on H.264 SEI message extension, can realize the report of main RTCP information, but do not get rid of the use of RTCP, two kinds of report mechanisms can coexist, do like this The main consideration is compatibility and interoperability, so the other terminal may not support the SEI message extended reporting mechanism;

H.264 NAL module: implements the functions of the H.264 network abstraction layer (NAL);

H.264 encoder module: in addition to the normal H.264 encoder functions, implements the error propagation elimination function of the present invention, based on information from the "network status analysis and calculation module";

H.264 decoder module: implements the normal H.264 decoder functions;

Audio codec module: implements audio encoding and decoding. Supported standards may include ITU-T G.711, G.722, G.723.1, G.728 and G.722.2 (3GPP AMR-WB), as well as MP3 and AAC from the MPEG group, among others;

Error concealment module: implements the error concealment function provided by the present invention, based on information from the "network status analysis and calculation module" and the "H.264 encoder module";

SEI message extension processing module: implements the QoS and network status reporting function of the present invention based on extended SEI messages. At the sending end, it collects data to form RTCP SR and RR reports and sends them encapsulated in extended SEI messages; at the receiving end, it extracts the RTCP SR and RR reports from the extended SEI messages and passes the data to the "network status analysis and calculation module" for analysis and calculation;

Network status analysis and calculation module: analyzes the data from the "SEI message extension processing module" to compute network status figures such as packet loss rate, jitter, delay and instantaneous end-to-end bandwidth (throughput). These figures are used to control the "H.264 encoder module" and the "error concealment module", and are also sent to the "user interface module" for display to the user.
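As a rough illustration of the kind of computation the network status analysis module performs, the sketch below derives a loss ratio and an interarrival jitter estimate in the style of RTCP receiver reports (RFC 3550). The function names are illustrative assumptions and do not come from the patent.

```python
def loss_rate_from_fraction_lost(fraction_lost: int) -> float:
    """Convert the 8-bit RTCP 'fraction lost' field (fixed point, units of 1/256) to a ratio."""
    if not 0 <= fraction_lost <= 255:
        raise ValueError("fraction_lost must fit in 8 bits")
    return fraction_lost / 256.0


def update_jitter(jitter: float, transit_prev: int, transit_now: int) -> float:
    """One step of the RFC 3550 jitter estimator: J <- J + (|D| - J) / 16.

    transit_prev and transit_now are the relative transit times of two
    consecutive packets, in RTP timestamp units.
    """
    d = abs(transit_now - transit_prev)
    return jitter + (d - jitter) / 16.0
```

Figures of this kind could then drive the choice of protection level; how the terminal weights them is not specified here.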

With the module composition of the terminal understood, the terminal is now described in terms of its protocol stack layers. A communication terminal implements several protocols at different levels, which together form a protocol stack. The protocol stack of the terminal of the present invention is partly the same as that of an ordinary multimedia communication terminal and partly different: new layers are added in certain places. Fig. 6 is a schematic diagram of the multimedia communication protocol stack according to the first embodiment of the present invention.

The H.264/ERRTP multimedia transport architecture of the present invention differs from the conventional H.264/RTP architecture mainly in the following respects:

The RTP/RTCP layer of an ordinary terminal protocol stack is replaced by an ERRTP/RTCP layer. With ERRTP, which supports error resilience, the error resilience protection mechanism is implemented within the transport layer;

A "protection mechanism and policy negotiation layer" is added to the application protocol layer. The two parties use this layer to negotiate the protection levels and the associated protection schemes when implementing multi-level and unequal protection;

An "SEI extended report layer" is added between the H.264 VCL layer and the NAL layer. This layer lets the sender and receiver implement QoS monitoring and feedback on network transmission conditions based on extended SEI messages;

An "FEC layer" is added between the H.264 NAL layer and the ERRTP/RTCP layer. This layer performs the node division, coding and encapsulation of the H.264 NALU data stream.

Those skilled in the art will understand that the first embodiment above uses a typical H.264 service as an example to present the basic modular structure and protocol stack. For other protocols, or for multimedia communication protocols or applications that appear in the future, the relevant technical details need only be implemented for the specific application on the basis of the principles of the present invention; this achieves the object of the invention without affecting its essence and scope.

With the overall system architecture given, the implementation details of each enhancement technique are described in turn below.

To address the many problems of the prior art, the present invention proposes an improved RTP protocol that supports error resilience. It integrates the error resilience mechanism into the transport layer protocol, which not only simplifies the transport structure and reduces complexity but also makes the error resilience mechanism more flexible and the transmission more reliable. Because of its error resilience, this improved RTP protocol is called the Error Resilience Real-time Transport Protocol ("ERRTP" or "ER2TP"). The main difference between ERRTP and RTP is that the ERRTP header extension can carry information about the error-resilience coding scheme, such as the FEC type, protection capability and coding parameters.

On the basis of ERRTP, the present invention readily implements unequal protection. Several protection measures with different protection capabilities are made available; after collecting information such as network conditions and the importance of the multimedia data, the sender selects a suitable measure on that basis, thereby achieving unequal protection and balancing protection capability against transmission efficiency. Because every ERRTP packet carries the information about the FEC it uses, the sender only needs to write the information of the selected scheme into the ERRTP header, and the receiver can perform correct recovery or error correction accordingly.

Finally, for the transport of H.264 NALU data, a concrete implementation based on erasure-code protection is given, comprising the steps of dividing, generating, encapsulating and decapsulating data nodes and check nodes. A consecutive run of NALUs is divided into several data nodes of equal length, check nodes are generated with a Tornado code, and all of these nodes are distributed over several ERRTP packets for transmission; the receiver performs the inverse process.

In the second embodiment of the present invention, which builds on the first embodiment, the sender and receiver implement unequal protection based on ERRTP. The main steps are as follows:

The sender selects an error-resilience coding scheme and erasure-codes the multimedia data, encapsulates the coded data in ERRTP packets, carries the information about the error-resilience coding scheme in the ERRTP header, and sends the packets to the receiver;

The receiver decapsulates the received ERRTP packets, extracts the information about the error-resilience coding scheme from the ERRTP header, selects the corresponding scheme, and performs error-resilience decoding to obtain the multimedia data.

The protection is unequal in that the sender selects the error-resilience coding scheme according to the current network transmission conditions and/or the quality-of-service level of the multimedia data to be sent.

The specific structure of ERRTP is introduced first, through an example of the ERRTP header structure. Fig. 7 is a schematic diagram of the ERRTP header structure according to the first embodiment of the present invention. As the figure shows, the version field V takes the value 3 to indicate ERRTP, distinguishing it from conventional RTP (V = 2). The header extension, at the end of the header, carries the fields that describe the error-resilience coding scheme. In this example they are the error-resilience coding type field, the error-resilience coding parameter (subtype) field, the data packet length field and the data packet number field.

The error-resilience coding type field indicates the type of erasure code used by the error-resilience coding scheme. It is also called the FEC Type field. It occupies 4 bits and can therefore represent 16 different FEC types, which is sufficient in practice. The types defined here are major types; each is further subdivided into specific schemes called subtypes. Examples of major types in practice: 0010 denotes a Tornado code and 0011 denotes an RS code. The two parties must agree in advance on a look-up table (LUT), called FECTypeLUT, that maps FEC coding types to type codes.

The error-resilience coding subtype field indicates the parameter settings of the error-resilience coding scheme. Each type of FEC code can only be applied once its parameters have been fixed, and this field identifies those parameters. Because space in the ERRTP header is limited, the specific parameters and rules of every FEC coding scheme cannot be listed in the header itself; the first embodiment instead uses the subtype to index the candidate parameter-setting schemes. The field is also called the FEC Subtype field and occupies 9 bits. It identifies the subtypes into which each major type defined in FECTypeLUT is subdivided.

The data packet length field indicates the length of a data node after the error-resilience coding scheme has erasure-coded the multimedia data. It is called the Data Length field and occupies 11 bits. Each data packet must be shorter than the network's Maximum Transmission Unit (MTU); at present the MTU is below 1500 (0x5DC) bytes on wired channels and below 100 bytes on wireless channels, so 11 bits, which can represent values up to 2047, are enough to hold the packet length.

The data packet number field indicates the number of data nodes carried by the ERRTP packet. It is also called the Packet Number field and occupies 8 bits. For example, when several NALUs are FEC-coded and the resulting nodes are grouped and encapsulated in multiple ERRTP packets, this field gives the number of nodes carried in each ERRTP packet.

With these fields, the decoder or a network node can check the received data packets according to the FEC code type and check type given in the header and recover lost packets.
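The four fields add up to 4 + 9 + 11 + 8 = 32 bits, that is, one 32-bit word. A minimal sketch of packing and unpacking such a word follows. The field order (FEC Type first, Packet Number last), the network byte order, and the names `ErrtpFecExt`, `pack_ext` and `unpack_ext` are assumptions made for illustration; the actual layout is defined by Fig. 7, which is not reproduced here.

```python
import struct
from typing import NamedTuple


class ErrtpFecExt(NamedTuple):
    fec_type: int       # 4 bits, major FEC type (e.g. 0b0010 = Tornado)
    fec_subtype: int    # 9 bits, index of the parameter-setting scheme
    data_length: int    # 11 bits, data node length in bytes
    packet_number: int  # 8 bits, nodes carried in this ERRTP packet


_WIDTHS = (4, 9, 11, 8)  # assumed order; sums to 32 bits


def pack_ext(ext: ErrtpFecExt) -> bytes:
    """Pack the four fields into one big-endian 32-bit word."""
    word = 0
    for value, width in zip(ext, _WIDTHS):
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return struct.pack("!I", word)


def unpack_ext(raw: bytes) -> ErrtpFecExt:
    """Inverse of pack_ext: read the first 4 bytes and split them into fields."""
    if len(raw) < 4:
        raise ValueError("need at least 4 bytes")
    (word,) = struct.unpack("!I", raw[:4])
    fields = []
    for width in reversed(_WIDTHS):
        fields.append(word & ((1 << width) - 1))
        word >>= width
    return ErrtpFecExt(*reversed(fields))
```

For instance, `pack_ext(ErrtpFecExt(0b0010, 0b000000010, 1200, 6))` yields 4 bytes that `unpack_ext` turns back into the same tuple.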

Note that the 9 bits of the FEC Subtype field mentioned above encode the candidate parameter-setting schemes. The technical details of this encoding in the first embodiment are given below.

First, the sender and receiver must negotiate the mapping table for this field. Before transmission starts, the two ends agree, for each major FEC type, on the correspondence between each FEC Subtype value and the parameter-setting scheme it denotes, and on the specific parameters of each candidate scheme.

Then the sender and the receiver each build a mapping table from the negotiation result. The table is used to look up the FEC coding type, or the FEC encoding/decoding module, that corresponds to given FEC Type and FEC Subtype field values;

During transmission, the sender invokes the corresponding erasure-encoding module and the receiver invokes the corresponding erasure-decoding module.

In practice, the subtype information indicates two things:

A. The generation rule of the FEC code;

B. The protection strength (protection capability).

The generation rule is the rule or algorithm by which the sender processes the data nodes to generate the check nodes. The receiver does the reverse: if packets are lost in transit, that is, if some nodes are lost, the lost nodes can be fully or partially recovered according to the generation rule. The generation rule is therefore essential information that lets both parties operate the FEC mechanism. Each FEC type listed in FECTypeLUT has its own generation rule, and within each type, for example Tornado codes, the generation rule of a subtype must be combined with specific generation parameters. For each subtype, then, the generation rule and the generation parameters go together.

For a Tornado code, for example, the generation parameters include: the total number of data nodes; the total number of check nodes; the number of check-node layers; the shrink ratio of the number of nodes between successive layers; and the incidence matrices that describe how the nodes of successive layers are connected (with L layers of check nodes there are L such matrices), or equivalently a parametric mathematical representation of the bipartite graphs that describe those connections.

In general, for a given generation rule, the generation parameters largely determine the protection strength of a subtype. For a Tornado code, among the parameters listed above, the total numbers of data nodes and check nodes determine the protection capability to a large extent (strictly speaking, all generation parameters are needed to determine it completely). In the present invention, for each major FEC type, the few parameters that most strongly determine the protection capability are chosen as representative generation parameters. Using them, the subtypes of a major type can be sorted by protection capability from weak to strong (ascending order), which yields a LUT called FECSubTypeLUT.

How many subtypes each major type supports can be decided by the specific application, the capabilities of the two parties (CPU speed, memory, program complexity and so on) and their needs. If the communication environment varies widely and network performance fluctuates over a large range, more subtypes generally need to be supported; otherwise fewer suffice. The two parties can agree on this through a capability negotiation procedure before communication starts. The negotiation can be carried out with current mainstream multimedia communication framework protocols such as H.323 or the Session Initiation Protocol (SIP).

Suppose that S subtypes ($S \le 2^9 - 1$) of a major type need to be distinguished and that there are k representative generation parameters, denoted $p_1, p_2, \ldots, p_k$. Table 2 gives an example of the correspondence. In the table, the superscript identifies the FEC subtype and the subscript identifies the parameter.

Table 2. Correspondence between FEC Subtype values and parameter-setting schemes

| FEC Subtype | FEC coding subtype (parameter settings) |
|---|---|
| 000000000 | FEC subtype 0 ($p^{0}_{1}, p^{0}_{2}, \ldots, p^{0}_{k}$) |
| 000000001 | FEC subtype 1 ($p^{1}_{1}, p^{1}_{2}, \ldots, p^{1}_{k}$) |
| 000000010 | FEC subtype 2 ($p^{2}_{1}, p^{2}_{2}, \ldots, p^{2}_{k}$) |
| 000000011 | FEC subtype 3 ($p^{3}_{1}, p^{3}_{2}, \ldots, p^{3}_{k}$) |
| ... | ... |
| S ($S \le 2^9 - 1$) | FEC subtype S ($p^{S}_{1}, p^{S}_{2}, \ldots, p^{S}_{k}$) |

For a Tornado code, for example, the correspondence could be set as: 000000010 for (24, 20) (20 data nodes and 4 check nodes), 000000011 for (30, 20), ..., and 111111111 for others.

For a given FEC subtype, a set of generation rules combined with the corresponding generation parameters defines exactly one coding scheme: it uniquely determines how check nodes are generated from data nodes and how lost nodes are recovered. A database can be built to store the generation parameters of each major type and subtype, while the generation rules themselves are implemented as hardware or software modules. Each major type therefore corresponds to one FEC processing module at the sender, which generates the check nodes, and one at the receiver, which recovers nodes. The module for a major type reads the generation parameters of the specific subtype from the generation parameter database before processing. Both parties thus use the FEC Type and FEC Subtype fields to decide which FEC processing module to invoke and which generation parameters to read.
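The two-level lookup described here can be sketched as a pair of dictionaries. The type codes and the (24, 20) and (30, 20) entries are the examples given in the text; the dictionary names, the string labels and the function `resolve_scheme` are hypothetical, and a real table would be filled in from the negotiation result.

```python
from typing import Dict, Tuple

# FEC Type code -> major type label (codes from the text's examples)
FEC_TYPE_LUT: Dict[int, str] = {
    0b0010: "tornado",
    0b0011: "rs",
}

# major type -> {FEC Subtype code -> (total nodes, data nodes)}
# Only the two Tornado examples from the text are filled in.
FEC_SUBTYPE_LUT: Dict[str, Dict[int, Tuple[int, int]]] = {
    "tornado": {
        0b000000010: (24, 20),
        0b000000011: (30, 20),
    },
}


def resolve_scheme(fec_type: int, fec_subtype: int) -> Tuple[str, int, int]:
    """Return (major type, number of data nodes, number of check nodes).

    Raises KeyError when the pair was not agreed on during negotiation.
    """
    major = FEC_TYPE_LUT[fec_type]
    total, data = FEC_SUBTYPE_LUT[major][fec_subtype]
    return major, data, total - data
```

Both ends would hold identical copies of these tables, so the 13 bits carried in the header are enough to select the processing module and its parameters.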

Because H.264 has gradually become the mainstream media coding format as multimedia communication technology has developed, the second embodiment, building on the first embodiment, gives the specific steps for FEC encoding and decoding an H.264 NALU data stream with ERRTP. The procedure is as follows.

The sender merges several H.264 NALUs (say S of them) into one group to be coded and transmitted together. The S NALUs are first re-divided into equal-length blocks (say M of them); these M blocks are the data nodes.

In this step, S H.264 NALUs are grouped together and concatenated end to end into one large block, which is then divided into M data blocks of K bytes each. If the total number of bytes of the large block, TB, is not divisible by M, the block length is rounded up to K = Ceiling(TB/M) bytes, where Ceiling(x) is the smallest integer not less than x for any real x. Zero padding is then appended where needed so that every data block has exactly Ceiling(TB/M) bytes.
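A minimal sketch of this division step follows. The function name `split_into_data_nodes` is illustrative, and the padding is placed at the end of the concatenated block, which is one straightforward reading of the text.

```python
import math
from typing import List


def split_into_data_nodes(nalus: List[bytes], m: int) -> List[bytes]:
    """Concatenate the NALUs and cut the result into m zero-padded blocks.

    Each block has K = ceil(TB / m) bytes, where TB is the total byte count.
    """
    big = b"".join(nalus)
    if m <= 0 or not big:
        raise ValueError("need at least one byte of data and m >= 1")
    k = math.ceil(len(big) / m)
    padded = big.ljust(k * m, b"\x00")  # zero padding up to K * M bytes
    return [padded[i * k:(i + 1) * k] for i in range(m)]
```

With three NALUs totalling TB = 10 bytes and M = 4, K = 3 and the last block carries two padding bytes. To undo the split, the receiver also needs the NALU boundaries (for example their lengths); how these are conveyed is not covered by this sketch.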

Next, the M data nodes are FEC-coded to obtain N check nodes. The check blocks are generated by the method described above: the FEC Type and FEC Subtype information determines which FEC processing module is invoked to generate them.
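To make the generation step concrete, the sketch below produces check nodes as XORs of data nodes selected by a bipartite graph. This is a deliberately simplified, single-layer stand-in: an actual Tornado code cascades several such layers with carefully designed graphs, and the graph used in the example is arbitrary. All names are illustrative.

```python
from typing import List, Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_check_nodes(data_nodes: Sequence[bytes],
                     graph: Sequence[Sequence[int]]) -> List[bytes]:
    """graph[j] lists the indices of the data nodes XORed into check node j."""
    k = len(data_nodes[0])
    checks = []
    for neighbours in graph:
        acc = bytes(k)  # all-zero block of the node length
        for i in neighbours:
            acc = xor_bytes(acc, data_nodes[i])
        checks.append(acc)
    return checks
```

Here the `graph` argument plays the role of the generation parameters that the FEC module would read from its database for the selected subtype.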

The sender then groups all data nodes and check nodes and encapsulates them in ERRTP packets for transmission. In this example the fields are set as follows:

FEC Type = 0010, indicating that a Tornado code is used;

FEC Subtype is chosen by the sender according to the actual situation. For example, FEC Subtype = 000000010 indicates the Tornado (24, 20) code, with 20 data nodes and 4 check nodes and a channel-coding redundancy of 16.7%; this erasure code can fully recover lost packets when the packet loss rate is at most 3%;

Data Length = K bytes;

Packet Number = (M + N)/P, the number of nodes carried in the payload of one ERRTP packet, where P is the number of ERRTP packets in a group.
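The arithmetic behind these settings can be checked with a short sketch. It assumes that redundancy is measured as N/(M + N) and that the nodes are assigned to the P packets in consecutive runs; the text fixes only the count per packet, so the assignment order and the function names are assumptions.

```python
from typing import List, Sequence


def redundancy(m: int, n: int) -> float:
    """Share of check nodes among all transmitted nodes, N / (M + N)."""
    return n / (m + n)


def group_nodes(nodes: Sequence[bytes], p: int) -> List[List[bytes]]:
    """Split the M + N nodes into P payloads of (M + N) / P nodes each."""
    if p <= 0 or len(nodes) % p != 0:
        raise ValueError("M + N must be a positive multiple of P")
    per_packet = len(nodes) // p  # value written into the Packet Number field
    return [list(nodes[i * per_packet:(i + 1) * per_packet]) for i in range(p)]
```

For the Tornado (24, 20) example, `redundancy(20, 4)` is 4/24, about 16.7%, and with P = 4 each ERRTP packet carries 6 nodes.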

After receiving these ERRTP packets, the receiver decapsulates them to obtain the data nodes and check nodes. The receiver works in cycles of P packets: each time a group of P packets has been received, it starts one round of decoding and recovery. The number of packets in a group is negotiated by the two parties.

The receiver performs error-resilience decoding of the data nodes using the check nodes. Each time packet P+1 arrives, the receiver checks whether any of the preceding P packets was lost. If so, it uses the method described above: the FEC Type and FEC Subtype information determines which FEC processing module is invoked to decode and to recover, fully or partially, the lost data.
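Recovery for XOR-based erasure codes is commonly done by "peeling": any check node with exactly one missing neighbour yields that neighbour directly. The sketch below applies this to a single layer of XOR check nodes, which is a simplification of a real multi-layer Tornado decoder; lost nodes are represented as `None`, and all names are illustrative.

```python
from typing import List, Optional, Sequence


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def peel_recover(data_nodes: Sequence[Optional[bytes]],
                 check_nodes: Sequence[Optional[bytes]],
                 graph: Sequence[Sequence[int]]) -> List[Optional[bytes]]:
    """Recover lost data nodes where possible; unrecoverable ones stay None.

    graph[j] lists the data node indices that were XORed into check node j.
    """
    nodes = list(data_nodes)
    progress = True
    while progress:
        progress = False
        for check, neighbours in zip(check_nodes, graph):
            if check is None:
                continue  # this check node was lost as well
            missing = [i for i in neighbours if nodes[i] is None]
            if len(missing) != 1:
                continue  # nothing to do, or not yet solvable
            acc = check
            for i in neighbours:
                if nodes[i] is not None:
                    acc = xor_bytes(acc, nodes[i])
            nodes[missing[0]] = acc
            progress = True
    return nodes
```

When entries remain `None` after peeling, recovery was only partial, and the residual loss is left to error concealment.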

Finally, once the complete set of data nodes has been obtained, they are merged back into the large block, which is split into the S NALUs using the same convention as the sender.

In practice, the ERRTP-based packet-loss-resilience algorithm of the above example greatly improves the resilience of the video stream to packet loss while adding less than 17% of redundant code words. Compared with the RTP payload header structure, only 4 bytes are added, so the effect on transmission efficiency is negligible and the practical benefit is significant.

As mentioned earlier, another key technical point of the present invention is the implementation of unequal protection. It has two aspects: selecting a suitable coding scheme or parameters, that is, the FEC coding type and subtype described above, according to the importance level of the multimedia data; and selecting them according to the network conditions at different times. These two aspects are referred to as mixing and alternating FEC coding schemes, respectively. Mixing (Hybrid) means using several FEC subtypes at the same time, mainly to protect data of different importance; alternation (Alternation) means using different FEC subtypes at different times, under different network conditions.

For an H.264 NALU data stream, as mentioned earlier, the header byte reflects the importance of the data. The sender can therefore assess the QoS level from the NRI field or the Type field in the NALU header and select the error-resilience coding scheme accordingly, that is, determine the FEC Type and FEC Subtype fields. As for network conditions, network transport generally provides monitoring mechanisms through which the sender obtains the transmission reports fed back by the receiver; the sender uses these to evaluate the transmission conditions and again selects the error-resilience coding scheme, that is, determines the FEC Type and FEC Subtype fields.

An H.264 bitstream is transmitted or stored as NALUs, each consisting of a NAL header and a NAL payload. Different NALU types affect decoding and picture reconstruction differently. For example, NRI = 0 indicates that the NALU holds a slice or slice data partition of a non-reference picture, whose loss does not affect subsequent decoding; a non-zero NRI indicates that the NALU holds a sequence or picture parameter set, or a slice or slice data partition of a reference picture, whose loss seriously affects subsequent decoding.

Therefore, when protecting the packets of an H.264 bitstream, the data can be divided into two classes according to the value of NRI (nal_ref_idc) or nal_unit_type: relatively important picture data (for example, nal_ref_idc equal to 1) and secondary picture data (for example, nal_ref_idc equal to 0). Important picture data is protected with a code FEC1 that has higher redundancy and stronger resistance to packet loss, while secondary picture data can be protected with a code FEC2 that has lower redundancy and weaker resistance to packet loss.

This unequal protection algorithm ensures that the important information is correctly recovered even under heavy packet loss, while picture information that the FEC2 code fails to recover is handled with techniques such as error concealment and prevention of error propagation. FEC1 and FEC2 are generic labels here for any two subtypes, which may belong to the same major type or to different major types.
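The two-class split can be sketched by reading the one-byte H.264 NAL unit header, which holds forbidden_zero_bit (1 bit), nal_ref_idc (2 bits) and nal_unit_type (5 bits). The rule "non-zero nal_ref_idc means important" and the labels "FEC1" and "FEC2" are placeholders for whichever subtypes the two parties negotiated; the function names are illustrative.

```python
from typing import Tuple


def parse_nal_header(first_byte: int) -> Tuple[int, int]:
    """Return (nal_ref_idc, nal_unit_type) from the first byte of a NALU."""
    nal_ref_idc = (first_byte >> 5) & 0x03
    nal_unit_type = first_byte & 0x1F
    return nal_ref_idc, nal_unit_type


def choose_fec(nalu: bytes, strong: str = "FEC1", weak: str = "FEC2") -> str:
    """Pick the stronger code for NALUs that other data depends on."""
    nal_ref_idc, _ = parse_nal_header(nalu[0])
    return strong if nal_ref_idc != 0 else weak
```

A sequence parameter set typically starts with 0x67 (nal_ref_idc 3, type 7) and would get the strong code, while a non-reference slice starting with 0x01 would get the weak one.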

Clearly, this method generalizes. The data can be divided into more classes according to the value of Nal_unit_type, for example five classes (most important, second most important, moderately important, less important, least important), or seven or more. The same number of FEC subtypes is then used for protection, with each class of data assigned a different subtype. The only requirement is that the protection capability of the subtypes ranges from weak to strong; they need not belong to the same major type. Picture information that cannot be recovered even after protection with the strongest FEC code is handled with techniques such as error concealment and prevention of error propagation.

Another form of unequal protection also falls within the scope of the invention: FEC codes with different protection capabilities can be selected according to the real-time condition of the network. The choice is signalled to both parties through the ERRTP header, so that they can decode the data correctly and recover lost data. The degree to which the network's transmission performance is currently degraded can be divided into several levels, for example five (most severe, second most severe, moderately severe, less severe, least severe), or seven or more. The same number of FEC subtypes is then used for protection, with each level assigned a different subtype. The only requirement is that the protection capability of the subtypes ranges from weak to strong; they need not belong to the same major type. Picture information that cannot be recovered even after protection with the strongest FEC code is handled with techniques such as error concealment and prevention of error propagation. Network conditions can be sensed with any of the existing QoS monitoring methods.

More elaborate schemes also fall within the scope of the invention. Suppose a total of T FEC schemes (of different types or subtypes) are available, that is, supported by the terminals at both ends. If the choice of FEC is to depend on both the importance of the data and the condition of the network, a two-dimensional look-up table (LUT) can be used, as shown in Table 3:

Table 3: Two-dimensional LUT for mixed and alternating use of multiple FEC mechanisms

Figure C200510110013D00621

In the table above, both the data importance levels and the network condition levels are listed in ascending order. The FEC entries carry a two-dimensional subscript: the error resilience mechanism FEC(i, j) in the table, with 0 < i ≤ U and 0 < j ≤ V, can be any one of the T FEC schemes mentioned above.
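A two-dimensional LUT of this kind might be sketched as below. Table 3 itself is only available as an image, so everything concrete here is an assumption made for illustration: the sizes U and V, the scheme names, the cell contents, and the choice that i indexes data importance while j indexes network condition.

```python
U, V = 3, 3  # hypothetical numbers of importance and network-condition levels

# Hypothetical cell contents: any of the T available schemes may appear
# in any cell, and a scheme may appear more than once.
LUT = [
    ["FEC_a", "FEC_a", "FEC_b"],
    ["FEC_a", "FEC_b", "FEC_c"],
    ["FEC_b", "FEC_c", "FEC_c"],
]


def select_fec(i: int, j: int) -> str:
    """Return FEC(i, j) for 0 < i <= U and 0 < j <= V (1-based, as in the text)."""
    if not (0 < i <= U and 0 < j <= V):
        raise ValueError("level out of range")
    return LUT[i - 1][j - 1]
```

Because the lookup is a plain table access, both terminals can evaluate it identically as long as they share the same table.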

It should be noted that the embodiments above use FEC erasure codes, and Tornado codes in particular, as examples. Other similar error resilience mechanisms, in particular FEC coding schemes other than Tornado codes, are equally applicable without affecting the spirit and scope of the invention.

Another embodiment of the invention uses an improved Tornado erasure code that generates only one layer of check nodes for a group of data nodes. This greatly reduces the coding delay and meets the requirements of real-time communication.

In real-time video communication, protecting packets with an FEC code introduces delay, and the amount of delay depends on the size of the picture data packets. Let S NALUs form one group, where each NALU contains the bitstream data of one slice. If each frame is coded as a single slice, the encoder incurs a delay of S frames, and the decoder likewise incurs a delay of S frames. The relationship between the NALUs and the number of data nodes is:

$$\sum_{i=0}^{S} \mathrm{NalSize}_i = \mathrm{PackSize} \times \mathrm{DataNode} \tag{1}$$

Here the sum of the S NALU lengths equals the number of data nodes multiplied by the packet size of each node. Equation (1) shows that when S is bounded, PackSize × DataNode is bounded as well. In addition, efficient transmission over an IP network requires that PackSize not be too small, so DataNode is bounded.
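As a worked illustration of Equation (1), the sketch below computes how many data nodes a group of NALUs occupies for a given per-node packet size. Rounding up to a whole node (that is, padding the last node) is an assumption added here; the text only states the equality. The function name is hypothetical.

```python
from typing import Sequence


def data_node_count(nal_sizes: Sequence[int], pack_size: int) -> int:
    """Number of data nodes needed for a group of NALUs.

    Implements sum(NalSize_i) = PackSize * DataNode, rounding up when the
    total is not an exact multiple of pack_size (assumed padding).
    """
    if pack_size <= 0:
        raise ValueError("pack_size must be positive")
    total = sum(nal_sizes)
    return -(-total // pack_size)  # ceiling division on integers
```

With a fixed PackSize, a smaller group (smaller S) directly yields fewer data nodes, which is the constraint the text describes.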

In real-time video communication over an IP network, the delay T_total of one frame is calculated as follows:

$$T_{\mathrm{total}} = T_{\mathrm{FEC}} + T_{\mathrm{codec}} + T_{\mathrm{trans}} \tag{2}$$

In this equation, T_FEC is the delay introduced by FEC protection, and T_codec and T_trans are the H.264 codec processing delay and the network transmission delay, respectively. Given the rapid development of digital signal processing and IP networks, it can be assumed that both T_codec and T_trans meet the real-time requirement:

$$T_{\mathrm{codec}} \le T_{\mathrm{th}}, \quad T_{\mathrm{trans}} \le T_{\mathrm{th}}, \quad \text{where } T_{\mathrm{th}} = 1/F_{\mathrm{target}}$$

Here F_target is the target decoding frame rate (for example 10 Hz or 30 Hz). Assuming each frame is coded as a single slice, Equation (2) becomes:

$$T_{\mathrm{total}} \le S \cdot T_{\mathrm{th}} + 2 \cdot T_{\mathrm{th}} = (S + 2) \cdot T_{\mathrm{th}}$$

The two expressions above show that the per-frame delay T_total is essentially determined by the value of S, and that DataNode in turn strongly affects the value of S. The delay introduced by FEC should therefore be minimized, provided that the resistance of the video communication to packet loss is preserved, so as to further guarantee the QoS of real-time video communication.
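The bound T_total ≤ (S + 2) · T_th can be evaluated numerically with a small helper. This is only an arithmetic sketch under the text's own assumptions (one slice per frame, T_codec and T_trans each at most one frame period); the function name is hypothetical.

```python
def delay_bound_seconds(s: int, f_target_hz: float) -> float:
    """Upper bound on per-frame delay: (S + 2) * T_th with T_th = 1 / F_target."""
    if s < 0 or f_target_hz <= 0:
        raise ValueError("s must be non-negative and f_target_hz positive")
    t_th = 1.0 / f_target_hz
    return (s + 2) * t_th
```

For instance, grouping S = 3 frames at a 10 Hz target gives a bound of five frame periods, so halving S has a far larger effect on the bound than any tuning of the codec or transport terms.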

For the case where DataNode is bounded, the invention uses an improved Tornado code protection algorithm. The improved method does not use multi-level bipartite-graph coding; it uses only a single layer of check nodes. Compared with the original Tornado coding, the improved method is considerably more flexible, since the numbers of data nodes and check nodes can be set arbitrarily, and it lowers the complexity of the encoding and decoding algorithms, so it can be used for packet-loss resilience in real-time video communication. Moreover, when the number of data nodes is bounded, the packet-loss resilience of the improved Tornado code is essentially undiminished. The principle and detailed steps of the improved Tornado coding method are described later.

Note that the ERRTP encapsulation of NALUs in the example above does not say how the NALU headers are handled. In the third embodiment of the invention, which builds on the second embodiment, ERRTP processes NALUs of the same kind together and merges their header information into the ERRTP header. The most basic difference from RTP is that, during ERRTP encapsulation, the header information of NALUs that share the same header is merged into the ERRTP header.

The structure of the NALU header was described earlier. To restate it, the NALU header contains, in order:

an F field of 1 bit, indicating whether the NALU contains an error;

an NRI field of 2 bits, indicating the importance of the NALU;

a Type field of 5 bits, indicating the type of the NALU.
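The three header fields listed above can be read from the first NALU byte with simple shifts and masks. The sketch below is illustrative; the class and function names are hypothetical, while the bit widths and order come from the list above.

```python
from typing import NamedTuple


class NaluHeader(NamedTuple):
    f: int      # 1 bit: error indication
    nri: int    # 2 bits: importance
    type: int   # 5 bits: NALU type


def parse_nalu_header(header_byte: int) -> NaluHeader:
    """Split one NALU header byte into its F, NRI and Type fields."""
    return NaluHeader(
        f=(header_byte >> 7) & 0x01,
        nri=(header_byte >> 5) & 0x03,
        type=header_byte & 0x1F,
    )
```

For example, the byte 0x67 decodes to F = 0, NRI = 3, Type = 7.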

The steps performed by the sender and the receiver are as follows. Following the ERRTP encapsulation format, the sender encapsulates multiple NALU data nodes or check nodes with identical header information into the same ERRTP packet. Practical engineering experience indicates that this can generally be done, because adjacent parts of an H.264 bitstream tend to have the same NALU type. Where it cannot, several countermeasures are available. First, NALUs of the same type can be accumulated until a given number is reached and then encapsulated into ERRTP. Second, if the number of NALUs of the same type falls short of that number, RTP padding can be used; this wastes a little bandwidth, but the cost is negligible. Third, if there are very many NALUs of differing types, plain RTP encapsulation can be used, since the receiver can tell the two apart by the ERRTP identifier and process each accordingly.

As mentioned above, in the ERRTP encapsulation format the header information shared by the carried NALUs is merged into the header of the ERRTP packet. The carried NALUs are stripped of their headers, then divided, encoded and encapsulated according to the procedure described earlier, and placed in the payload of the ERRTP packet. How, then, is the NALU header merged into the ERRTP header? Two schemes that address this question are given below.

In the ERRTP encapsulation format, the NRI field and the Type field of the NALU header are placed in the PT field of the ERRTP header. As described earlier, the PT field occupies the last 7 bits of the second byte of the ERRTP header. Figure 7 shows the format of such an ERRTP header, with the parts that differ from RTP shown in bold; some other parts of the figure are explained later.

There are two further points. First, the V field of the ERRTP header serves as the ERRTP identifier, as mentioned earlier. Second, the F field of the NALU header is placed in the M field of the ERRTP header, which is the first bit of the second byte of the ERRTP header. The receiver uses the M field of the ERRTP packet to decide whether the carried NALUs contain errors, which preserves the forbidden-bit function of the F field. In this scheme, the version value tells the receiver of the packet that the protocol in use is ERRTP, so that subsequent processing follows the ERRTP procedure.

In this scheme, the NALU header byte (8 bits) replaces the 1-bit M field and the 7-bit PT field of the original RTP header, 8 bits in total. The replacement order can, for example, be as follows:

the F bit replaces the M bit;

the 2 NRI bits replace the highest 2 bits of the 7-bit PT;

the 5 Type bits replace the lowest 5 bits of the 7-bit PT.
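The replacement order above can be written out bit by bit. The sketch below builds the second byte of the header (M followed by the 7-bit PT) from a NALU header byte; the function name is hypothetical and the rest of the ERRTP header is not modelled.

```python
def nalu_header_to_m_pt(nalu_header: int) -> int:
    """Map a NALU header byte onto the M bit and 7-bit PT of header byte 2."""
    f = (nalu_header >> 7) & 0x01
    nri = (nalu_header >> 5) & 0x03
    nal_type = nalu_header & 0x1F

    m = f                          # F bit replaces the M bit
    pt = (nri << 5) | nal_type     # NRI -> top 2 PT bits, Type -> low 5 PT bits
    return (m << 7) | pt
```

Because F, NRI and Type already occupy the same bit positions as M and PT, the resulting byte equals the original NALU header byte; the explicit field handling is kept only to mirror the steps in the text.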

This replacement scheme is well founded. As noted earlier, the 7 PT bits are available for free use. The use of the M field is specified in RTP (RFC 3550) as follows: a particular profile may specify that the M bit is not used and is instead merged into PT, so that PT can have up to 8 bits and distinguish 256 different types. Replacing the M bit with the F bit therefore conforms fully to RTP and does not cause interworking problems between ERRTP and conventional RTP.

The ERRTP encapsulation format of the invention has three clear advantages. First, the overhead is low: in particular, when one RTP packet carries multiple NALUs, the number of transmitted bits is noticeably reduced. Second, the relative importance of the NALUs can be determined without decoding the H.264 NALU data in the RTP packet. Third, it is possible to determine, again without decoding the H.264 NALU data in the RTP packet, whether the loss of other bits will prevent the packet from being decoded correctly.

To describe the technical details further, the ERRTP encapsulation and decapsulation process is given below. After the processing above, the H.264 NALUs in one ERRTP packet all have exactly the same type, that is, their header bytes are identical. When they are divided, encoded and encapsulated into the ERRTP packet, the original header bytes can therefore be stripped, saving N bytes for N NALUs. Decapsulation extracts and decodes the NALUs from the ERRTP packet and restores them to their original form: the N NALUs are extracted and decoded from the ERRTP packet that carries them, the 7 PT bits of the ERRTP header are copied into the lowest 7 bits of a byte H (8 bits), and the highest bit of H, which is the F bit, is set to 0. The resulting byte H is then prepended to each extracted NALU, which restores each NALU. If the F field carried in the ERRTP header is 1, the NALUs in that ERRTP packet contain errors and can simply be discarded, which also saves processing time.
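The header stripping and restoration described above can be sketched as follows. This is a simplified illustration: the division into data nodes, the FEC coding and the actual ERRTP packet layout are left out, NALUs are assumed to be non-empty byte strings, and both function names are hypothetical.

```python
from typing import List, Tuple


def strip_headers(nalus: List[bytes]) -> Tuple[int, List[bytes]]:
    """Remove the shared header byte; return the 7-bit PT value and the bodies."""
    if not nalus:
        raise ValueError("need at least one NALU")
    header = nalus[0][0]
    if any(n[0] != header for n in nalus):
        raise ValueError("NALUs in one ERRTP packet must share a header byte")
    # The low 7 bits (NRI + Type) go into PT; one byte is saved per NALU.
    return header & 0x7F, [n[1:] for n in nalus]


def restore_nalus(pt: int, bodies: List[bytes]) -> List[bytes]:
    """Rebuild each NALU by prepending H: F bit = 0, low 7 bits = PT."""
    h = pt & 0x7F
    return [bytes([h]) + body for body in bodies]
```

A round trip through `strip_headers` and `restore_nalus` reproduces the original NALUs whenever their F bit was 0, which is the case the text handles with ERRTP.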

The second scheme is given below. It shares one aspect with the first: the NRI and Type fields of the NALU header are again placed in the 7 bits of the PT field of the ERRTP header. It differs in two respects. The M field is used to identify ERRTP, which leaves nowhere to place the F field. This embodiment therefore treats NALUs differently depending on whether F is set: erroneous NALUs with F set are still sent with the original RTP, while normal NALUs are sent with ERRTP and the F bit is ignored. The details are as follows.

The M field is set to 1 to identify an ERRTP packet; the M field is the first bit of the second byte of the ERRTP header. As for the F bit, H.264 specifies that it is 1 if there is a syntax violation or an error. When the network detects a bit error in the unit, it can set the bit to 1 so that the receiver discards the unit. This is mainly intended for heterogeneous network environments, such as combined wired and wireless networks. In practice, the sending and receiving endpoints normally do not write this bit when encoding and decoding H.264 video; the decoder only reads it, and if it finds F = 1 it discards the NALU during decoding. In current industry practice, the F bit is written mainly at gateways between two different networks, for example when transcoding (MPEG-4 to H.264, H.263 to H.264, and so on).

The invention therefore ignores the F bit and does not use it for the purpose originally defined by H.264. The M field that would otherwise hold the F bit is thus freed to carry additional information for future extensions; here it is used to identify ERRTP packets. The advantage is that the version information V = 2 need not be changed: ERRTP keeps the original version value 2, which also conserves the limited space of RTP version values.

Inevitably, however, there will be rare cases in practice where the F bit is needed, for example when a NALU has a syntax error. The invention handles these cases as follows. In the ERRTP encapsulation format, the F field of the NALU header is ignored. At the sender, erroneous NALUs whose F field is set are still encapsulated in RTP packets, and only normal NALUs are encapsulated with ERRTP. The receiver determines whether a packet is an ERRTP packet or an RTP packet and processes it according to the corresponding format. In other words, in those special cases where the F bit must serve the purpose originally defined by H.264, namely indicating a possible H.264 NALU syntax error, an intermediate device such as a gateway that finds a syntax error in a NALU while encoding video according to H.264 must encapsulate that NALU separately.

The alternating use of ERRTP and RTP described above can be summarized as follows:

the sender first checks whether the F field in the header of each of at least one NALU is set, and on that basis divides the NALUs into normal NALUs and erroneous NALUs;

the sender then encapsulates the normal NALUs into ERRTP packets according to the ERRTP encapsulation format and sets the ERRTP identifier, and encapsulates the erroneous NALUs into RTP packets according to the RTP encapsulation format;

the receiver first checks whether the header of each received packet carries the ERRTP identifier, and on that basis divides the packets into ERRTP packets and RTP packets;

the receiver then processes ERRTP packets according to the ERRTP encapsulation format and RTP packets according to the RTP encapsulation format.
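The sender-side split and the receiver-side check in the steps above can be sketched for the second scheme, where the M bit set to 1 identifies ERRTP. This is a rough illustration with hypothetical function names; it assumes that plain RTP packets in the same session leave the M bit at 0, which the text implies but does not spell out, and it does not build real packets.

```python
from typing import List, Tuple


def split_by_f_bit(nalus: List[bytes]) -> Tuple[List[bytes], List[bytes]]:
    """Sender: separate normal NALUs (F = 0) from erroneous ones (F = 1)."""
    normal = [n for n in nalus if (n[0] >> 7) & 0x01 == 0]
    errored = [n for n in nalus if (n[0] >> 7) & 0x01 == 1]
    return normal, errored


def is_errtp(packet: bytes) -> bool:
    """Receiver: the M bit (top bit of the second header byte) marks ERRTP."""
    return (packet[1] >> 7) & 0x01 == 1
```

The normal list would then go through ERRTP encapsulation and the errored list through conventional RTP, one NALU per packet if necessary.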

Thus, for normal NALUs the gateway applies ERRTP encapsulation as described above, grouping H.264 NALUs of the same type according to a rule determined by the application (chiefly, how many NALUs of the same kind are encapsulated in each ERRTP packet). As soon as a NALU with a syntax error is found, that NALU is encapsulated with conventional RTP. The conventional RTP packet may then contain only one H.264 NALU.

A final point: the NALU types and corresponding Type values listed in Table 1 above show that fewer than 16 types currently exist, so the 5 Type bits can be reduced to 4 without affecting existing H.264 transmission. In the ERRTP encapsulation format, therefore, when there are fewer than 16 NALU types in total, only the low 4 bits of the Type field represent the type, and the highest bit of Type is kept as a reserved extension bit, called the C field, for future functional extensions. With bit C reserved, the NALU types of Table 1 are modified accordingly: there are 16 values in total, of which 0 to 12 are the same as in Table 1 and 13 to 15 are reserved.
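Splitting the 5-bit Type field into the reserved C bit and a 4-bit type can be sketched as below. The function name is hypothetical; the reserved range 13 to 15 follows the modified table described above.

```python
from typing import Tuple


def split_type_field(type5: int) -> Tuple[int, int]:
    """Return (C, type4): the top bit of Type and its low 4 bits."""
    if not 0 <= type5 <= 0x1F:
        raise ValueError("Type is a 5-bit field")
    c = (type5 >> 4) & 0x01
    type4 = type5 & 0x0F
    return c, type4


def is_reserved_type(type4: int) -> bool:
    """Values 13 to 15 are reserved in the modified 16-value table."""
    return 13 <= type4 <= 15
```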

Although H.264 currently defines only 13 NALU types, the standard will continue to evolve and may introduce more. If the number of NALU types grows beyond 16, the lowest 4 bits of the 7-bit PT together with the C bit are again needed to indicate the type.

The greatest benefit of merging the NALU header information into the ERRTP header is that multimedia transport equipment can learn about the NALUs carried in a packet directly from the ERRTP header and apply QoS policies for real-time H.264 multimedia transmission accordingly. This is not possible with existing RTP: the RTP layer is not concerned with NALU-layer information and cannot see the header of each NALU in the payload, so such QoS policies cannot be implemented.

On top of ERRTP, feedback from the receiver is provided by an enhancement in which SEI carries the QoS reports. As described earlier, RTCP provides the QoS reporting mechanism, but it is in fact a general-purpose reporting method that can report QoS as well as other information. For a specific video communication application, reporting through RTCP is not necessarily the best fit. If the sender and receiver of the QoS information can both communicate through a higher-layer protocol such as H.264, the report content can be carried in H.264 itself. Starting from this observation, the invention carries the QoS report information directly in H.264, which avoids the use of an additional channel and provides an "in-band" reporting mechanism.

A further reason for carrying QoS reports in the higher-layer H.264 protocol is that, in current video communication applications, adaptation to network transmission is implemented mainly in terminals rather than in intermediate network equipment such as routers, switches or gateways. Encapsulation and extraction of the QoS report therefore do not depend on the underlying protocol: QoS monitoring works as long as the terminal can interpret and extract the QoS report information carried in H.264, independently of lower-layer protocols such as RTCP. Adopting the H.264 "in-band" reporting mechanism does not exclude the RTCP reporting mechanism. The two can be used selectively or side by side, and the use of H.264 can in fact reduce RTCP report traffic. In addition, with H.264 "in-band" reporting, H.264 packets can be given various kinds of protection, and the H.264 packets that carry QoS reports can be treated as important data and, under the principle of unequal protection (Unequal Protection, "UEP"), given strong protection. This ensures that the report data arrives correctly and improves the reliability of QoS monitoring.

The fourth embodiment of the invention builds on the third embodiment and carries QoS reports through the extension message mechanism of H.264. It consists broadly of the following three basic steps:

First, each multimedia communication terminal compiles QoS reports for the H.264 multimedia communication. The content of these reports may be the same as that of RTCP SR and RR reports or may differ, but the information they describe about the service quality of the H.264 media communication and the state of the network is consistent.

Next, the terminal carries these QoS reports in H.264 extension messages and sends them to the other communication terminals. The H.264 extension message mechanism was mentioned earlier, SEI being a typical example. The invention essentially uses SEI messages, although other extension messages may be used as H.264 is extended in the future.

While sending QoS reports, each terminal also receives QoS reports from the other terminals, and each terminal applies its QoS policy on the basis of these reports.

The invention carries QoS reports in SEI messages. Taking the existing RTCP QoS reports as an example, the main content of RTCP SR and RR reports can be used directly as the payload of an H.264 SEI message, so that an extended SEI message carries this information.

On this basis, the fourth embodiment of the invention defines a specific SEI extension message dedicated to carrying QoS reports. H.264 specifies that SEI information is stored in a NALU with Type = 6, as noted earlier. The invention stores RTCP-like SR and RR report messages in the SEI field, which maintains transmission efficiency while effectively feeding back channel state and decoding information, so that the encoder and decoder can counter packet loss interactively. The structure is shown in Figure 8: apart from the header, which follows the SEI message structure, the QoS report content follows the format of RTCP SR and RR reports.

The header of the SEI message that carries a QoS report contains the following fields:

The first byte (byte 0) is the payload type field (SEI Type), indicating which QoS report the payload holds. In this embodiment, SEI Type = 200 indicates that the SEI field holds an RTCP-like sender report (SR), and SEI Type = 201 indicates a receiver report (RR).

The second and third bytes (bytes 1 and 2) are the payload length field (SEI Packet-Length), indicating the length of the QoS report. The length is defined in the same way as the length field of an RTCP QoS report.

The fourth and subsequent bytes are the payload of the SEI message, which holds the QoS report itself.
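The three-part layout above (1-byte type, 2-byte length, then the report) can be sketched with the standard `struct` module. The constants 200 and 201 come from the text; network (big-endian) byte order for the length field is an assumption, and the function names are hypothetical. The caller supplies the length value, since the text defines it only by reference to the RTCP length field.

```python
import struct
from typing import Tuple

SEI_TYPE_SR = 200  # sender report, per the embodiment
SEI_TYPE_RR = 201  # receiver report, per the embodiment


def pack_qos_sei(sei_type: int, length: int, report: bytes) -> bytes:
    """Build: byte 0 = SEI Type, bytes 1-2 = SEI Packet-Length, then the report."""
    if sei_type not in (SEI_TYPE_SR, SEI_TYPE_RR):
        raise ValueError("unknown QoS report type")
    return struct.pack("!BH", sei_type, length) + report


def unpack_qos_sei(message: bytes) -> Tuple[int, int, bytes]:
    """Return (SEI Type, SEI Packet-Length, report bytes)."""
    sei_type, length = struct.unpack("!BH", message[:3])
    return sei_type, length, message[3:]
```

A receiver would branch on the returned type to parse the remaining bytes as a sender or receiver report.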

QoS报告也分为发送方报告和接收方报告,由载荷类型字段指示区分,即SEI Type取值不同,QoS报告的具体内容可以与RTCP的SR、RR报告相同,比如图2中所示:The QoS report is also divided into a sender report and a receiver report, which are distinguished by the payload type field indication, that is, the value of the SEI Type is different, and the specific content of the QoS report can be the same as the SR and RR reports of RTCP, as shown in Figure 2:

版本信息字段(V),占2比特,本例取值为二进制11即V=3,表示与以前版本的区别;The version information field (V) occupies 2 bits, and the value of this example is binary 11, that is, V=3, indicating the difference from the previous version;

The padding field (P) occupies 1 bit and indicates whether padding is present, as in RTCP;

The reception report count field (RC) occupies 5 bits and indicates the number of reception report blocks contained in the QoS report;

The sender SSRC field occupies 32 bits and identifies the sender of the QoS report;

A sender report additionally contains a sender information block describing the sender of the report;

These are followed by multiple reception report blocks describing multimedia statistics from different sources. Each block contains the identifier of the source and the statistics for its multimedia stream; the meaning of each statistic was described earlier in the discussion of RTCP;

Finally, the report contains profile-specific extensions, reserved for profile-specific functional extensions.
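The V, P, and RC fields listed above together fill one byte. The following Python sketch packs and unpacks that byte, assuming the same bit order as RTCP (V in the two most significant bits, then P, then RC); the function names are hypothetical and the default version of 3 is the value used in this example of the embodiment.

```python
def pack_flags(version: int, padding: bool, report_count: int) -> int:
    """Pack V (2 bits), P (1 bit), and RC (5 bits) into one byte."""
    if not 0 <= version <= 3:
        raise ValueError("version must fit in 2 bits")
    if not 0 <= report_count <= 31:
        raise ValueError("report count must fit in 5 bits")
    return (version << 6) | (int(padding) << 5) | report_count

def unpack_flags(byte: int) -> tuple[int, bool, int]:
    """Recover (version, padding, report_count) from the packed byte."""
    return byte >> 6, bool((byte >> 5) & 1), byte & 0x1F
```

For instance, a report with V=3, no padding, and two reception report blocks starts with the byte 0xC2.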

As can be seen, the QoS report shown in Figure 8 is essentially the same as in RTCP. Once the basic RTCP content, the RR and SR, is written into the SEI field, no dedicated logical channel is needed to carry RTCP information, which saves some bandwidth. The essence of the invention is in-band carriage by SEI messages; how the QoS report statistics are gathered and generated does not affect the substance or scope of the invention, as long as the purpose of QoS monitoring is achieved.

Once QoS reporting is in place, various QoS strategies can be built on it. For example, the RTCP cumulative packet loss field can be used in two-way video communication (where each terminal has both an encoder and a decoder) to feed back decoding information, enabling interactive countermeasures against packet loss.

In addition, the QoS report contains fields such as interarrival jitter and the sender's byte count, both of which can be used to sense the network state. A rate control algorithm can use the interarrival jitter field to keep the encoder output rate closer to constant. The sender's byte count field can be used to estimate the average payload rate, so that the sender can reset encoder parameters according to the network state, including adjusting the target frame rate and restoring image quality and the original image resolution.
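As a rough illustration of the rate estimate mentioned above, the sketch below derives an average payload rate from the sender's byte count in two successive reports. The function name and the use of local arrival times in seconds are assumptions; the counter is treated as a 32-bit value that may wrap, as in RTCP.

```python
def average_payload_rate_bps(bytes_prev: int, bytes_now: int,
                             t_prev: float, t_now: float) -> float:
    """Average payload rate in bits per second between two reports."""
    if t_now <= t_prev:
        raise ValueError("reports must be in increasing time order")
    # Modular difference tolerates one wrap of the 32-bit byte counter.
    delta_bytes = (bytes_now - bytes_prev) % 2**32
    return 8 * delta_bytes / (t_now - t_prev)
```

The sender could compare this estimate with the encoder's target bitrate before changing the frame rate or resolution.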

To remedy the limited reliability of RTCP transmission, the H.264 in-band reporting method allows H.264 packets to be protected in several ways. H.264 packets that carry QoS reports can be treated as important data and, following the principle of unequal protection, given strong protection to ensure that the report data arrives correctly. For example, the SEI that carries the QoS report is itself carried in a NALU, and as noted earlier the NALU header can mark the importance of its content. The communication terminal can therefore set the NALU's nal_ref_idc field according to the reliability required for QoS report delivery, for example to 1, 2, or 3, and error-resilient coding then applies protection of a strength that depends on this level.
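For concreteness, the one-byte H.264 NALU header holds a forbidden zero bit, the 2-bit nal_ref_idc, and the 5-bit nal_unit_type. The Python sketch below builds that byte for an SEI NALU (type 6) with a chosen importance level; the function name is hypothetical, and using a nonzero nal_ref_idc for SEI follows the scheme described above.

```python
SEI_NAL_UNIT_TYPE = 6  # NALU type used for SEI, as stated earlier

def nal_header(nal_ref_idc: int, nal_unit_type: int = SEI_NAL_UNIT_TYPE) -> int:
    """Build the NALU header byte; the forbidden zero bit (bit 7) stays 0."""
    if not 0 <= nal_ref_idc <= 3:
        raise ValueError("nal_ref_idc must fit in 2 bits")
    if not 0 <= nal_unit_type <= 31:
        raise ValueError("nal_unit_type must fit in 5 bits")
    return (nal_ref_idc << 5) | nal_unit_type
```

A protection layer could read bits 5 and 6 of this byte to decide how much redundancy to spend on the packet.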

The communication terminal can also dynamically adjust the sending period of SEI-based QoS reports according to the current network state and the needs of higher-layer applications. By default, the interval at which RTCP information is written into the SEI field (the reporting period) matches the RTCP transmission interval recommended in RFC 3550. Depending on the needs of a particular application (a particular protection method, for example), the reporting period need not match RFC 3550 exactly and can be adjusted. For example, an important use of the report data is to estimate network performance dynamically: packet loss rate, delay, jitter, and so on. If these quantities must be measured frequently, the reporting period should be short; otherwise it can be long. When network conditions are good, reporting can be stopped. In addition, SEI messages can carry not only the QoS report for the H.264 video but also QoS reports for several media streams together, simply by appending the corresponding reception report blocks for each media stream to the QoS report. For an audio stream, for example, it suffices to add to the SR report a report block for that source's SSRC. As mentioned earlier, besides in-band monitoring with SEI, the communication terminal can also use existing RTCP transmission, and can use either or both of the H.264 extension message and RTCP to carry QoS reports.

Given the SEI-based feedback of network-related QoS reports from the receiver described above, adaptive adjustment of the protection strategy, including multi-level protection and unequal protection, is straightforward to implement. To address the inability of the prior art to adapt to network conditions, the fifth embodiment of the present invention provides an adaptively protected video transmission method that gathers statistics on current communication conditions and adjusts the protection strategy accordingly. First, based on how parameters affect the performance of the protection method, different parameter configurations are defined to form multi-level protection strategies of differing strength, from which one is selected for efficient and reliable protection under given communication conditions. Second, the receiver gathers statistics on network conditions and communication quality and sends them back to the sender. Finally, the sender uses the returned statistics to select the most suitable protection level.

The scheme also hinges on how communication quality is measured and on the channel used to return the statistics. Gaps in H.264 NALU sequence numbers can be used to determine the packet loss rate and the positions of the losses, and an extended SEI message structure, defined in the payload portion of the NALU, carries these statistics from the receiver to the sender. Although this feedback differs somewhat from the SR/RR format of the QoS report, those skilled in the art will appreciate that the two approaches rest on the same principle and differ only in what the SEI carries. The description below therefore does not separately distinguish the scheme in which SEI carries the network packet loss rate from the QoS report scheme.

Take the Tornado erasure code as an example: video stream data is protected using the Tornado encoding and decoding methods described earlier. The parameters to be set for a Tornado erasure code are the number of data nodes, the number of check nodes, the shrinking ratio, the number of check-node layers, and the bipartite graph at each level used to compute the check nodes. During video stream communication, the sender divides the video stream data into data nodes, generates check nodes according to the Tornado encoding method, and sends both to the receiver; the receiver performs error correction according to the Tornado decoding method to recover the video stream data.

Because factors such as the bandwidth of a real IP network change frequently and are unstable, a fixed protection strategy leads to problems such as low efficiency or a high bit error rate. This embodiment therefore predefines a series of protection strategies of graded strength, each used to protect video stream data at a different level of communication quality. Graded protection strategies can adapt to changes in network communication quality: they meet the protection requirements when the channel degrades, and allow the protection strength to be lowered when conditions improve, reducing system overhead and saving processing and bandwidth resources.

To define protection strategies of different levels, Tornado erasure codes with different parameters must be configured. As described earlier, the parameters that chiefly affect the protection performance of a Tornado erasure code are the number of data nodes, the number of check nodes, and the random distribution of the node degree vectors on the two sides of the bipartite graph. Tornado codes of different capability generally do not share a common bipartite graph, so for simplicity the different protection strengths are obtained by varying the number of data nodes and the number of check nodes. By the principle of Tornado erasure codes, these two numbers determine the code rate, or redundancy, of the code, and hence its protection strength and system overhead.
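The relation between node counts and overhead can be made concrete with a small Python sketch. It uses the standard definition of code rate, data symbols divided by total symbols; the function names and the example node counts in the usage note are illustrative assumptions, not values from the patent.

```python
from fractions import Fraction

def code_rate(data_nodes: int, check_nodes: int) -> Fraction:
    """Fraction of transmitted nodes that carry source data."""
    if data_nodes <= 0 or check_nodes < 0:
        raise ValueError("need data_nodes > 0 and check_nodes >= 0")
    return Fraction(data_nodes, data_nodes + check_nodes)

def redundancy(data_nodes: int, check_nodes: int) -> Fraction:
    """Fraction of transmitted nodes that are check nodes (overhead)."""
    return 1 - code_rate(data_nodes, check_nodes)
```

With 100 data nodes, raising the check nodes from 25 to 100 lowers the code rate from 4/5 to 1/2, that is, stronger protection at the cost of more bandwidth.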

The receiver receives the data and performs Tornado erasure decoding to obtain the video stream data, while gathering statistics on data loss that characterize the communication quality.

The sender needs to adjust the protection strategy according to communication quality, so the transmission must be measured; the receiver does this using the sequence numbers of the NALUs of the H.264 video stream data. In H.264-based two-way video communication, every terminal has both an encoder and a decoder. NALUs are numbered sequentially, that is, all NALUs sent by a sender share a single sequence numbering, so the receiver can tell from the sequence numbers of received NALUs whether any NALU has been lost. A gap in the NALU sequence numbers indicates loss: the missing sequence numbers are those of the lost NALUs, and their count is the number of lost NALUs. Accumulating over a period of time gives the total number of NALUs lost in that period; normalizing by the total number of NALUs in the same period gives the accumulated loss rate (Accumulated Lost Slice Rate, ALSR). Alternatively, the receiver can send the loss information directly back to the sender, which then computes the statistics. Counting by NALU sequence number keeps the statistics accurate and reuses existing data, so no extra carriage overhead is needed.
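The gap-counting procedure above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: sequence numbers arrive in increasing order, do not wrap around within the measurement window, and losses before the first or after the last received NALU are not visible. The function names are hypothetical.

```python
def find_lost(received_seq: list[int]) -> list[int]:
    """Return the sequence numbers missing between consecutive arrivals."""
    lost = []
    for prev, cur in zip(received_seq, received_seq[1:]):
        lost.extend(range(prev + 1, cur))  # empty when cur == prev + 1
    return lost

def alsr(received_seq: list[int]) -> float:
    """Lost NALUs divided by all NALUs sent in the observed window."""
    if not received_seq:
        return 0.0
    total_sent = received_seq[-1] - received_seq[0] + 1
    return len(find_lost(received_seq)) / total_sent
```

For the arrivals 1, 2, 5, 6, 8 the missing numbers are 3, 4, and 7, so three of eight NALUs were lost.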

The receiver sends the statistics and other data loss information back to the sender in an extended SEI message. This embodiment defines an extended SEI message structure dedicated to carrying the transmission statistics returned by the receiver. After computing the statistics, the receiver writes them into the specially defined extended SEI message body, writes that message into the SEI field of the coded bitstream that the terminal sends in the return direction, and sends it to the sender. On receiving the SEI message, the sender either reads the statistics directly or computes the ALSR itself, which gives the sender a true picture of the network packet loss rate.

As described earlier, SEI messages are also carried by the NALU, the basic unit of the H.264 bitstream. Each SEI field contains one or more SEI messages, and each SEI message consists of an SEI header and an SEI payload. The SEI header contains two codewords: the payload type and the payload size. The payload type has variable length: a type between 0 and 255 is represented by one byte, a type between 256 and 511 by two bytes, 0xFF00 to 0xFFFE, and so on, so users can define arbitrarily many payload types. In the existing H.264 standard, types 0 through 18 are already defined for specific information, such as buffering period and picture timing. The SEI field defined in H.264 can thus hold as much user-defined information as needed. In the first embodiment of the present invention, an extended SEI message for carrying statistics is defined among the reserved SEI payload types.

Finally, the sender adjusts the Tornado erasure code according to the returned statistics, choosing a protection strategy of the level best suited to current transmission conditions. The sender predefines a series of decision thresholds corresponding to the protection levels, one threshold for entry into each level, and selects the level according to the interval in which the ALSR falls. The resulting mechanism of measurement, feedback, and adjustment adapts accurately and promptly to network transmission conditions and improves protection.

Different series of protection strategies are used for data of different importance. Because critical data and non-critical data require different protection strengths, two separate series of protection strategies are defined to improve adaptability further, one for critical data and one for non-critical data. The two kinds of data can then be handled independently, each with a protection strategy of the strength it requires, which improves system efficiency.

For example, Tornado codes of different levels serve as the series of protection schemes. The protection level is characterized by the parameters $n$ and $l$, where $n$ is the number of data nodes and $l$ is the number of check nodes, and $TN(n+l, n)$ denotes the Tornado code protection scheme determined by $n$ and $l$. The series of protection schemes for critical data is then $TN_K(n_0+l_0, n_0),\ TN_K(n_1+l_1, n_1),\ \ldots,\ TN_K(n_{L-1}+l_{L-1}, n_{L-1})$, and likewise the series for non-critical data is $TN_{NK}(n_0+l_0, n_0),\ TN_{NK}(n_1+l_1, n_1),\ \ldots,\ TN_{NK}(n_{L-1}+l_{L-1}, n_{L-1})$. A series of thresholds $0 < G_1, G_2, \ldots, G_{L-1} < 1$ is set for selecting the protection level. When adjusting the protection strategy, the sender acts as follows according to the relation between the ALSR and the thresholds $G_1, G_2, \ldots, G_{L-1}$:

If $0 < \mathrm{ALSR} < G_1$, critical data is protected with $TN_K(n_0+l_0, n_0)$ and non-critical data with $TN_{NK}(n_0+l_0, n_0)$;

If $G_i < \mathrm{ALSR} < G_{i+1}$ for $i = 1, 2, \ldots, L-2$, critical data is protected with $TN_K(n_i+l_i, n_i)$ and non-critical data with $TN_{NK}(n_i+l_i, n_i)$;

If $G_{L-1} < \mathrm{ALSR} < 1$, critical data is protected with $TN_K(n_{L-1}+l_{L-1}, n_{L-1})$ and non-critical data with $TN_{NK}(n_{L-1}+l_{L-1}, n_{L-1})$.
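The three cases above amount to finding the interval of the threshold series in which the ALSR falls. A minimal Python sketch using the standard `bisect` module is shown below. The function name is hypothetical, and one detail is an assumption: the source uses strict inequalities and leaves the case where the ALSR equals a threshold undefined, so here an ALSR exactly equal to a threshold is placed in the higher (stronger) level.

```python
from bisect import bisect_right

def select_level(alsr: float, thresholds: list[float]) -> int:
    """Return the protection level index i in 0..L-1 for the given ALSR.

    thresholds holds G_1 .. G_{L-1} in increasing order.
    """
    if not 0.0 <= alsr <= 1.0:
        raise ValueError("ALSR must lie in [0, 1]")
    if thresholds != sorted(thresholds):
        raise ValueError("thresholds must be increasing")
    # Count of thresholds at or below the ALSR is the level index.
    return bisect_right(thresholds, alsr)
```

The returned index would select the same position in both scheme series, for critical and for non-critical data.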

In addition, the sender retransmits lost data based on the loss information returned by the receiver. When the receiver gathers statistics on lost NALUs, it also obtains location information for the picture data that each lost NALU contained: the number of the frame and the position within the frame. The receiver sends this location information back to the sender, which locates the corresponding video stream data and sends it again. In real-time video communication, video data delayed too long has lost its value, but under some service requirements or mechanisms, data with a certain delay is still useful. For example, in video communication with a large buffer, delayed video data that still falls within the buffer can be used to avoid interruptions in playback. The retransmission mechanism is therefore valuable for improving the reliability and quality of service of video stream communication.

Beyond the error-resilience protection strategy, the sixth embodiment of the present invention builds on the fifth embodiment and addresses error concealment and the elimination of error propagation, combining an error concealment strategy at the receiver with an error propagation elimination strategy at the sender, so as to minimize the loss of video quality caused by errors while preventing errors from propagating. For error concealment, a simple substitution scheme compensates for the loss at the lowest possible complexity. For eliminating error propagation, an error information feedback mechanism is established over the existing H.264 channel, and intra coding is applied according to the feedback. This eliminates propagation without adding network load, makes the video bitstream robust to errors, and also avoids the error propagation that concealment would otherwise cause.

The basic idea of the scheme is that the receiver, by tracking NALU sequence numbers, discovers information about lost data, such as the position of the Slice. On the one hand, it uses an efficient algorithm to substitute for the lost data and conceal the loss; on the other hand, it feeds the error information back to the sender. The extended H.264 SEI message provides the error information feedback channel from receiver to sender. On learning of the error, the sender immediately applies a strategy of segment-by-segment successive intra coding, refreshing the affected Slice in segments to prevent the error from propagating.

In H.264 video communication, the sender encodes the video stream data to be sent to obtain a video bitstream, encapsulates it in NALUs, and sends it to the receiver in packets. The receiver receives and decodes the packets, and at this point must determine whether any video stream data has been lost in order to carry out the subsequent error elimination. The error elimination procedure has three main steps: concealment, feedback, and propagation elimination.

First, the receiver determines from gaps in the NALU sequence numbers whether data has been lost, and collects information about the lost data, that is, the error information. As described earlier, the NALU is the basic unit of H.264 video stream data transmission, and each NALU has a unique, consecutive sequence number. The receiver therefore learns which NALUs have been lost from gaps in the received sequence numbers, and can then apply the error concealment strategy to the lost data. Counting by NALU sequence number keeps the statistics accurate and reuses existing data, so no extra carriage overhead is needed.

The receiver first reads the sequence number from the header of each received NALU and detects an error from a discontinuity in the sequence numbers. From the preceding NALU it infers the video data that the missing NALU should have carried, and so locates the data loss caused by the error. For example, if the NALU preceding the lost one carried the first Slice of frame N, then by transmission order the lost NALU carried the next Slice of the same frame.

Next, the receiver must resynchronize to the video information. During continuous transmission of an H.264 video bitstream, the receiver must be synchronized with the data stream to receive it correctly; once the stream is interrupted, the receiver must synchronize again, and the decoder resynchronizes by finding the next NALU header after the interruption. In this process the receiver also uses the sequence number of that next NALU to determine how many NALUs were lost in between and where they were located.

The receiver then performs error concealment. A NALU with lost data is discarded in its entirety, so the whole Slice it carried is lost. The error concealment strategy is simple substitution: the lost data is replaced with adjacent data in the temporal or spatial domain, for example by concealing it with the reconstructed picture data of the Slice at the corresponding position in the frame preceding the one that contains the lost data.
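Temporal substitution of a lost Slice can be illustrated with a toy Python sketch. It assumes, purely for illustration, that a frame is a list of pixel rows and that the lost Slice covers a contiguous range of whole rows; a real decoder would work on macroblocks of reconstructed picture data. The function name is hypothetical.

```python
def conceal_slice(prev_frame: list[list[int]], cur_frame: list[list[int]],
                  first_row: int, last_row: int) -> list[list[int]]:
    """Overwrite the lost rows of cur_frame with the co-located rows
    of the previous decoded frame and return cur_frame."""
    if len(prev_frame) != len(cur_frame):
        raise ValueError("frames must have the same height")
    for row in range(first_row, last_row + 1):
        cur_frame[row] = list(prev_frame[row])  # copy, do not alias
    return cur_frame
```

Because the substituted rows only approximate the lost content, later frames predicted from them can drift, which is why the feedback and intra refresh steps are still needed.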

After obtaining the error information, the receiver feeds it back to the sender. Feeding back error information requires a feedback channel. To reduce network load and simplify implementation, the first embodiment of the present invention uses the existing H.264 communication mechanism and defines an extended SEI message to carry the error information, so that the sender can use this information to prevent error propagation. Indeed, only the combination of the error information feedback mechanism with the sender's propagation elimination strategy can avoid the error propagation caused by the concealment performed at the receiver.

In the preceding embodiments, the extended H.264 SEI message provides an information feedback mechanism from receiver to sender, so that the sender learns promptly which NALUs have been lost and can eliminate error propagation in time, preventing the lost data from causing later errors to spread.

Establishing the information feedback mechanism within the H.264 framework saves network bandwidth and system processing resources and does not affect interoperability. The extended SEI message is defined as follows. As described earlier, SEI messages are also carried by the NALU, the basic unit of the H.264 bitstream. Each SEI field contains one or more SEI messages, and each SEI message consists of an SEI header and an SEI payload. The SEI header contains two codewords: the payload type and the payload size. The payload type has variable length: a type between 0 and 255 is represented by one byte, a type between 256 and 511 by two bytes, 0xFF00 to 0xFFFE, and so on, so users can define arbitrarily many payload types. In the existing H.264 standard, types 0 through 18 are already defined for specific information, such as buffering period and picture timing. The SEI field defined in H.264 can thus hold as much user-defined information as needed.

The sender then begins eliminating error propagation according to the error information fed back. Propagation elimination that uses error information is more effective than existing methods without feedback. With the error information, such as the position of the lost Slice, the sender can take targeted preventive measures for that Slice, for example avoiding the lost Slice as a reference in subsequent encoding, which minimizes the receiver's dependence on that Slice during decoding.

Because H.264 encoding is Slice-based, the data of the same Slice in successive frames are linked by reference: the data of a Slice in a later frame is predictively coded from the same Slice in an earlier frame, so error propagation is confined to that Slice. The second embodiment of the present invention uses a strategy of segment-by-segment successive intra coding: after an error occurs, the region of that Slice in subsequent frames is divided into segments that form new Slices, for example P macroblocks are split off as a new Slice, and the new Slice is intra coded to remove its reference to, or dependence on, the lost Slice. To maintain transmission quality, a real-time H.264 video transmission system uses rate control to limit the fluctuation of data per frame, balancing the amount of data in each frame and making transmission more stable. The amount of data intra coded in any one frame, that is, the number of macroblocks, therefore cannot be too large, or it would exceed what H.264 rate control allows.

Fig. 9 shows the principle of error diffusion elimination by segmented successive intra-frame coding. When an unrecoverable packet loss occurs at the receiving end, the receiving end detects it and feeds the error information back to the sending end; that is, the frame containing the lost slice and the slice's position within that frame are returned in an extended SEI message. The sending end extracts the location of the lost slice from the SEI message. In Fig. 9, for example, each frame is divided into three slices, Slice#0, Slice#1 and Slice#2, and Slice#1 of frame n is lost in transmission, after which segmented successive intra-frame coding must be performed.

First, in frame n, the encoder takes P macroblocks from the start of Slice#1, in macroblock scan order, to form a new Slice#3; the remaining macroblocks still belong to Slice#1. There are now four slices, and the new Slice#3 is intra-coded.

Next, in frame n+1, the Slice#3 formed in the previous step is sent as Slice#3 after intra-frame coding, while the other slices are still encoded conventionally.

The encoder then checks whether Slice#1 still contains macroblocks. If any remain unsplit, it returns to the first step and, in the next frame, forms a further new slice from the remaining macroblocks of Slice#1, intra-codes it and sends it, repeating until all macroblocks have been processed.

The number of macroblocks P split off each time should be as large as possible, to reduce the number of splits, the processing delay and the extent of the impact, while still staying within the H.264 rate-control range described above. P may differ from one split to the next, but the final split must cover all remaining macroblocks of the lost slice.

For example, suppose one frame of the video stream consists of 240 macroblocks, initially divided into slices of 80 macroblocks: macroblocks 1-80 form Slice#0, 81-160 form Slice#1 and 161-240 form Slice#2. Suppose the rate calculation gives a suitable segment size of P = 12 macroblocks. After Slice#1 is lost in frame n, its 80 macroblocks are intra-coded segment by segment. In frame n+1 the first 12 macroblocks are intra-coded as Slice#3, so that in frame n+2 Slice#3 can return to conventional predictive coding while the next 12 macroblocks are intra-coded as Slice#4, and so on until frame n+7, where the last 8 macroblocks are intra-coded as Slice#9. This completes the segmented successive intra-frame coding procedure for error diffusion elimination.
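
The worked example above can be checked with a short sketch. The function below is illustrative only: the name `intra_refresh_schedule` and its parameters are assumptions introduced here, not part of the patent, and the segment size P is taken to be the same for every split. It enumerates which macroblocks would be intra-coded as a new slice in each frame after the loss.

```python
def intra_refresh_schedule(first_mb, last_mb, p, first_frame_offset=1, first_new_slice=3):
    """Split macroblocks first_mb..last_mb (inclusive) into runs of at most p.

    Returns one tuple per frame in which a run is intra-coded:
    (frame offset from n, new slice number, first macroblock, last macroblock).
    """
    schedule = []
    start = first_mb
    step = 0
    while start <= last_mb:
        end = min(start + p - 1, last_mb)  # the last run may be shorter than p
        schedule.append((first_frame_offset + step, first_new_slice + step, start, end))
        start = end + 1
        step += 1
    return schedule


# Lost Slice#1 covers macroblocks 81-160; P = 12 as in the example.
schedule = intra_refresh_schedule(81, 160, 12)
for offset, slice_no, lo, hi in schedule:
    print(f"frame n+{offset}: Slice#{slice_no} intra-codes macroblocks {lo}-{hi} ({hi - lo + 1} MBs)")
```

With these inputs the schedule has seven entries, the last being frame n+7 with the 8 macroblocks 153-160 as Slice#9, which agrees with the example in the text.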

Experiments show that the combined error concealment and error diffusion elimination method of the present invention yields very good video quality.

Finally, an improved Tornado coding scheme is proposed; the seventh embodiment of the present invention uses it as the error resilience protection strategy. Its main differences from the conventional coding scheme are outlined below.

When Tornado codes are used to protect data in transit, adding more check node layers strengthens the protection to some extent, but it also increases the computational load of the code, so the protected data pays the price of a long delay. If the number of check node layers can be reduced without significantly weakening the protection, the computational load of the Tornado code falls and the transmission delay is greatly reduced, giving a better ratio of protection performance to cost. The seventh embodiment of the present invention therefore provides an erasure code with only one check node layer and protects data in transit with that erasure code.

This Tornado erasure code has only one check node layer. The intermediate check node layers of the Tornado code are removed, as is the Tornado code's inherent requirement that the last check node layer be generated by Reed-Solomon coding. As shown in Fig. 10, the erasure code of the present invention therefore has only one data node layer and one check node layer; it can be regarded as a structurally simplified, improved Tornado code.

The data node size L1, the number n of data nodes in the data node layer and the number L of check nodes in the check node layer of the improved Tornado code can be chosen according to actual requirements, for example according to the data transmission rate, the data type (such as audio or video data), the required protection capability and the maximum tolerable network delay.
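
To make the single-layer structure concrete, the following is a minimal sketch of an erasure code with one data node layer and one check node layer. It rests on assumptions that the text above does not state: each check node is taken to be the XOR of the data nodes it is connected to, the connection graph `GRAPH` is a small hand-picked example, and decoding uses simple iterative substitution. The names `encode`, `decode` and `GRAPH` are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data_nodes, graph):
    """Compute one check node per entry of graph as the XOR of its data nodes."""
    checks = []
    for neighbours in graph:
        acc = bytes(len(data_nodes[0]))  # all-zero node of size L1
        for i in neighbours:
            acc = xor_bytes(acc, data_nodes[i])
        checks.append(acc)
    return checks


def decode(data_nodes, checks, graph):
    """Recover lost data nodes (given as None) wherever the graph allows it."""
    data = list(data_nodes)
    progress = True
    while progress:
        progress = False
        for check, neighbours in zip(checks, graph):
            missing = [i for i in neighbours if data[i] is None]
            if check is None or len(missing) != 1:
                continue  # this check node cannot resolve anything yet
            acc = check
            for i in neighbours:
                if data[i] is not None:
                    acc = xor_bytes(acc, data[i])
            data[missing[0]] = acc
            progress = True
    return data


# Hypothetical graph: n = 6 data nodes, L = 3 check nodes.
GRAPH = [[0, 1, 2], [2, 3, 4], [4, 5, 0]]
original = [bytes([i] * 4) for i in range(6)]  # data node size L1 = 4 bytes
checks = encode(original, GRAPH)

damaged = list(original)
damaged[1] = damaged[2] = None  # two data nodes lost in transit
recovered = decode(damaged, checks, GRAPH)
```

In this example node 2 is recovered first from the second check node, after which the first check node has only one missing neighbour and yields node 1. Whether a given loss pattern is recoverable depends entirely on the graph chosen.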

Suppose a prior-art Tornado code has m intermediate check node layers, that the geometric reduction factor of the node count between adjacent layers, from the data node layer to the m-th intermediate layer, is

[formula image C200510110013D0082153300QIETU]

and that the geometric reduction factor of the node count between the last layer and the m-th layer is [formula missing in source]. The total number of nodes, TotalNode, of the prior-art Tornado code is then:

[formula image C200510110013D00823]

Since TotalNode = n + L, the choice of L is constrained:

[formula image C200510110013D00824]

so L cannot be set arbitrarily. Because every layer of a Tornado code must contain an integer number of nodes, [formula missing in source] and

[formula image C200510110013D00826]

must both be integers. This is called the implicit integer node count condition. Given m and the reduction factor of a Tornado code, this condition determines the values that n may take. For example, when m = 3 and

[formula image C200510110013D00828]

it follows that n = 16k, where k is any natural number, so the smallest value n can take is 16. The code rate r of the prior-art Tornado code is:

[formula image C200510110013D00829]

and its redundancy rate 1 - r is:

[formula image C200510110013D008210]

[formula image C200510110013D008211]
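
The formula images for this passage are not available, so the following is only a sketch under stated assumptions, not the patent's own derivation. It assumes that each layer has beta times as many nodes as the layer before it, that beta = 1/2, and that the integer condition must hold down to the (m+1)-th reduction. The symbol `beta` and the function `smallest_n` are hypothetical. Under those assumptions the smallest admissible n for m = 3 comes out as 16, which is consistent with the n = 16k stated in the text.

```python
from fractions import Fraction


def smallest_n(m, beta):
    """Smallest n for which n * beta**i is an integer for every i = 1 .. m+1."""
    n = 1
    while any((n * beta**i).denominator != 1 for i in range(1, m + 2)):
        n += 1
    return n


beta = Fraction(1, 2)  # assumed value; the source formula image is missing
n_min = smallest_n(3, beta)
print(n_min)
```

The single-layer code described next has no intermediate layers, so no condition of this kind arises for it.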

Because the improved Tornado code of the present invention has no intermediate check node layers, it is not subject to the implicit integer node count condition above. The number L of check nodes in its check node layer is [formula missing in source], and the geometric reduction factor between the node counts of its data node layer and its check node layer,

[formula image C200510110013D008213]

can be set arbitrarily. For a given number n of data nodes, L can therefore be chosen flexibly.

The code rate r of the improved Tornado code of the present invention is:

$$r = \frac{n}{n + L}$$

and its redundancy rate 1 - r is:

$$1 - r = \frac{L}{n + L}$$

The improved Tornado code of the present invention can be written TN(n+L, n). For example, TN(30, 20) denotes a code with n = 20 data nodes in the data node layer and L = 10 check nodes in the check node layer. In this case the improved Tornado code has

[formula image C200510110013D008216]

and a code rate of r = 2/3 ≈ 66.7%.
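
The TN(30, 20) figures can be reproduced with a few lines. This sketch assumes the usual definition of code rate as data nodes divided by total nodes, which matches the r = 2/3 quoted above; the function name `code_rate` is hypothetical.

```python
from fractions import Fraction


def code_rate(n, num_checks):
    """Code rate of TN(n + num_checks, n): data nodes over total nodes."""
    return Fraction(n, n + num_checks)


r = code_rate(20, 10)
print(r, f"{float(r):.1%}", 1 - r)  # prints: 2/3 66.7% 1/3
```

The redundancy rate 1 - r is then 1/3, meaning one node in three carries check data.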

In summary, the present invention combines the six enhancement techniques described above, implements the whole H.264/ERRTP transmission architecture in modular form and integrates the modules in a single protocol stack. Each technique thus retains its own advantages, and by reinforcing one another they deliver better reliability and quality of service.

Those skilled in the art will understand that the specific implementation details and parameter choices in the descriptions of the seven embodiments above can be determined differently for a particular application without affecting the essence and scope of the present invention.

Although the present invention has been illustrated and described with reference to certain preferred embodiments, those of ordinary skill in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the present invention.

Claims (44)

1. A multimedia communication method, characterized in that its communication process comprises the following steps:
A. the sending end protects multimedia data according to an error resilience protection strategy based on the real-time transport protocol and sends the data to the receiving end, information about the error resilience protection strategy being carried in the error-resilient real-time transport protocol packet that encapsulates the multimedia data, wherein the error-resilient real-time transport protocol packet is a data packet containing both the multimedia data and the information about the error resilience protection strategy that protects that data;
B. the receiving end receives the multimedia data and, if a transmission error occurs, recovers or partially recovers the multimedia data using the error resilience protection strategy carried in the error-resilient real-time transport protocol packet that encapsulates the multimedia data.
2. The multimedia communication method according to claim 1, characterized in that its communication process further comprises the following step:
C. the receiving end collects communication quality statistics, generates a quality-of-service report and sends it back to the sending end;
in step A, the sending end adjusts the error resilience protection strategy according to the quality-of-service report.
3. The multimedia communication method according to claim 2, characterized in that its communication process further comprises the following steps:
D. the receiving end collects transmission error information and applies an error concealment strategy;
in step C, the receiving end also feeds the transmission error information back to the sending end;
E. the sending end applies an error diffusion elimination strategy according to the transmission error information.
4. The multimedia communication method according to claim 3, characterized in that step A comprises the following sub-steps:
A1. the sending end selects an error-resilient coding scheme and applies forward error correction coding to the multimedia data;
A2. the sending end encapsulates the encoded multimedia data in error-resilient real-time transport protocol packets, carries information about the error-resilient coding scheme in the header of each such packet, and then sends the packets to the receiving end;
step B comprises the following sub-steps:
B1. the receiving end decapsulates the received error-resilient real-time transport protocol packets and extracts the information about the forward error correction coding scheme from the packet header;
B2. if error-resilient real-time transport protocol packets corresponding to data nodes are lost in transmission, the receiving end selects the forward error correction decoding scheme according to the information about the error-resilient coding scheme, performs forward error correction decoding, and recovers or partially recovers the lost multimedia data; if no data node is lost, forward error correction decoding is not needed.
5. The multimedia communication method according to claim 4, characterized in that the multimedia data after forward error correction coding is divided into two classes of nodes: data nodes and check nodes.
6. The multimedia communication method according to claim 5, characterized in that, in step A1, the sending end selects the forward error correction coding scheme according to at least one of the following factors:
the current network transmission conditions and the quality-of-service level of the multimedia data to be sent, where that quality-of-service level depends on the relative importance of the different data.
7. The multimedia communication method according to claim 6, characterized in that the header of the error-resilient real-time transport protocol packet comprises:
an error-resilient real-time transport protocol identification field, indicating that the packet is distinct from a real-time transport protocol packet;
a forward error correction coding type field, indicating the type of forward error correction code used by the forward error correction coding scheme;
a forward error correction coding subtype field, indicating the parameter settings of the forward error correction coding scheme;
a data packet length field, indicating the length of the nodes obtained by applying the forward error correction coding scheme to the multimedia data;
a data packet count field, indicating the number of such nodes carried in this error-resilient real-time transport protocol packet.
8. The multimedia communication method according to claim 7, characterized in that, in step A1, the sending end divides at least one H.264 network abstraction layer unit into at least one data node of equal length and applies forward error correction coding to the data nodes to obtain at least one check node;
in step A2, the sending end encapsulates the data nodes and the check nodes in at least one error-resilient real-time transport protocol packet and sends it;
in step B1, the receiving end decapsulates the received error-resilient real-time transport protocol packets to obtain the data nodes and the check nodes;
in step B2, if data nodes are lost in transmission, the receiving end uses the check nodes to recover or partially recover the lost data nodes by forward error correction decoding, and partitions the data nodes to obtain the H.264 network abstraction layer units.
9. The multimedia communication method according to claim 8, characterized in that, before transmission begins, the method further comprises the following step:
the sending end and the receiving end negotiate and determine, for each forward error correction code type, the correspondence between the values of the forward error correction coding subtype field and the parameter settings of that forward error correction code type which those values indicate.
10. The multimedia communication method according to claim 9, characterized in that the sending end and the receiving end each build a mapping table from the correspondence indicated by the forward error correction coding subtype field, the table being used to look up the corresponding forward error correction coding or decoding module from the forward error correction coding type field and the forward error correction coding subtype field;
in step A1, the sending end calls the corresponding forward error correction coding module to perform forward error correction coding;
in step B2, the receiving end calls the corresponding forward error correction decoding module to perform forward error correction decoding.
11. The multimedia communication method according to claim 10, characterized in that, in step A1, the sending end assesses the relative importance of the data according to either or both of the network abstraction layer reference identification field and the network abstraction layer unit type field in the header of the H.264 network abstraction layer unit, together with any other predefinable rule, thereby determines the quality-of-service level, and then selects the forward error correction coding scheme and determines the forward error correction coding type field and the forward error correction coding subtype field.
12. The multimedia communication method according to claim 10, characterized in that, in step A, the sending end estimates the network transmission conditions from the transmission report fed back by the receiving end, and then selects the forward error correction coding scheme and determines the forward error correction coding type field and the forward error correction coding subtype field.
13. The multimedia communication method according to claim 12, characterized in that the version information field in the header of the error-resilient real-time transport protocol packet has the binary value "11", that is, the decimal value "3", to distinguish it from the real-time transport protocol;
the forward error correction coding type field follows the contributing source identifier list and occupies 4 bits;
the forward error correction coding subtype field follows the forward error correction coding type field and occupies 9 bits;
the data packet length field follows the forward error correction coding subtype field and occupies 11 bits;
the data packet count field follows the data packet length field and occupies 8 bits.
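
As an illustration only, and not as part of the claim, the four fields above total 4 + 9 + 11 + 8 = 32 bits and so fit one 32-bit word. The sketch below assumes that they are packed contiguously, most significant bit first, in network byte order; the claim gives the order and widths but not the byte order. The function names are hypothetical.

```python
import struct


def pack_fec_fields(fec_type, fec_subtype, node_len, node_count):
    """Pack type (4 bits), subtype (9), node length (11) and node count (8) into 4 bytes."""
    if not (0 <= fec_type < 1 << 4 and 0 <= fec_subtype < 1 << 9
            and 0 <= node_len < 1 << 11 and 0 <= node_count < 1 << 8):
        raise ValueError("field value out of range")
    word = (fec_type << 28) | (fec_subtype << 19) | (node_len << 8) | node_count
    return struct.pack("!I", word)


def unpack_fec_fields(buf):
    """Return (fec_type, fec_subtype, node_len, node_count) from the first 4 bytes."""
    (word,) = struct.unpack("!I", buf[:4])
    return (word >> 28) & 0xF, (word >> 19) & 0x1FF, (word >> 8) & 0x7FF, word & 0xFF


packed = pack_fec_fields(3, 300, 1400, 30)
fields = unpack_fec_fields(packed)
```

The 11-bit length field limits a node to 2047 units of length, and the 8-bit count field limits a packet to 255 nodes.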
14. The multimedia communication method according to claim 8, characterized in that, in step A, the sending end strips the headers from at least one network abstraction layer unit having identical headers, then divides those units into nodes together, encodes them and encapsulates them in the error-resilient real-time transport protocol packet, and incorporates the identical header of those network abstraction layer units into the header of that error-resilient real-time transport protocol packet;
in step B, the receiving end obtains the carried header from the header of the received error-resilient real-time transport protocol packet and prepends it to each header-stripped network abstraction layer unit extracted from that packet, obtaining complete network abstraction layer units; if a transmission error has occurred, it performs forward error correction decoding according to a preset strategy to recover or partially recover the data nodes, and then extracts the network abstraction layer units from them.
15. The multimedia communication method according to claim 14, characterized in that, in the error-resilient real-time transport protocol header, the network abstraction layer reference identification field and the type field of the network abstraction layer unit header are placed in the payload type field of the error-resilient real-time transport protocol packet header, the payload type field occupying the last 7 bits of the 2nd byte of that header.
16. The multimedia communication method according to claim 15, characterized in that the error-resilient real-time transport protocol identification field is the version information field of the error-resilient real-time transport protocol packet header, the version information field occupying the first 2 bits of the 1st byte of that header.
17. The multimedia communication method according to claim 16, characterized in that, in the error-resilient real-time transport protocol encapsulation format, the forbidden bit field of the network abstraction layer unit header is placed in the marker field of the error-resilient real-time transport protocol packet header, the marker field occupying the first bit of the 2nd byte of that header;
and in step B, the receiving end determines from the marker field of the error-resilient real-time transport protocol packet whether the network abstraction layer unit it carries is erroneous.
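
As an illustration only, and not as part of the claims, the bit positions named in claims 15 to 17 can be sketched as below. An H.264 network abstraction layer unit header byte holds a 1-bit forbidden bit, a 2-bit reference identification and a 5-bit unit type. The sketch assumes that "first bit" means the most significant bit and that the reference identification precedes the unit type inside the 7-bit payload type field; neither ordering is stated in the claims. The function names are hypothetical.

```python
ERRTP_VERSION = 0b11  # version field value named in claim 13


def nal_header_to_second_byte(nal_header):
    """Build header byte 2: marker bit from F, 7-bit payload type from NRI and type."""
    forbidden_bit = (nal_header >> 7) & 0x01  # F, 1 bit
    nri = (nal_header >> 5) & 0x03            # reference identification, 2 bits
    nal_type = nal_header & 0x1F              # unit type, 5 bits
    payload_type = (nri << 5) | nal_type      # assumed order: NRI first, then type
    return (forbidden_bit << 7) | payload_type


def parse_header_prefix(byte1, byte2):
    """Read the version, marker and payload-type sub-fields back out."""
    return {
        "version": byte1 >> 6,
        "marker": byte2 >> 7,
        "nri": (byte2 >> 5) & 0x03,
        "nal_type": byte2 & 0x1F,
    }


byte1 = ERRTP_VERSION << 6               # remaining 6 bits of byte 1 left as zero here
byte2 = nal_header_to_second_byte(0x65)  # 0x65: F=0, NRI=3, type=5
fields = parse_header_prefix(byte1, byte2)
```

Under these assumptions the second header byte has the same bit layout as the network abstraction layer unit header byte itself, which is why that header can be carried in the packet header instead of being repeated in the payload.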
18. The multimedia communication method according to claim 15, characterized in that, in steps A and B, the error-resilient real-time transport protocol identifier is the value of the marker field of the error-resilient real-time transport protocol packet header, the marker field occupying the first bit of the 2nd byte of that header.
19. The multimedia communication method according to claim 18, characterized in that step A comprises the following sub-steps:
the sending end first checks whether the forbidden bit field in the header of each of at least one network abstraction layer unit is set, and accordingly classifies the units as normal network abstraction layer units or erroneous network abstraction layer units;
it then encapsulates the normal network abstraction layer units in error-resilient real-time transport protocol packets using the error-resilient real-time transport protocol encapsulation format, and sets the error-resilient real-time transport protocol identifier;
it encapsulates the erroneous network abstraction layer units in real-time transport protocol packets using the real-time transport protocol encapsulation format;
step B comprises the following sub-steps:
the receiving end first checks whether the header of each received packet has the error-resilient real-time transport protocol identifier set, and accordingly classifies the packets as error-resilient real-time transport protocol packets or real-time transport protocol packets;
it then processes the error-resilient real-time transport protocol packets according to the error-resilient real-time transport protocol encapsulation format, and the real-time transport protocol packets according to the real-time transport protocol encapsulation format.
20. The multimedia communication method according to claim 8, characterized in that step C comprises the following sub-steps:
C1. the receiving end collects statistics and generates the quality-of-service report;
C2. the receiving end carries the quality-of-service report in supplemental enhancement information and sends it to the sending end.
21. The multimedia communication method according to claim 20, characterized in that
the quality-of-service report is encapsulated in the supplemental enhancement information in the following format:
the 1st byte is the payload type field, indicating that the payload is the corresponding quality-of-service report;
the 2nd and 3rd bytes are the payload length field, indicating the length of the corresponding quality-of-service report;
the 4th and subsequent bytes are the payload, which holds the corresponding quality-of-service report.
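
As an illustration only, and not as part of the claim, the three-part layout above can be sketched as follows. The payload type value used in the example and the big-endian byte order of the 2-byte length field are assumptions, since the claim specifies only the field positions. The function names are hypothetical.

```python
import struct


def pack_qos_sei(payload_type, report):
    """Byte 1: payload type; bytes 2-3: payload length; byte 4 onward: the report."""
    if not 0 <= payload_type <= 0xFF:
        raise ValueError("payload type must fit in one byte")
    if len(report) > 0xFFFF:
        raise ValueError("report too long for a 2-byte length field")
    return struct.pack("!BH", payload_type, len(report)) + report


def unpack_qos_sei(buf):
    """Return (payload_type, report) from a buffer built by pack_qos_sei."""
    payload_type, length = struct.unpack_from("!BH", buf, 0)
    report = buf[3:3 + length]
    if len(report) != length:
        raise ValueError("truncated quality-of-service report")
    return payload_type, report


HYPOTHETICAL_RECEIVER_REPORT = 201  # placeholder value, not defined by the patent
message = pack_qos_sei(HYPOTHETICAL_RECEIVER_REPORT, b"\x81\x00\x00\x2a")
decoded = unpack_qos_sei(message)
```

A 2-byte length field bounds a single report at 65535 bytes.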
22. The multimedia communication method according to claim 21, characterized in that the quality-of-service report is either a sender report or a receiver report, the two being distinguished by the payload type field;
when the quality-of-service report is placed in the payload of the supplemental enhancement information, that payload comprises:
a version information field of 2 bits;
a padding field of 1 bit, indicating whether padding is present;
a reception report count field of 5 bits, indicating the number of reception report blocks contained in this quality-of-service report;
a sender synchronization source identifier field of 32 bits, identifying the sender of this quality-of-service report;
when the quality-of-service report is a sender report, a sender information block describing the sender of this quality-of-service report;
at least one reception report block, describing the multimedia statistics from different sources;
a profile-specific extension, reserved for profile-specific functional extensions.
23. The multimedia communication method according to claim 21, characterized in that the supplemental enhancement information carrying the quality-of-service report is in turn carried by a network abstraction layer unit;
the communication terminal sets the network abstraction layer reference identification of that network abstraction layer unit according to the reliability required for transmitting the quality-of-service report.
24. The multimedia communication method according to claim 21, characterized in that the communication terminal dynamically adjusts the period at which the quality-of-service report is compiled and sent, according to the current network state and the requirements of the higher-layer application.
25. The multimedia communication method according to claim 21, characterized in that, when the communication terminal uses the supplemental enhancement information to carry the quality-of-service reports of at least one media stream together,
the quality-of-service report contains the reception report block corresponding to each media stream carried.
26. The multimedia communication method according to claim 20, characterized in that, in step C, the receiving end counts the lost network abstraction layer units from the network abstraction layer unit sequence numbers of the received video stream data, generates the quality-of-service report and sends it back to the sending end;
in step A, the sending end calculates the cumulative packet loss from the sequence numbers of the lost network abstraction layer units and adjusts the error resilience protection strategy accordingly.
27. The multimedia communication method according to claim 26, characterized in that the receiving end analyses the received quality-of-service report and calculates network condition parameters, the parameters comprising instantaneous bandwidth, end-to-end delay and jitter.
28. The multimedia communication method according to claim 27, characterized in that the sending end provides a series of error resilience protection strategies of different levels and, in step A, selects the corresponding error resilience protection strategy according to the quality-of-service report.
29. The multimedia communication method according to claim 28, characterized in that, in step C, the receiving end derives the location information of the lost video stream data from the network abstraction layer unit sequence numbers of the received video stream data and sends it back to the sending end;
in step A, the sending end retransmits the lost video stream data to the receiving end according to the location information of the lost video stream data.
30. The multimedia communication method according to claim 8, characterized in that, in step E, the sending end obtains the location information of the lost slice from the transmission error information and implements the error diffusion elimination strategy by performing segment-by-segment successive intra-frame coding on the lost slice.
31. The multimedia communication method according to claim 30, characterized in that the segment-by-segment intra coding comprises the following steps:
E1: split off a group of consecutive macroblocks from the lost slice to form a new slice, the remaining macroblocks still belonging to the lost slice, and proceed to step E2;
E2: intra-code the new slice and send it with the next frame, the new slice being coded conventionally thereafter, and proceed to step E3;
E3: when encoding the next frame, determine whether the lost slice still contains unprocessed macroblocks; if so, return to step E1; otherwise end the intra coding.
32. The multimedia communication method according to claim 31, characterized in that, in step E1, the size of each group of consecutive macroblocks that is split off satisfies the following condition: after the group is intra-coded, the data rate of the frame stays within the H.264 rate control range.
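Steps E1 to E3 amount to refreshing the lost slice a few macroblocks at a time over successive frames. The sketch below only computes the schedule; `max_mbs_per_frame` is a hypothetical stand-in for the group size that the rate control constraint of claim 32 would allow, which the claims do not quantify.

```python
def intra_refresh_schedule(lost_slice_mbs, max_mbs_per_frame):
    """Return, frame by frame, the macroblock group that is intra-coded in that frame."""
    if max_mbs_per_frame <= 0:
        raise ValueError("max_mbs_per_frame must be positive")
    remaining = list(lost_slice_mbs)
    schedule = []
    while remaining:                                  # E3: unprocessed macroblocks left?
        new_slice = remaining[:max_mbs_per_frame]     # E1: split off a consecutive group
        remaining = remaining[max_mbs_per_frame:]     #     the rest stays in the lost slice
        schedule.append(new_slice)                    # E2: intra-code it with the next frame
    return schedule


# A lost slice covering macroblocks 20..29, refreshed at most four macroblocks per frame.
schedule = intra_refresh_schedule(range(20, 30), 4)
```

Here the lost slice is fully refreshed after three frames, with the last frame carrying the two leftover macroblocks.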
33. The multimedia communication method according to claim 8, characterized in that step D comprises the following sub-steps:
D1: the receiving end detects transmission errors and collects transmission error information;
D2: the receiving end resynchronizes the video information after a transmission error occurs;
D3: the receiving end applies the error concealment strategy according to the transmission error information.
34. The multimedia communication method according to claim 33, characterized in that, in step D1, the receiving end detects transmission errors and collects the transmission error information according to discontinuities in the NAL unit sequence numbers.
35. The multimedia communication method according to claim 34, characterized in that, in step D1, the receiving end obtains the locating information of the lost slice from the gaps in the NAL unit sequence numbers, the locating information comprising the frame number of the frame containing the lost slice and the position of the lost slice within that frame.
36. The multimedia communication method according to claim 35, characterized in that the error concealment strategy comprises the step of: the receiving end replacing the lost slice with the corresponding slice of the frame preceding the frame that contains the lost slice.
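Claims 35 and 36 together describe locating a lost slice and replacing it with the co-located slice of the previous frame. The sketch below assumes, purely for illustration, that every frame has a fixed number of slices and that each NAL unit carries exactly one slice, so a sequence number maps directly to a frame number and a position; the patent does not state this mapping.

```python
SLICES_PER_FRAME = 4  # hypothetical fixed slice count per frame


def locate(seq_no):
    """Map a NAL unit sequence number to (frame number, slice position in the frame)."""
    return divmod(seq_no, SLICES_PER_FRAME)


def conceal(frames, lost_seq_nos):
    """Replace each lost slice with the slice at the same position in the previous frame."""
    for seq_no in sorted(lost_seq_nos):
        frame_no, pos = locate(seq_no)
        if frame_no > 0:  # the first frame has no predecessor to copy from
            frames[frame_no][pos] = frames[frame_no - 1][pos]
    return frames


frames = [["a0", "a1", "a2", "a3"], ["b0", None, "b2", "b3"]]
conceal(frames, {5})  # sequence number 5 is slice 1 of frame 1
```

Processing the losses in ascending order means that a slice lost in two consecutive frames is concealed from the already concealed copy.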
37. The multimedia communication method according to any one of claims 8 to 36, characterized in that the error resilience coding scheme comprises an improved "Tornado" erasure code;
The improved "Tornado" erasure code generates only one layer of check nodes for a group of data nodes.
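A single layer of check nodes over a group of data nodes can be sketched as an XOR code on a bipartite graph with a peeling decoder. This is a generic illustration, not the patent's actual construction: the graph, the integer node values, and the assumption that all check nodes arrive intact are hypothetical.

```python
def encode(data, graph):
    """graph[i] lists the data-node indices XORed together to form check node i."""
    checks = []
    for neighbours in graph:
        value = 0
        for j in neighbours:
            value ^= data[j]
        checks.append(value)
    return checks


def recover(data, checks, graph):
    """Peeling decoder: lost data nodes are None; fill in any check with one unknown."""
    data = list(data)
    progress = True
    while progress:
        progress = False
        for check, neighbours in zip(checks, graph):
            missing = [j for j in neighbours if data[j] is None]
            if len(missing) == 1:
                value = check
                for j in neighbours:
                    if data[j] is not None:
                        value ^= data[j]
                data[missing[0]] = value
                progress = True
    return data


graph = [[0, 1], [1, 2], [2, 3]]      # hypothetical one-layer check graph
data = [0x11, 0x22, 0x33, 0x44]
checks = encode(data, graph)
restored = recover([0x11, None, None, 0x44], checks, graph)
```

Decoding stalls if every remaining check node touches two or more lost data nodes, which is why the choice of graph matters in practice.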
38. The multimedia communication method according to any one of claims 2 to 36, characterized in that the transmission error in step B comprises data packet loss or random bit errors.
39. A multimedia communication terminal, comprising basic function modules for implementing multimedia communication, including a codec module for implementing multimedia encoding and decoding, characterized in that it further comprises the following module:
An error-resilient real-time transport/control protocol module, used to apply error resilience protection to the multimedia data encoded by the codec module before it is transmitted on the network side, and to perform error correction decoding on multimedia data from the network side before passing it to the codec module; the information about the error resilience protection strategy is carried in the error-resilient real-time transport protocol packet that encapsulates the multimedia data, wherein the error-resilient real-time transport protocol packet is a data packet that contains both the multimedia data and the information about the error resilience protection strategy that protects this multimedia data.
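The defining property of the packet in claim 39 is that the protection strategy information travels in the same packet as the multimedia data it protects. The sketch below shows one way to pack both together. The field layout (sequence number, protection grade, FEC scheme identifier) is entirely hypothetical and is neither the patent's packet format nor the standard RTP header.

```python
import struct
from dataclasses import dataclass

# Hypothetical header: 16-bit sequence number, 8-bit protection grade, 8-bit FEC scheme id.
_HEADER = struct.Struct("!HBB")


@dataclass
class ErRtpPacket:
    seq_no: int
    protection_grade: int
    fec_scheme: int
    payload: bytes  # the protected multimedia data

    def pack(self):
        """Serialize the strategy fields followed by the payload."""
        return _HEADER.pack(self.seq_no, self.protection_grade, self.fec_scheme) + self.payload

    @classmethod
    def unpack(cls, raw):
        """Parse the strategy fields and treat the rest of the packet as payload."""
        seq_no, grade, scheme = _HEADER.unpack_from(raw)
        return cls(seq_no, grade, scheme, raw[_HEADER.size:])
```

Because the strategy fields are parsed before the payload, the receiving side in this sketch knows which error correction decoding to apply before handing the data to the codec module.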
40. The multimedia communication terminal according to claim 39, characterized in that it further comprises the following modules:
A protection method and strategy negotiation module, responsible for negotiating the error resilience protection strategy between the two communicating parties and determining the set of protection strategies from which the error-resilient real-time transport/control protocol module selects;
A forward error correction module, used to implement at least one forward error correction protection method and to maintain the parameters of that method, wherein the protection method and strategy negotiation module controls the forward error correction module to implement unequal loss protection and adaptive graded protection, and the error-resilient real-time transport/control protocol module implements error resilience protection and error correction by calling the forward error correction module.
41. The multimedia communication terminal according to claim 40, characterized in that it further comprises:
An error concealment module, used to implement the error concealment function;
The codec module is used to implement encoding and decoding according to the H.264 codec standard, and also to implement the error propagation elimination function;
A network condition analysis and calculation module, used to analyze and calculate network conditions and to provide information to the error concealment module and the codec module.
42. The multimedia communication terminal according to claim 41, characterized in that it further comprises:
A supplemental enhancement information (SEI) extension message processing module, used to implement the quality of service report and network condition report functions, and to send the reports to the network condition analysis and calculation module.
43. The multimedia communication terminal according to claim 42, characterized in that its transport layer is based on the error-resilient real-time transport protocol / real-time transport control protocol and is used to implement multimedia transmission that supports error resilience;
Its application protocol layer comprises a protection mechanism and strategy negotiation sublayer, used to implement graded protection and unequal loss protection;
Its H.264 video coding layer comprises a supplemental enhancement information (SEI) extension message reporting layer, used to implement the reporting function based on SEI extension messages;
Its H.264 network abstraction layer comprises a forward error correction coding layer, used to implement the forward error correction coding function.
44. The multimedia communication terminal according to any one of claims 39 to 43, characterized in that the basic function modules for implementing multimedia communication comprise one of the following or any combination thereof:
A main control module, responsible for controlling the whole terminal;
A user interface module, responsible for user input/output interaction and the display of information;
A network communication module, responsible for communicating with the network and providing the lower-layer transmission channel;
An input/output and low-level driver module, responsible for driving the hardware devices;
A service module, used to implement high-level services;
A communication process control module, used to control the communication process;
An application protocol module, used to implement the application protocol functions;
A real-time transport/control protocol module, used to implement the real-time transport/control protocol functions;
A network abstraction layer module, used to implement the network abstraction layer functions;
An audio codec module, used to implement the audio encoding and decoding functions.
CNB2006100690163A 2005-11-03 2005-11-03 Multimedia communication method and terminal thereof Expired - Fee Related CN100466725C (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CNB2006100690163A CN100466725C (en) 2005-11-03 2005-11-03 Multimedia communication method and terminal thereof
PCT/CN2006/002961 WO2007051425A1 (en) 2005-11-03 2006-11-03 A multimedia communication method and the terminal thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006100690163A CN100466725C (en) 2005-11-03 2005-11-03 Multimedia communication method and terminal thereof

Publications (2)

Publication Number Publication Date
CN1863302A CN1863302A (en) 2006-11-15
CN100466725C true CN100466725C (en) 2009-03-04

Family

ID=37390610

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100690163A Expired - Fee Related CN100466725C (en) 2005-11-03 2005-11-03 Multimedia communication method and terminal thereof

Country Status (2)

Country Link
CN (1) CN100466725C (en)
WO (1) WO2007051425A1 (en)

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8356331B2 (en) * 2007-05-08 2013-01-15 Qualcomm Incorporated Packet structure for a mobile display digital interface
JP4488027B2 (en) * 2007-05-17 2010-06-23 ソニー株式会社 Information processing apparatus and method, and information processing system
KR101571341B1 (en) 2008-02-05 2015-11-25 톰슨 라이센싱 Methods and apparatus for implicit block segmentation in video encoding and decoding
CN101800751B (en) * 2010-03-09 2013-07-24 上海雅海网络科技有限公司 Distributed real-time data-coding transmission method
CN102075312B (en) * 2011-01-10 2013-03-20 西安电子科技大学 Video service quality-based hybrid selective repeat method
CN102438002B (en) * 2011-08-10 2016-08-03 中山大学深圳研究院 A kind of based on the video file data transmission under Ad hoc network
CN103167319B (en) * 2011-12-16 2016-06-22 中国移动通信集团公司 The transfer processing method of a kind of Streaming Media, Apparatus and system
US8549570B2 (en) * 2012-02-23 2013-10-01 Ericsson Television Inc. Methods and apparatus for managing network resources used by multimedia streams in a virtual pipe
CN103118241B (en) * 2012-02-24 2016-01-06 金三立视频科技(深圳)有限公司 Based on the mobile video monitor streaming media self-adapting regulation method of 3G network
CN102956233B (en) * 2012-10-10 2015-07-08 深圳广晟信源技术有限公司 Extension structure of additional data for digital audio coding and corresponding extension device
CN105653530B (en) * 2014-11-12 2021-11-30 上海交通大学 Efficient and scalable multimedia transmission, storage and presentation method
FR3031428A1 (en) * 2015-01-07 2016-07-08 Orange SYSTEM FOR TRANSMITTING DATA PACKETS ACCORDING TO A MULTIPLE ACCESS PROTOCOL
CN105307050B (en) * 2015-10-26 2018-10-26 何震宇 A kind of network flow-medium application system and method based on HEVC
CN107181783B (en) * 2016-03-11 2020-06-23 上汽通用汽车有限公司 Method and device for transmitting data in a vehicle using Ethernet
CN105916058B (en) * 2016-05-05 2019-09-20 青岛海信宽带多媒体技术有限公司 A kind of streaming media buffer playback method, device and display equipment
CN106921843B (en) * 2017-01-18 2020-06-26 苏州科达科技股份有限公司 Data transmission method and device
CN109756468B (en) * 2017-11-07 2021-08-17 中兴通讯股份有限公司 Data packet repairing method, base station and computer readable storage medium
WO2019095382A1 (en) * 2017-11-20 2019-05-23 深圳市大疆创新科技有限公司 Image transmission method and apparatus for unmanned aerial vehicle
EP3550919B1 (en) 2018-02-08 2020-06-03 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Wireless communication method and terminal
CN110139150A (en) * 2019-04-12 2019-08-16 北京物资学院 A kind of method for processing video frequency and device
CN110233716A (en) * 2019-05-31 2019-09-13 北京文香信息技术有限公司 A kind of communication interaction method, apparatus, storage medium, terminal device and server
CN110740135A (en) * 2019-10-21 2020-01-31 湖南新云网科技有限公司 Same-screen data transmission method, device and system for multimedia classrooms
CN111010593A (en) * 2019-11-08 2020-04-14 深圳市麦谷科技有限公司 Method and device for packaging H.265 video data based on FLV format
CN110769206B (en) * 2019-11-19 2022-01-07 深圳开立生物医疗科技股份有限公司 Electronic endoscope signal transmission method, device and system and electronic equipment
CN112866178B (en) * 2019-11-27 2023-09-05 北京沃东天骏信息技术有限公司 Method and device for transmitting audio data
CN111083510A (en) * 2019-12-18 2020-04-28 深圳市麦谷科技有限公司 Method and device for pushing HEVC (high efficiency video coding) video
CN113381838A (en) * 2020-03-09 2021-09-10 华为技术有限公司 Data transmission method and communication device
CN111490984B (en) * 2020-04-03 2022-03-29 上海宽创国际文化科技股份有限公司 Network data coding and encryption algorithm thereof
CN111629282B (en) * 2020-04-13 2021-02-09 北京创享苑科技文化有限公司 Real-time erasure code coding redundancy dynamic adjustment method
CN111629279B (en) * 2020-04-13 2021-04-16 北京创享苑科技文化有限公司 Video data transmission method based on fixed-length format
CN111800388A (en) * 2020-06-09 2020-10-20 盐城网之易传媒有限公司 Media information processing method and media information processing device
CN114070458B (en) * 2020-08-04 2023-07-11 成都鼎桥通信技术有限公司 Data transmission method, device, equipment and storage medium
CN112311802B (en) * 2020-11-05 2023-10-27 维沃移动通信有限公司 Information transmission method and information transmission device
CN113873340B (en) * 2021-09-18 2024-01-16 恒安嘉新(北京)科技股份公司 Data processing method, device, equipment, system and storage medium
CN113938881A (en) * 2021-10-18 2022-01-14 上海华讯网络系统有限公司 Transmission system and method applicable to Internet data
CN114615549B (en) * 2022-05-11 2022-09-20 北京搜狐新动力信息技术有限公司 Streaming media seek method, client, storage medium and mobile device
CN115189810B (en) * 2022-07-07 2024-04-16 福州大学 A low-delay real-time video FEC coding transmission control method
CN115580601A (en) * 2022-09-29 2023-01-06 维沃移动通信有限公司 Method and device for sending data packets
CN115866082A (en) * 2022-11-15 2023-03-28 阿里巴巴(中国)有限公司 Computing system, data processing method, uninstall card and storage medium
CN115801900B (en) * 2022-11-16 2025-01-17 昆山星际舟智能科技有限公司 Data compression method for low-level embedded systems
CN115844325B (en) * 2022-11-17 2024-10-18 天津大学 Distributed fNIRS brain function imaging system for super-scanning application

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1065168A (en) * 1991-02-19 1992-10-07 菲利浦光灯制造公司 Transmission system and the receiver that is used for this system
US20030229822A1 (en) * 2002-04-24 2003-12-11 Joohee Kim Methods and systems for multiple substream unequal error protection and error concealment
WO2004036760A1 (en) * 2002-10-15 2004-04-29 Koninklijke Philips Electronics N.V. System and method for providing error recovery for streaming fgs encoded video over an ip network
US6940903B2 (en) * 2001-03-05 2005-09-06 Intervideo, Inc. Systems and methods for performing bit rate allocation for a video data stream
US6944802B2 (en) * 2000-03-29 2005-09-13 The Regents Of The University Of California Method and apparatus for transmitting and receiving wireless packet

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1065168A (en) * 1991-02-19 1992-10-07 菲利浦光灯制造公司 Transmission system and the receiver that is used for this system
US6944802B2 (en) * 2000-03-29 2005-09-13 The Regents Of The University Of California Method and apparatus for transmitting and receiving wireless packet
US6940903B2 (en) * 2001-03-05 2005-09-06 Intervideo, Inc. Systems and methods for performing bit rate allocation for a video data stream
US20030229822A1 (en) * 2002-04-24 2003-12-11 Joohee Kim Methods and systems for multiple substream unequal error protection and error concealment
WO2004036760A1 (en) * 2002-10-15 2004-04-29 Koninklijke Philips Electronics N.V. System and method for providing error recovery for streaming fgs encoded video over an ip network

Also Published As

Publication number Publication date
WO2007051425A1 (en) 2007-05-10
CN1863302A (en) 2006-11-15


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20171012

Address after: 221000 Jiangsu city in Xuzhou Province Economic Development Zone No. 15 District Jinshan paradise Building 1 unit 804 room

Patentee after: Wang Miaomiao

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee before: Huawei Technologies Co., Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090304

Termination date: 20171103
