WO2025040105A1 - Video bitstream processing method and apparatus, and device and readable storage medium

Info

Publication number: WO2025040105A1
Application number: PCT/CN2024/113519
Authority: WO
Inventors: 赵璐; 李琳; 邢刚; 冯亚楠; 柳建龙
Original assignee: China Mobile Communications Group Co Ltd; MIGU Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; MIGU Co Ltd
Priority date: 2023-08-23
Filing date: 2024-08-21
Publication date: 2025-02-27
Anticipated expiration: 2026-02-23
Also published as: CN118802873A

Abstract

The present application relates to the technical field of video coding. Disclosed are a video bitstream processing method and apparatus, a device, and a readable storage medium, which aim to enable audio and video coding standards to support RTP. The method comprises: performing RTP encapsulation on meta-stream data or a meta-stream data fragment to obtain an RTP data packet; and sending the RTP data packet to a decoding end. The RTP data packet comprises an RTP header and an RTP payload of video bitstream data, wherein the RTP payload of the video bitstream data comprises a general payload header, a decoding order indicator, payload data headers corresponding to different types of RTP payloads, and payload data; or the RTP payload of the video bitstream data comprises a general payload header, payload data headers corresponding to different types of RTP payloads, and payload data.

Description

Video bitstream processing method, apparatus, device and readable storage medium

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to Chinese Patent Application No. 202311071302.3, filed in China on August 23, 2023, the entire contents of which are incorporated herein by reference.

Technical Field

The present application relates to the field of video coding technology, and in particular to a video bitstream processing method, apparatus, device and readable storage medium.

Background Art

The Real-time Transport Protocol (RTP) is a transport protocol designed by the Internet Engineering Task Force (IETF) for the real-time transmission of video and audio. RTP runs on top of the User Datagram Protocol (UDP) and typically uses UDP to multicast or unicast real-time video and audio data, thereby achieving multi-point or single-point transmission of video and audio data.

However, some standards, such as the Audio Video coding Standard (AVS), currently do not support RTP.

Summary of the Invention

The embodiments of the present application provide a video bitstream processing method, apparatus, device and readable storage medium, so that audio and video coding standards can support RTP.

In a first aspect, an embodiment of the present application provides a video bitstream processing method, applied to an encoding end, comprising:

performing RTP encapsulation on meta-stream data or a meta-stream data fragment to obtain an RTP data packet;

sending the RTP data packet to a decoding end;

wherein the RTP data packet includes an RTP header and an RTP payload of video bitstream data;

the RTP payload of the video bitstream data includes: a general payload header, a decoding order indicator, payload data headers corresponding to different types of RTP payloads, and payload data;

or

the RTP payload of the video bitstream data includes: a general payload header, payload data headers corresponding to different types of RTP payloads, and payload data.

In a second aspect, an embodiment of the present application further provides a video bitstream processing method, applied to a decoding end, comprising:

receiving an RTP data packet, wherein the RTP data packet is obtained by performing RTP encapsulation on meta-stream data or a meta-stream data fragment;

decoding the RTP data packet.

In a third aspect, an embodiment of the present application further provides a video bitstream processing apparatus, applied to an encoding end, comprising:

a first processing module, configured to perform RTP encapsulation on meta-stream data or a meta-stream data fragment to obtain an RTP data packet;

a first sending module, configured to send the RTP data packet to a decoding end.

In a fourth aspect, an embodiment of the present application further provides a video bitstream processing apparatus, applied to a decoding end, comprising:

a first receiving module, configured to receive an RTP data packet, wherein the RTP data packet is obtained by performing RTP encapsulation on meta-stream data or a meta-stream data fragment;

a first processing module, configured to decode the RTP data packet.

In a fifth aspect, an embodiment of the present application further provides a communication device, comprising: a memory, a processor, and a program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the video bitstream processing method described above.

In a sixth aspect, an embodiment of the present application further provides a readable storage medium storing a program which, when executed by a processor, implements the steps of the video bitstream processing method described above.

In the embodiments of the present application, meta-stream data or a meta-stream data fragment is RTP-encapsulated to obtain an RTP data packet, and the RTP data packet is sent, so that audio and video coding standards can support RTP.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a first flowchart of the video bitstream processing method provided in an embodiment of the present application;

FIG. 2 is a schematic diagram of the RTP header in an embodiment of the present application;

FIG. 3 is a schematic diagram of the RTP data packet formed in an embodiment of the present application;

FIG. 4(a) is a first schematic diagram of a meta-stream in an embodiment of the present application;

FIG. 4(b) is a second schematic diagram of a meta-stream in an embodiment of the present application;

FIG. 5 is a first schematic diagram of a single payload in an embodiment of the present application;

FIG. 6 is a second schematic diagram of a single payload in an embodiment of the present application;

FIG. 7 is a first schematic diagram of a fragmented payload in an embodiment of the present application;

FIG. 8 is a second schematic diagram of a fragmented payload in an embodiment of the present application;

FIG. 9 is a third schematic diagram of a fragmented payload in an embodiment of the present application;

FIG. 10 is a first schematic diagram of an aggregated payload in an embodiment of the present application;

FIG. 11 is a second schematic diagram of an aggregated payload in an embodiment of the present application;

FIG. 12 is a third schematic diagram of an aggregated payload in an embodiment of the present application;

FIG. 13 is a fourth schematic diagram of an aggregated payload in an embodiment of the present application;

FIG. 14 is a fifth schematic diagram of an aggregated payload in an embodiment of the present application;

FIG. 15 is a sixth schematic diagram of an aggregated payload in an embodiment of the present application;

FIG. 16 is a seventh schematic diagram of an aggregated payload in an embodiment of the present application;

FIG. 17 is an eighth schematic diagram of an aggregated payload in an embodiment of the present application;

FIG. 18 is a second flowchart of the video bitstream processing method provided in an embodiment of the present application;

FIG. 19 is a first structural diagram of the video bitstream processing apparatus provided in an embodiment of the present application;

FIG. 20 is a second structural diagram of the video bitstream processing apparatus provided in an embodiment of the present application.

DETAILED DESCRIPTION

In the embodiments of the present application, the term "and/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the objects before and after it.

In the embodiments of the present application, the term "plurality" means two or more; other quantifiers are to be read similarly.

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present application without creative effort fall within the scope of protection of the present application.

Referring to FIG. 1, FIG. 1 is a flowchart of a video bitstream processing method provided in an embodiment of the present application. The method is applied to an encoding end and, as shown in FIG. 1, includes the following steps:

Step 101: Perform RTP encapsulation on meta-stream data or a meta-stream data fragment to obtain an RTP data packet.

In the embodiments of the present application, the meta-stream data or meta-stream data fragment may be meta-stream data or a meta-stream data fragment supported by any of several standards, for example, AVS3. The RTP data packet includes an RTP header and an RTP payload of video bitstream data. The RTP payload of the video bitstream data includes: a general payload header, a decoding order indicator, payload data headers corresponding to different types of RTP payloads, and payload data. Alternatively, the RTP payload of the video bitstream data includes: a general payload header, payload data headers corresponding to different types of RTP payloads, and payload data; in this case, the RTP payload of the video bitstream data does not include the decoding order indicator.

In the embodiments of the present application, a decoding order indicator, sprop-decoder-order, is added. This parameter describes a bitstream characteristic of the sender, namely the relationship between the transmission order and the decoding order. A value of 0 indicates that the transmission order is the same as the decoding order; a value greater than 0 indicates that the transmission order differs from the decoding order. If the parameter is not transmitted (that is, the formed RTP packet does not carry it), its value defaults to 0.

Taking AVS3 as an example, performing RTP encapsulation on AVS3 meta-stream data or an AVS3 meta-stream data fragment can be understood as encapsulating that data according to RTP, so that the resulting RTP data packet includes an RTP header and an RTP payload of AVS3 video bitstream data. A general payload header is defined according to the bitstream characteristics of AVS3, and a different payload structure is designed for each type of RTP payload, which makes the scheme more flexible.

Optionally, in the embodiments of the present application, if the size of the meta-stream data as transmitted at the IP layer is larger than the MTU, the meta-stream data is split into meta-stream data fragments to facilitate transmission.
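As a rough illustration of the size check above, the sketch below decides whether a meta-stream data unit has to be split. The overhead figures assume IPv4 without options (20 bytes), UDP (8 bytes), and the 12-byte fixed RTP header with no CSRC entries. The 1500-byte default MTU, the `payload_header_len` parameter, and both function names are illustrative assumptions, not values taken from this application.

```python
IPV4_HEADER = 20       # bytes, assuming no IP options
UDP_HEADER = 8         # bytes
RTP_FIXED_HEADER = 12  # bytes, assuming no CSRC list and no header extension


def max_unit_size(mtu: int = 1500, payload_header_len: int = 2) -> int:
    """Largest meta-stream data unit that fits in one packet.

    payload_header_len is a hypothetical budget for the payload-level
    headers (for example, a general header byte plus a type-specific byte).
    """
    return mtu - IPV4_HEADER - UDP_HEADER - RTP_FIXED_HEADER - payload_header_len


def needs_fragmentation(unit_len: int, mtu: int = 1500,
                        payload_header_len: int = 2) -> bool:
    """True when the unit would exceed the MTU at the IP layer."""
    return unit_len > max_unit_size(mtu, payload_header_len)
```

A sender that also carries PID bytes or an extension header would need to raise `payload_header_len` accordingly.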

In the embodiments of the present application, the RTP header is configured on the basis of the existing protocol. FIG. 2 is a schematic diagram of the RTP header. FIG. 3 is a schematic diagram of the RTP data packet formed in the embodiments of the present application, described using an AVS3 payload as an example.

In the embodiments of the present application, the RTP header includes one or more of the following:

Marker bit (M), used to indicate the boundary of a video frame. This field is 1 bit: the marker bit of the last RTP data packet of a video frame is set to 1, and the marker bit of every other RTP data packet is set to 0.

Payload type (PT), used to identify the RTP session and indicate the payload format. This field is 7 bits. By definition, the upper-layer application should dynamically assign a PT value in each RTP session (for example, in SDP) and maintain the mapping between the PT value and the payload format. The range of PT values should follow the provisions of the protocol.

Timestamp, used to indicate the sampling time of the RTP data packet. The RTP timestamp is set to the sampling timestamp of the encoded content, using a 90000 Hz clock, so the unit of the RTP timestamp is 1/90000 s. As with the sequence number, the initial value of the timestamp should be random, and different RTP data packets of the same picture should carry the same timestamp. For payloads that have no time attribute of their own, such as sequence headers, user extension data, and picture headers, the timestamp should equal that of the picture data that immediately follows.

Sequence number, used to indicate the transmission order of RTP data packets. The sequence number is incremented by 1 for each RTP data packet sent, and the receiver can use it to detect packet loss and restore the packet order. The initial value of the sequence number should be random.

For the meanings of the other fields, refer to the existing specifications: for example, V denotes the version, P the padding flag, X the extension flag, the SSRC field the synchronization source, and the CSRC field the contributing source.
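The header fields above can be sketched with the standard 12-byte fixed RTP header layout from RFC 3550 (V, P, X, CC, M, PT, sequence number, timestamp, SSRC). This is a minimal sketch, assuming no padding, no header extension, and no CSRC entries; the function names and the use of `secrets` for the random initial values are choices made for illustration.

```python
import secrets
import struct

RTP_VERSION = 2
VIDEO_CLOCK_HZ = 90000  # 90 kHz clock, so one tick is 1/90000 s


def to_rtp_timestamp(seconds: float, initial_offset: int) -> int:
    """Convert a sampling time in seconds to a 32-bit RTP timestamp."""
    return (initial_offset + round(seconds * VIDEO_CLOCK_HZ)) & 0xFFFFFFFF


def build_rtp_header(payload_type: int, seq: int, timestamp: int,
                     ssrc: int, last_packet_of_frame: bool) -> bytes:
    """Pack the 12-byte fixed RTP header (P=0, X=0, CC=0 assumed)."""
    if not 0 <= payload_type <= 127:
        raise ValueError("PT is a 7-bit field")
    byte0 = RTP_VERSION << 6
    # M is set to 1 only on the last packet of a video frame.
    byte1 = (int(last_packet_of_frame) << 7) | payload_type
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


# Both counters start from random values, as the text recommends.
initial_seq = secrets.randbits(16)
initial_ts = secrets.randbits(32)
```

Every packet of one picture would reuse the same `timestamp` while `seq` advances by one per packet.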

In the embodiments of the present application, the RTP payload of the video bitstream data is a single payload, a fragmented payload, or an aggregated payload. In a single payload, the payload data contains one meta-stream data unit; in a fragmented payload, the payload data contains one meta-stream data fragment; in an aggregated payload, the payload data contains multiple meta-stream data units. In other words, the RTP payload of the video bitstream data may contain one or more meta-stream data units, or a meta-stream data fragment obtained by further splitting a meta-stream data unit.

Optionally, in the embodiments of the present application, to distinguish accurately between meta-stream data of different levels, the encoding end may determine indication information, which indicates how the meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, was formed. There are two formation modes, level 1 and level 2, described below. The indication information may be conveyed to the decoding end explicitly or implicitly.

The meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is a bitstream segment beginning with a start code, and includes: a sequence header, user data after the sequence header, extension data after the sequence header, an intra-predicted picture, and an inter-predicted picture. Alternatively, the meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is the data between every two adjacent start code prefixes in the video bitstream, including the start code prefix, and includes: a sequence header, user data after the sequence header, extension data after the sequence header, an I-picture header, an RL-picture header, an inter-picture header, user data after the picture header, extension data after the picture header, and slice data.

As shown in FIG. 4(a), a meta-stream is a bitstream segment beginning with a start code, such as a sequence header (or video sequence header), user data (after the sequence header), extension data (after the sequence header), an intra-predicted picture, or an inter-predicted picture (level 1). Alternatively, as shown in FIG. 4(b), the meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is the data between every two adjacent start code prefixes in the video bitstream, including the start code prefix (level 2). A meta-stream data fragment is a segment consisting of an integer number of consecutive bytes of meta-stream data.

For example, for the RTP encapsulation types defined below (single, fragmented, aggregated), the service side (the encoding end) may select one fixed level of meta-stream according to the application scenario and describe it through the optional field video_bitstream_unit during Session Description Protocol (SDP) negotiation. This parameter describes how the meta-stream data selected by the sender was formed. If the parameter is sent explicitly to the decoding end, a value of 1 indicates meta-stream data as defined by level 1, and a value of 2 indicates meta-stream data as defined by level 2. If the parameter is not transmitted, its value defaults to 1 (the implicit mode).
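The defaulting rules for the two optional SDP parameters named in this description, sprop-decoder-order (default 0) and video_bitstream_unit (default 1), can be sketched as follows. The application does not say which SDP attribute carries them; the sketch assumes a conventional `a=fmtp:<pt> key=value;key=value` line, and the function names and the returned dictionary keys are hypothetical.

```python
def parse_fmtp_params(fmtp_line: str) -> dict[str, str]:
    """Split 'a=fmtp:<pt> k1=v1;k2=v2' into a {key: value} dict."""
    _, _, params = fmtp_line.partition(" ")
    result = {}
    for item in params.split(";"):
        key, sep, value = item.strip().partition("=")
        if sep:
            result[key] = value
    return result


def stream_settings(fmtp_line: str) -> dict:
    """Apply the defaults described in the text to absent parameters."""
    params = parse_fmtp_params(fmtp_line)
    # Absent sprop-decoder-order means 0: transmission order == decoding order.
    decoder_order = int(params.get("sprop-decoder-order", "0"))
    # Absent video_bitstream_unit means 1: level-1 meta-stream data.
    unit_level = int(params.get("video_bitstream_unit", "1"))
    if unit_level not in (1, 2):
        raise ValueError("video_bitstream_unit must be 1 or 2")
    return {"reordered": decoder_order > 0, "unit_level": unit_level}
```

A receiver would consult `reordered` to decide whether PID bytes are expected in the payload, and `unit_level` to choose between Table 3(a) and Table 3(b) when interpreting payload_data_type.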

During encapsulation, the following principles apply according to the payload type of the RTP data packet.

1) Meta-stream data (or bitstream data) of different types must not be aggregated in the same RTP data packet. For example, meta-stream data of a dependent main bitstream and of a knowledge bitstream must not be aggregated in the same RTP data packet.

2) When the size of a meta-stream data unit as transmitted at the IP layer exceeds the Maximum Transmission Unit (MTU), the unit is split into fragments, and the fragments are transmitted in splitting order using the fragmented RTP payload format.

3) An aggregated RTP payload does not contain meta-stream data fragments.

4) RTP payload data must not be nested; that is, an RTP payload must not contain a complete RTP data packet.

5) For meta-stream data defined in the level 2 manner, some units, such as picture header data, should wherever possible be encapsulated in one aggregated packet together with the slice data that follows, provided that the size of the aggregated packet does not exceed the MTU.

6) Small meta-stream units are aggregated into one aggregated packet to avoid unnecessary packetization overhead. For example, sequence headers, video extensions, and user extensions may be combined into one aggregated packet, provided that the size of the whole aggregated packet at the IP layer is smaller than the MTU.
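One way to read principles 1, 2, 3, and 6 together is as a packetization planner. The sketch below is an assumed greedy policy, not the procedure of this application: it walks the units in transmission order, fragments any unit larger than the budget, and groups neighbouring small units of the same stream type into an aggregate while they fit. The `max_payload` budget ignores per-unit framing overhead inside an aggregate, and the function name and tuple layout are hypothetical.

```python
def plan_packets(units: list[tuple[int, bytes]],
                 max_payload: int) -> list[tuple[str, list[tuple[int, bytes]]]]:
    """Plan packaging for (stream_type, data) units in transmission order.

    Returns (kind, units) entries where kind is 'single', 'fragment',
    or 'aggregate'.
    """
    plan = []
    group = []  # pending small units, all of one stream type

    def flush():
        if len(group) == 1:
            plan.append(("single", [group[0]]))
        elif group:
            plan.append(("aggregate", list(group)))
        group.clear()

    for stream_type, data in units:
        if len(data) > max_payload:
            # Principle 2: oversized units are fragmented, never aggregated.
            flush()
            plan.append(("fragment", [(stream_type, data)]))
            continue
        same_type = not group or group[0][0] == stream_type  # principle 1
        fits = sum(len(d) for _, d in group) + len(data) <= max_payload
        if not (same_type and fits):
            flush()
        group.append((stream_type, data))
    flush()
    return plan
```

Because fragments are always emitted as their own entries, an aggregate never contains a fragment, which matches principle 3.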

The process of obtaining the different types of RTP payloads of video bitstream data is described in detail below.

1. Single payload: A single payload contains exactly one meta-stream data unit, and the parameter payload_package_type in the general payload header is 0.

The payload data header of the single payload includes a single payload header, which carries information about the meta-stream data; the payload data includes one meta-stream data unit.

The first, fixed byte of the RTP payload of the video bitstream data is the general payload header, which carries the RTP payload header data of the AVS3 video bitstream. The descriptor syntax of the general payload header is shown in Table 1.

Table 1

Where:

payload_package_type (payload package type) is a 2-bit field indicating the RTP payload type, which is a single payload, a fragmented payload, or an aggregated payload. For example, a value of 0 indicates a single payload, a value of 1 indicates a fragmented payload, and a value of 2 indicates an aggregated payload (the payload contains two or more meta-stream data units).

temporal_id (temporal identifier) is a 3-bit unsigned integer indicating the temporal layer to which the RTP payload belongs. Its range is 0 to 7, and its value should equal the value of temporal_id in the standard.

library_dependency (library dependency) is a 2-bit unsigned integer indicating the type of elementary stream to which the RTP payload belongs. The elementary stream type is one of: independent main bitstream, dependent main bitstream, or knowledge bitstream. For example, a value of 0 indicates that the payload contains only the independent main bitstream; a value of 1 indicates that the payload contains only the knowledge bitstream; a value of 2 indicates that the payload contains only the dependent main bitstream. The value 3 is reserved.

x is a 1-bit field indicating whether an extension header is present. A value of 0 indicates that there is no extension header; otherwise, a 1-byte extension header is added. The extension header is either a fragment extension payload header or an aggregation extension payload header; that is, no extension payload header is defined for a single payload.
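The four fields of the general payload header add up to exactly one byte (2 + 3 + 2 + 1 bits). Since Table 1 is not reproduced here, the sketch below assumes the fields are packed most-significant-bit first in the order they are described: payload_package_type, temporal_id, library_dependency, x. That bit order and the function names are assumptions for illustration.

```python
SINGLE, FRAGMENT, AGGREGATE = 0, 1, 2  # payload_package_type values


def pack_general_header(package_type: int, temporal_id: int,
                        library_dependency: int, x: int = 0) -> bytes:
    """Pack the 1-byte general payload header (assumed MSB-first order)."""
    if package_type not in (SINGLE, FRAGMENT, AGGREGATE):
        raise ValueError("payload_package_type must be 0, 1 or 2")
    if not 0 <= temporal_id <= 7:
        raise ValueError("temporal_id is a 3-bit field")
    if library_dependency not in (0, 1, 2):
        raise ValueError("library_dependency 3 is reserved")
    if x not in (0, 1):
        raise ValueError("x is a 1-bit flag")
    if x and package_type == SINGLE:
        raise ValueError("no extension header is defined for a single payload")
    return bytes([(package_type << 6) | (temporal_id << 3)
                  | (library_dependency << 1) | x])


def parse_general_header(first_byte: int) -> dict:
    """Inverse of pack_general_header for one byte value."""
    return {
        "payload_package_type": (first_byte >> 6) & 0x3,
        "temporal_id": (first_byte >> 3) & 0x7,
        "library_dependency": (first_byte >> 1) & 0x3,
        "x": first_byte & 0x1,
    }
```

A receiver would read `payload_package_type` first, since it determines how the second payload byte is interpreted.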

The second byte of the RTP payload of the video bitstream data is the single payload header, which describes the specific payload of a single meta-stream. Its field structure and meaning are shown in Table 2.

Table 2

Where:

payload_data_type (payload data type) is a 4-bit unsigned integer indicating the data type of the meta-stream data. Its values and type descriptions are shown in Table 3(a) or Table 3(b). Table 3(a) corresponds to meta-stream data divided according to level 1, and Table 3(b) corresponds to meta-stream data divided according to level 2.

decode_order_size (payload decoding order) is a 2-bit unsigned integer indicating the number of bytes required for the picture identifier (Picture ID, PID) that describes the decoding order of the meta-stream data. For example, if the optional SDP parameter sprop-decoder-order takes the first value (such as 0), or the RTP payload of the video bitstream data does not include the decoding order indicator (so the value defaults to 0), the decoding order of the meta-stream data is the same as the transmission order (that is, the decoding order of the RTP data packets is the same as their transmission order). If sprop-decoder-order takes the second value (such as a value greater than 0), the decoding order of the meta-stream data differs from the transmission order (that is, the decoding order of the RTP data packets differs from their transmission order). In that case, the RTP payload of the video bitstream data further includes an N-byte PID indicating the decoding order of the meta-stream data, where N = decode_order_size + 1. Different meta-stream data units of the same frame should have the same PID. The PID starts from a random value and increases by 1 as a continuous integer sequence. After the PID reaches its maximum value, the sender may either increase decode_order_size (DOS), the PID byte count (up to 4 bytes), or restart numbering from 0.

Reserved field (R): this field is a 2-bit unsigned integer and is reserved.

Table 3(a)

Table 3(b)

Taking AVS3 as an example, and based on the above description, the structure of a single payload depends on the value of sprop-decoder-order as follows:

When sprop-decoder-order is 0, the RTP payload of the video bitstream data is as shown in FIG. 5: it contains an AVS3 general payload header, an AVS3 single payload header, and one meta-stream data unit (a single meta-stream).

When sprop-decoder-order is greater than 0, the RTP payload of the video bitstream data is as shown in FIG. 6 (with decode_order_size = 0): it contains an AVS3 general payload header, an AVS3 single payload header, one meta-stream data unit (a single meta-stream), and an unsigned PID of decode_order_size + 1 bytes that indicates the decoding order of the meta-stream data in the RTP data packet.

二、分片负载：分片负载(payload_package_type＝1)包含1个元码流数据分片。当元码流数据经过封装后在IP层的大小大于MTU时，则需要在RTP数据进行封装前对元码流数据进行切分，其切分后得到的数据为元码流数据分片，可以作为分片RTP负载进行传输。相同的元码流数据分割并封装得到的元码流数据分片的分片RTP负载，按照分割顺序且具有连续无间断的RTP序列号(Sequence number)，且具有相同的时间戳。2. Fragmented payload: The fragmented payload (payload_package_type=1) contains one metastream data fragment. When the size of the metastream data at the IP layer after encapsulation is larger than the MTU, the metastream data needs to be segmented before RTP data is encapsulated. The data obtained after segmentation is the metastream data fragment, which can be transmitted as a fragmented RTP payload. The fragmented RTP payload of the metastream data fragment obtained by segmenting and encapsulating the same metastream data has a continuous and uninterrupted RTP sequence number (Sequence number) in the segmentation order and has the same timestamp.

The payload data header of the fragmented payload includes a fragment payload header, and the payload data includes one meta-stream data fragment. When the extension byte flag x in the general payload header is 1, the fragment payload header is followed by a one-byte fragment extension payload header. The fragment payload header and/or the fragment extension payload header indicate information related to the meta-stream data fragment.

The second byte of the RTP payload of the video bitstream data is the fragment payload header; its field structure and meaning are shown in Table 4.

Table 4

Where:

The payload data type field payload_data_type is a 4-bit unsigned integer that indicates the data type of the meta-stream data to which the meta-stream data fragment belongs; its values and type descriptions are shown in Table 3(a) or Table 3(b);

The fragment start field fragment_start (S) is a 1-bit field that marks the starting fragment of a meta-stream data; it is 1 for the starting fragment and 0 otherwise;

The fragment end field fragment_end is a 1-bit field that marks the ending fragment of a meta-stream data; it is 1 for the ending fragment and 0 otherwise.

The payload decoding order field decode_order_size is a 2-bit unsigned integer that indicates the number of bytes needed for the PID describing the decoding order of the meta-stream data fragment. If sprop-decoder-order takes a first value (such as 0), or the RTP payload of the video bitstream data does not include the decoding order indicator (that is, the value defaults to 0), the decoding order of the meta-stream data fragments is the same as the transmission order. If sprop-decoder-order takes a second value (such as a value greater than 0), the decoding order of the meta-stream data fragments differs from the transmission order, and the RTP payload of the video bitstream data further includes an N-byte PID that indicates the decoding order of the meta-stream data fragment. In other words, a PID parameter of decode_order_size+1 bytes (the value of N) is added in this case. Different meta-stream data fragments of the same frame should have the same PID. The PID starts from a random value and increases by 1 for each consecutive value. When it reaches its maximum, the sender may either extend the PID byte count by increasing decode_order_size (DOS), whose maximum value of 3 gives a 4-byte PID, or restart numbering from 0.

Within one RTP data packet, fragment_start and fragment_end cannot both be 1.
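A minimal sketch of packing and parsing the one-byte fragment payload header follows. Table 4 is not reproduced in the text, so the bit order used here (payload_data_type in the high 4 bits, then fragment_start, then fragment_end, with decode_order_size in the low 2 bits) is an assumption, as are the names FragmentHeader, pack_fragment_header and parse_fragment_header.

```python
from typing import NamedTuple


class FragmentHeader(NamedTuple):
    payload_data_type: int  # 4-bit unsigned integer
    fragment_start: int     # 1 bit (S)
    fragment_end: int       # 1 bit
    decode_order_size: int  # 2-bit unsigned integer


def _check(h: FragmentHeader) -> FragmentHeader:
    # fragment_start and fragment_end cannot both be 1 in one RTP packet.
    if h.fragment_start and h.fragment_end:
        raise ValueError("fragment_start and fragment_end are both 1")
    return h


def pack_fragment_header(h: FragmentHeader) -> int:
    """Pack the fields into one byte (ASSUMED bit order, see the lead-in)."""
    if not 0 <= h.payload_data_type <= 0xF:
        raise ValueError("payload_data_type must fit in 4 bits")
    if h.fragment_start not in (0, 1) or h.fragment_end not in (0, 1):
        raise ValueError("flags must be 0 or 1")
    if not 0 <= h.decode_order_size <= 3:
        raise ValueError("decode_order_size must fit in 2 bits")
    _check(h)
    return ((h.payload_data_type << 4) | (h.fragment_start << 3)
            | (h.fragment_end << 2) | h.decode_order_size)


def parse_fragment_header(byte: int) -> FragmentHeader:
    """Inverse of pack_fragment_header for a value in 0..255."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("header must be a single byte")
    return _check(FragmentHeader(
        payload_data_type=(byte >> 4) & 0xF,
        fragment_start=(byte >> 3) & 1,
        fragment_end=(byte >> 2) & 1,
        decode_order_size=byte & 0x3,
    ))
```

Whatever the actual bit positions in Table 4 are, the same validation applies: a parser should reject a header in which both flags are set.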

The field structure and meaning of the fragment extension payload header are shown in Table 5.

Table 5

Where:

pic_header: a 1-bit field. When the level-1 form of meta-stream data is selected, it indicates whether complete image header data is included: 1 means the meta-stream data fragment packet contains complete image header data, and 0 means it does not. When the level-2 form of meta-stream data is selected, this field is reserved and has no defined meaning.

full_patch_info: a 1-bit field. A value of 1 means the meta-stream data fragment is the slice data of an independent image; a value of 0 means it is not.

R: reserved field.

Taking AVS3 as an example, and following the description above, the structure of a fragmented payload depends on the value of sprop-decoder-order as follows:

When sprop-decoder-order is 0, the RTP payload of the video bitstream data is as shown in FIG. 7 (decode_order_size=0). It contains an AVS3 general payload header, an AVS3 fragment payload header, and one meta-stream data fragment (a single meta-stream fragment). In the case shown in FIG. 7, x=0.

When sprop-decoder-order is greater than 0, the RTP payload of the video bitstream data is as shown in FIG. 8 and FIG. 9 (decode_order_size=0). In FIG. 8, the RTP payload contains an AVS3 general payload header, an AVS3 fragment payload header, an unsigned PID of decode_order_size+1 bytes that indicates the decoding order of the meta-stream data fragment in the RTP data packet, and one meta-stream data fragment; in this case x=0. In FIG. 9, the RTP payload contains an AVS3 general payload header, an AVS3 fragment payload header, a fragment extension payload header, an unsigned PID of decode_order_size+1 bytes that indicates the decoding order of the meta-stream data fragment in the RTP data packet, and one meta-stream data fragment; in this case x=1.

3. Aggregate payload: An aggregate payload (payload_package_type=2 or 3) contains at least 2 meta-stream data, and each meta-stream data in the aggregate payload is preceded by a fixed 2-byte unsigned integer size that indicates the size of that meta-stream data. An aggregate RTP payload must not contain meta-stream data fragments.

The payload data header of the aggregate payload includes one of the following: an aggregate payload header and a meta-stream size field; an aggregate payload header, an aggregate extended payload header and a meta-stream size field; an aggregate extended payload header; an aggregate extended payload header and a meta-stream size field; a meta-stream size field; or an aggregate payload header.

The aggregate payload header and/or the aggregate extended payload header indicate information related to the meta-stream data.

The payload data includes at least two meta-stream data.

For aggregate payloads of the above types, the first fixed byte of the RTP payload of the video bitstream data is the general payload header, which represents the RTP payload header data of the video code stream. The descriptor syntax of the general payload header is shown in Table 1.

The field structure and meaning of the aggregate payload header in the RTP payload of the video bitstream data are shown in Table 6.

Table 6

Where:

The payload data type field payload_data_type is a 4-bit unsigned integer that indicates the data type of the meta-stream data; its values and type descriptions are shown in Table 3(a) or Table 3(b);

The payload decoding order field decode_order_size is a 2-bit unsigned integer that indicates the number of bytes needed for the PID describing the decoding order of the meta-stream data. If sprop-decoder-order takes the first value (such as 0), or the RTP payload of the video bitstream data does not include the decoding order indicator (that is, the value defaults to 0), the decoding order of the meta-stream data is the same as the transmission order. If sprop-decoder-order takes the second value (such as a value greater than 0), the decoding order of the meta-stream data differs from the transmission order, and the RTP payload of the video bitstream data further includes an N-byte PID that indicates the decoding order of the meta-stream data. In other words, a PID parameter of decode_order_size+1 bytes (the value of N) is added in this case. Different meta-stream data fragments of the same frame should have the same PID. The PID starts from a random value and increases by 1 for each consecutive value. When it reaches its maximum, the sender may either extend the PID byte count by increasing decode_order_size (to a maximum of 4 bytes) or restart numbering from 0.

The field structure and meaning of the aggregate extended payload header in the RTP payload of the video bitstream data are shown in Table 7.

Table 7

Where:

The sequence header field sequence_header_index is a 1-bit unsigned integer that indicates whether the aggregate payload includes sequence header data;

The random access index field AU_index is a 1-bit unsigned integer that indicates whether the aggregate payload includes a random access frame. AVS3 has two kinds of random access frame: for an independent main bitstream (such as library_dependency=0), the random access frame is the first I frame after the sequence header; for a non-independent main bitstream (such as library_dependency=2), it is the first P frame or B frame after the sequence header.

The reserved field (R) is a 6-bit unsigned integer and is reserved.

Taking AVS3 as an example, in the embodiments of the present application the structure of the aggregate payload depends on the value of decode_order_size as follows:

When sprop-decoder-order is 0:

When payload_package_type=2 and x=0, the RTP payload of the video bitstream data contains an AVS3 general payload header, at least 2 meta-stream data, and, before each meta-stream data, a 1-byte aggregate payload header and a 2-byte size indicating the size of the meta-stream, as shown in FIG. 10.

When payload_package_type=2 and x=1, the RTP payload of the video bitstream data contains an AVS3 general payload header, an aggregate extended payload header, at least 2 meta-stream data, and, before each meta-stream data, a 1-byte AVS3 aggregate payload header and a 2-byte size indicating the size of the meta-stream, as shown in FIG. 11.

When payload_package_type=3 and x=0, all meta-stream data belong to the same type of data (the same payload_data_type). The aggregate payload includes a general payload header, one aggregate payload header, and at least 2 meta-stream payloads. The structure is shown in FIG. 12.

When payload_package_type=3 and x=1, the RTP payload of the video bitstream data includes an AVS3 general payload header, an AVS3 aggregate extended payload header, and at least 2 meta-stream data. The structure is shown in FIG. 13.
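As an illustration of the payload_package_type=2, x=0 layout with sprop-decoder-order equal to 0, the sketch below walks an aggregate payload and extracts each meta-stream data. Several details are assumptions because the figures are not reproduced here: the aggregate payload header is taken to come before the size field within each unit, size is read in network (big-endian) byte order, and the function name parse_aggregate_units is hypothetical.

```python
import struct
from typing import List, Tuple


def parse_aggregate_units(payload: bytes) -> List[Tuple[int, bytes]]:
    """Return (aggregate_header_byte, meta_stream_data) for each unit.

    ASSUMED layout: [general header: 1 byte] followed by repeated
    [aggregate header: 1 byte][size: 2 bytes, big-endian][data: size bytes].
    """
    units = []
    pos = 1  # skip the 1-byte general payload header
    while pos < len(payload):
        if pos + 3 > len(payload):
            raise ValueError("truncated aggregate header or size field")
        agg_header, size = struct.unpack_from("!BH", payload, pos)
        pos += 3
        if pos + size > len(payload):
            raise ValueError("truncated meta-stream data")
        units.append((agg_header, payload[pos:pos + size]))
        pos += size
    if len(units) < 2:
        raise ValueError("an aggregate payload carries at least 2 meta-stream data")
    return units
```

For payload_package_type=3 the per-unit aggregate header would be dropped in favor of the single shared one, so a parser for that case would read only the 2-byte size before each unit.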

When sprop-decoder-order is greater than 0 (taking the value 1 as an example):

When payload_package_type=2 and x=0, the RTP payload of the video bitstream data contains an AVS3 general payload header, at least 2 meta-stream data, a 1-byte AVS3 aggregate payload header and a 2-byte size (indicating the size of the meta-stream) before each meta-stream data, and an unsigned PID of decode_order_size+1 bytes, as shown in FIG. 14 (decode_order_size=0).

When payload_package_type=2 and x=1, the RTP payload of the video bitstream data contains an AVS3 general payload header, an aggregate extended payload header, at least 2 meta-stream data, a 1-byte AVS3 aggregate payload header and a 2-byte size (indicating the size of the meta-stream) before each meta-stream data, and an unsigned PID of decode_order_size+1 bytes, as shown in FIG. 15 (decode_order_size=0).

When payload_package_type=3 and x=0, the RTP payload contains a general payload header, an aggregate payload header, at least 2 meta-stream data, a 2-byte size (indicating the size of the meta-stream) before each meta-stream data, and an unsigned PID (picture ID) of decode_order_size+1 bytes that indicates the decoding order of the meta-stream data or meta-stream data fragments in the RTP data packet, as shown in FIG. 16 (decode_order_size=0).

When payload_package_type=3 and x=1, the RTP payload contains a general payload header, an aggregate extended payload header, at least 2 meta-stream data, a 2-byte size (indicating the size of the meta-stream) before each meta-stream data, and an unsigned PID (picture ID) of decode_order_size+1 bytes that indicates the decoding order of the meta-stream data or meta-stream data fragments in the RTP data packet, as shown in FIG. 17 (decode_order_size=0).

In the embodiments of the present application, when the extended payload header is not selected in an aggregate payload (payload_package_type=2), 1 byte can be saved. The extensible PID byte design allows the number of bytes occupied by the PID to be adjusted flexibly as the count grows: when the PID is small, 1 byte is enough to describe it; when the PID grows, the sender can either add bytes or roll back to 0, which makes the format more flexible and saves bytes.
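The extensible PID can be sketched as below. This is an illustration under stated assumptions: the PID is serialized in big-endian byte order, decode_order_size is left unchanged when the PID rolls back to 0, and the names encode_pid and next_pid are hypothetical. The upper bound of 3 for decode_order_size follows from it being a 2-bit field.

```python
from typing import Tuple


def encode_pid(pid: int, decode_order_size: int) -> bytes:
    """Encode the PID as an unsigned integer of decode_order_size + 1 bytes."""
    if not 0 <= decode_order_size <= 3:
        raise ValueError("decode_order_size is a 2-bit field (0..3)")
    width = decode_order_size + 1
    if not 0 <= pid < (1 << (8 * width)):
        raise ValueError("PID does not fit in the selected width")
    return pid.to_bytes(width, "big")  # byte order is an assumption


def next_pid(pid: int, decode_order_size: int,
             allow_growth: bool = True) -> Tuple[int, int]:
    """Return (next PID, decode_order_size) after incrementing by 1."""
    max_pid = (1 << (8 * (decode_order_size + 1))) - 1
    if pid < max_pid:
        return pid + 1, decode_order_size
    if allow_growth and decode_order_size < 3:
        # Option 1: widen the PID by one byte and keep counting.
        return pid + 1, decode_order_size + 1
    # Option 2: restart numbering from 0 at the current width.
    return 0, decode_order_size
```

A sender that prefers a fixed one-byte PID would call next_pid with allow_growth=False and accept the wrap to 0 after 255.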

The above embodiments are described using the aggregation of two meta-stream data as an example. In practice, when more than two meta-stream data are aggregated, the principle is the same as described above.

Step 102: Send the RTP data packet to the decoding end.

Referring to FIG. 18, FIG. 18 is a flowchart of a video code stream processing method provided in an embodiment of the present application, applied to a decoding end. As shown in FIG. 18, the method includes the following steps:

Step 1801: Receive an RTP data packet, where the RTP data packet is obtained by performing RTP encapsulation on AVS3 meta-stream data or a meta-stream data fragment;

or

Step 1802: Decode the RTP data packet.

Unpacking means parsing the code stream out of the RTP stream and passing it to the decoder in decoding order. Here, the RTP data packet is parsed to obtain the RTP header and the RTP payload of the video bitstream data, and decoding is then performed based on the RTP payload of the video bitstream data.

Optionally, the decoding end may also obtain indication information, where the indication information indicates how the meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is formed;

The meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is a bitstream segment beginning with a start code, and includes a sequence header, user data after the sequence header, extension data after the sequence header, an intra-predicted image, and an inter-predicted image;

or

The meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is the data between every two adjacent start code prefixes in the video bitstream, including the start code prefix segment, and includes a sequence header, user data after the sequence header, extension data after the sequence header, an I-frame image header, an RL-frame image header, an inter-frame image header, user data after the image header, extension data after the image header, and frame slice data.

The indication information may be indicated explicitly or implicitly.

For example, the optional field video_bitstream_unit is obtained during SDP communication. A value of 1 means that meta-stream data defined according to level 1 is used; a value of 2 means that meta-stream data defined according to level 2 is used. If the parameter is not transmitted, it defaults to 1 (the implicit case).
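The explicit and implicit cases can be sketched as below. The text does not say how video_bitstream_unit is carried in the SDP, so this example assumes a semicolon-separated list of key=value parameters of the kind commonly used in format parameter lines; the function name bitstream_unit_level is hypothetical.

```python
def bitstream_unit_level(params: str) -> int:
    """Return 1 or 2 from an ASSUMED 'key=value; key=value' parameter string."""
    parsed = {}
    for item in params.split(";"):
        key, sep, value = item.strip().partition("=")
        if sep:  # ignore empty or malformed items
            parsed[key.strip()] = value.strip()
    # Implicit indication: an absent parameter defaults to level 1.
    level = int(parsed.get("video_bitstream_unit", "1"))
    if level not in (1, 2):
        raise ValueError("video_bitstream_unit must be 1 or 2")
    return level
```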

In the above decoding process, once the specific RTP payload type has been determined from the general payload header, decoding can proceed using the payload data header and payload data that correspond to that type of RTP payload. The decoding methods for the different types of RTP payload are described below.

1. For a single payload:

If it is determined from the general payload header (for example, the payload package type field payload_package_type) that the RTP payload of the video bitstream data is a single payload and the decoding order indicator takes the first value, the meta-stream data in the payload data is obtained to form a meta-stream data packet, and the meta-stream data packet is decoded;

If it is determined from the general payload header that the RTP payload of the video bitstream data is a single payload and the decoding order indicator takes the second value, the meta-stream data in the payload data is obtained to form a meta-stream data packet; the PID of the RTP payload of the video bitstream data is obtained, the meta-stream data packets are sorted according to the PID, and the sorted meta-stream data packets are decoded.

2. For a fragmented payload:

If it is determined from the general payload header (for example, the payload package type field payload_package_type) that the RTP payload of the video bitstream data is a fragmented payload and the decoding order indicator takes the first value, the starting fragment and the ending fragment are determined from the fragment start field fragment_start and the fragment end field fragment_end in the fragment payload header; the target fragment data determined from the starting fragment and the ending fragment is used to form meta-stream data, a meta-stream data packet is formed from the meta-stream data, and the meta-stream data packet is decoded;

If it is determined from the general payload header that the RTP payload of the video bitstream data is a fragmented payload and the decoding order indicator takes the second value, the starting fragment and the ending fragment are determined from the fragment start field fragment_start and the fragment end field fragment_end in the fragment payload header; the target fragment data determined from the starting fragment and the ending fragment is used to form meta-stream data, and a meta-stream data packet is formed from the meta-stream data; the PID of the RTP payload of the video bitstream data is obtained, the meta-stream data packets are sorted according to the PID, and the sorted meta-stream data packets are decoded.

In the above process, if the extension byte flag x in the general payload header is 1, the fragment extension payload header must also be used in decoding. Its fields are used mainly to determine whether the meta-stream data fragment contains complete image header data and whether the meta-stream data fragment is the slice data of an independent image, which makes it easier to process the stream image by image.
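The reassembly of one meta-stream data from its fragments can be sketched as follows. This is a simplified illustration: the fragments are assumed to be already ordered by RTP sequence number and to belong to one meta-stream data, the tuple layout (seq, fragment_start, fragment_end, data) and the name reassemble are hypothetical, and loss recovery is not attempted.

```python
from typing import Iterable, Tuple


def reassemble(fragments: Iterable[Tuple[int, int, int, bytes]]) -> bytes:
    """Join fragments given as (seq, fragment_start, fragment_end, data)."""
    frags = list(fragments)
    if len(frags) < 2:
        # S and E cannot both be 1 in one packet, so two packets are the minimum.
        raise ValueError("need at least a starting and an ending fragment")
    last = len(frags) - 1
    for i, (seq, start, end, _) in enumerate(frags):
        if start != (1 if i == 0 else 0) or end != (1 if i == last else 0):
            raise ValueError("unexpected fragment_start/fragment_end flags")
        if i > 0 and seq != (frags[i - 1][0] + 1) & 0xFFFF:
            raise ValueError("gap in RTP sequence numbers")
    return b"".join(data for _, _, _, data in frags)
```

The result is the meta-stream data from which the meta-stream data packet handed to the decoder is formed.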

3、对于聚合负载：3. For aggregated load:

若根据所述通用负载头(如负载封装类型字段payload_package_type)确定所述视频位流数据的RTP负载为聚合负载，可按照以下任一方式进行解码：If it is determined according to the general payload header (such as the payload encapsulation type field payload_package_type) that the RTP payload of the video bit stream data is an aggregate payload, decoding may be performed in any of the following ways:

若根据所述通用负载头确定所述视频位流数据的RTP负载为聚合负载，根据所述聚合负载的负载数据头中的位于每个元码流数据之前的聚合负载头以及元码流大小字段对所述元码流数据进行处理，形成元码流数据包，并对所述元码流数据包进行解码。在此过程中，若所述解码顺序指示标识的取值为第一值或所述视频位流数据的RTP负载不包括所述解码顺序指示标识，根据所述元码流大小字段获取元码流数据，得到元码流数据包；若所述解码顺序指示标识的取值为第二值，根据所述元码流大小字段获取元码流数据，得到元码流数据包，获取所述视频位流数据的RTP负载的PID，根据所述PID对所述元码流数据包进行排序，得到排序后的元码流数据包。If it is determined according to the general payload header that the RTP payload of the video bitstream data is an aggregate payload, the metastream data is processed according to the aggregate payload header and the metastream size field located before each metastream data in the payload data header of the aggregate payload to form a metastream data packet, and the metastream data packet is decoded. In this process, if the value of the decoding order indicator is the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, the metastream data is obtained according to the metastream size field to obtain a metastream data packet; if the value of the decoding order indicator is the second value, the metastream data is obtained according to the metastream size field to obtain a metastream data packet, the PID of the RTP payload of the video bitstream data is obtained, the metastream data packets are sorted according to the PID, and the sorted metastream data packets are obtained.

若根据所述通用负载头确定所述视频位流数据的RTP负载为聚合负载，根据所述聚合负载的负载数据头中的聚合扩展负载头、位于每个元码流数据之前的聚合负载头以及元码流大小字段对所述元码流数据进行处理，形成元码流数据包，并对所述元码流数据包进行解码。在此过程中，若所述解码顺序指示标识的取值为第一值或所述视频位流数据的RTP负载不包括所述解码顺序指示标识，根据所述元码流大小字段以及所述扩展负载头获取元码流数据，得到元码流数据包；若所述解码顺序指示标识取值为第二值，根据所述元码流大小字段以及所述扩展负载头获取元码流数据，得到元码流数据包，获取所述视频位流数据的RTP负载的PID，根据所述PID对所述元码流数据包进行排序，得到排序后的元码流数据包。If it is determined according to the general payload header that the RTP payload of the video bitstream data is an aggregate payload, the metastream data is processed according to the aggregate extended payload header in the payload data header of the aggregate payload, the aggregate payload header located before each metastream data, and the metastream size field to form a metastream data packet, and the metastream data packet is decoded. In this process, if the value of the decoding order indicator is the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, the metastream data is obtained according to the metastream size field and the extended payload header to obtain a metastream data packet; if the value of the decoding order indicator is the second value, the metastream data is obtained according to the metastream size field and the extended payload header to obtain a metastream data packet, the PID of the RTP payload of the video bitstream data is obtained, the metastream data packets are sorted according to the PID to obtain sorted metastream data packets.

若根据所述通用负载头确定所述视频位流数据的RTP负载为聚合负载，根据所述聚合负载的负载数据头中的聚合扩展负载头对所述元码流数据进行处理，形成元码流数据包，并对所述元码流数据包进行解码。在此过程中，若所述解码顺序指示标识的取值为第一值或所述视频位流数据的RTP负载不包括所述解码顺序指示标识，根据所述聚合扩展负载头获取元码流数据，得到元码流数据包；若所述解码顺序指示标识取值为第二值，根据所述元聚合扩展负载头获取元码流数据，得到元码流数据包，获取所述视频位流数据的RTP负载的PID，根据所述PID对所述元码流数据包进行排序，得到排序后的元码流数据包。If it is determined according to the general payload header that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the aggregate extended payload header in the payload data header of the aggregate payload to form a meta-stream data packet, and the meta-stream data packet is decoded. In this process, if the value of the decoding order indicator is the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, the meta-stream data is obtained according to the aggregate extended payload header to obtain a meta-stream data packet; if the value of the decoding order indicator is the second value, the meta-stream data is obtained according to the meta-aggregate extended payload header to obtain a meta-stream data packet, the PID of the RTP payload of the video bitstream data is obtained, the meta-stream data packets are sorted according to the PID, and the sorted meta-stream data packets are obtained.

若根据所述通用负载头确定所述视频位流数据的RTP负载为聚合负载，根据所述聚合负载的负载数据头中的聚合扩展负载头以及元码流大小字段对所述元码流数据进行处理，形成元码流数据包，并对所述元码流数据包进行解码。在此过程中，若所述解码顺序指示标识的取值为第一值或所述视频位流数据的RTP负载不包括所述解码顺序指示标识，根据所述聚合扩展负载头以及元码流大小字段获取元码流数据，得到元码流数据包；若所述解码顺序指示标识取值为第二值，根据所述聚合扩展负载头以及元码流大小字段获取元码流数据，得到元码流数据包，获取所述视频位流数据的RTP负载的PID，根据所述PID对所述元码流数据包进行排序，得到排序后的元码流数据包。If it is determined according to the general payload header that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the aggregate extended payload header and the meta-stream size field in the payload data header of the aggregate payload to form a meta-stream data packet, and the meta-stream data packet is decoded. In this process, if the value of the decoding order indicator is the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, the meta-stream data is obtained according to the aggregate extended payload header and the meta-stream size field to obtain a meta-stream data packet; if the value of the decoding order indicator is the second value, the meta-stream data is obtained according to the aggregate extended payload header and the meta-stream size field to obtain a meta-stream data packet, the PID of the RTP payload of the video bitstream data is obtained, the meta-stream data packets are sorted according to the PID, and the sorted meta-stream data packets are obtained.

If the general payload header indicates that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the meta-stream size field in the payload data header of the aggregate payload to form meta-stream data packets, and the meta-stream data packets are decoded. In this process, if the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, the meta-stream data is obtained according to the meta-stream size field to form the meta-stream data packets. If the decoding order indicator takes the second value, the meta-stream data is obtained according to the meta-stream size field to form the meta-stream data packets, the PID of the RTP payload of the video bitstream data is obtained, and the meta-stream data packets are sorted according to the PID to obtain sorted meta-stream data packets.

If the general payload header indicates that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the aggregate payload header in the payload data header of the aggregate payload to form meta-stream data packets, and the meta-stream data packets are decoded. In this process, if the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, the meta-stream data is obtained according to the aggregate payload header to form the meta-stream data packets. If the decoding order indicator takes the second value, the meta-stream data is obtained according to the aggregate payload header to form the meta-stream data packets, the PID of the RTP payload of the video bitstream data is obtained, and the meta-stream data packets are sorted according to the PID to obtain sorted meta-stream data packets.

The first value may be 0 and the second value may be 1, for example.
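The aggregate-payload handling described above can be illustrated with a minimal Python sketch. It assumes that each piece of meta-stream data is preceded by a 2-byte big-endian size field and that the PIDs have already been extracted into a list; the size-field width, the byte order, and the names `split_aggregate` and `order_units` are illustrative assumptions, not definitions taken from this application.

```python
from typing import List, Optional, Sequence

SIZE_FIELD_BYTES = 2  # assumption: the text above does not state the size-field width


def split_aggregate(payload_data: bytes) -> List[bytes]:
    """Split aggregate payload data into meta-stream units using the size fields."""
    units: List[bytes] = []
    pos = 0
    while pos < len(payload_data):
        if pos + SIZE_FIELD_BYTES > len(payload_data):
            raise ValueError("truncated size field")
        size = int.from_bytes(payload_data[pos:pos + SIZE_FIELD_BYTES], "big")
        pos += SIZE_FIELD_BYTES
        if pos + size > len(payload_data):
            raise ValueError("truncated meta-stream unit")
        units.append(payload_data[pos:pos + size])
        pos += size
    return units


def order_units(units: Sequence[bytes],
                decode_order_flag: Optional[int],
                pids: Optional[Sequence[int]] = None) -> List[bytes]:
    """Return the units in decoding order.

    decode_order_flag is None when the indicator is absent, 0 for the
    first value and 1 for the second value (the example values above).
    """
    if decode_order_flag in (None, 0):
        return list(units)  # decoding order equals transmission order
    if pids is None or len(pids) != len(units):
        raise ValueError("one PID per unit is required when reordering")
    # Sort on the PID only, so equal PIDs keep their transmission order.
    return [unit for _, unit in sorted(zip(pids, units), key=lambda pair: pair[0])]
```

The sketch ignores PID wrap-around, which a real receiver would need to handle once the PID counter overflows its N bytes.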

Referring to FIG. 19, FIG. 19 is a structural diagram of a video code stream processing apparatus provided in an embodiment of the present application, applied to an encoding end. As shown in FIG. 19, the video code stream processing apparatus includes:

A first processing module 1901, configured to perform RTP encapsulation on AVS3 meta-stream data or meta-stream data fragments to obtain an RTP data packet; and a first sending module 1902, configured to send the RTP data packet to a decoding end. The RTP data packet includes: an RTP header and an RTP payload of video bitstream data;

or

Optionally, the apparatus may further include:

A segmentation module, configured to, if the size of the meta-stream data transmitted at the IP layer is larger than the maximum transmission unit (MTU), segment the meta-stream data transmitted at the IP layer to obtain meta-stream data fragments.

A determination module, configured to determine indication information, where the indication information indicates how the meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is formed;
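The behaviour of the segmentation module above can be sketched in Python as follows. The default `header_overhead` of 40 bytes (20 bytes IPv4, 8 bytes UDP, 12 bytes RTP fixed header) and the function name `segment_unit` are assumptions made for illustration; the application itself only states that segmentation happens when the data is larger than the MTU.

```python
from typing import List


def segment_unit(unit: bytes, mtu: int = 1500, header_overhead: int = 40) -> List[bytes]:
    """Split one piece of meta-stream data into fragments that fit the MTU.

    header_overhead is a hypothetical allowance for the IP, UDP and RTP
    headers; payload-specific headers would reduce the budget further.
    """
    budget = mtu - header_overhead
    if budget <= 0:
        raise ValueError("MTU is too small for the assumed header overhead")
    if len(unit) <= budget:
        return [unit]  # fits in one packet, no segmentation needed
    return [unit[i:i + budget] for i in range(0, len(unit), budget)]
```

For example, a 3000-byte unit with the defaults yields fragments of 1460, 1460 and 80 bytes, and concatenating them restores the original data.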

or

Optionally, if the RTP payload of the video bitstream data includes a single payload: the payload data header of the single payload includes a single payload header, which indicates information related to the meta-stream data; and the payload data of the single payload includes one piece of meta-stream data.

Optionally, the single payload header includes:

A payload data type field payload_data_type, which indicates the data type of the meta-stream data;

A payload decoding order field decode_order_size, which indicates the number of bytes required by the PID that describes the decoding order of the meta-stream data.

Optionally, if the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, the decoding order of the meta-stream data is the same as the transmission order;

If the decoding order indicator takes the second value, the decoding order of the meta-stream data differs from the transmission order, and the RTP payload of the video bitstream data further includes an N-byte PID that indicates the decoding order of the meta-stream data, where N equals the number of bytes indicated by the payload decoding order field decode_order_size plus 1.
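The relationship N = decode_order_size + 1 can be made concrete with a short Python sketch. The big-endian byte order and the helper name `read_pid` are assumptions for illustration; the text above fixes only the length of the PID, not its encoding.

```python
from typing import Tuple


def read_pid(buf: bytes, offset: int, decode_order_size: int) -> Tuple[int, int]:
    """Read an N-byte PID starting at offset, where N = decode_order_size + 1.

    Returns (pid, next_offset). Big-endian encoding is an assumption.
    """
    n = decode_order_size + 1
    end = offset + n
    if end > len(buf):
        raise ValueError("buffer too short for the PID")
    return int.from_bytes(buf[offset:end], "big"), end
```

With decode_order_size equal to 0 the PID occupies one byte, and with decode_order_size equal to 1 it occupies two bytes.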

Optionally, if the RTP payload of the video bitstream data includes a fragment payload: the payload data header of the fragment payload includes a fragment payload header, or includes a fragment payload header and a fragment extended payload header, where the fragment payload header and/or the fragment extended payload header indicate information related to the meta-stream data fragment; and the payload data includes one meta-stream data fragment.

Optionally, the fragment payload header includes:

A payload data type field payload_data_type, which indicates the data type of the meta-stream data to which the meta-stream data fragment belongs;

A fragment start field fragment_start, which indicates the starting fragment of the meta-stream data;

A fragment end field fragment_end, which indicates the ending fragment of the meta-stream data;

A payload decoding order field decode_order_size, which indicates the number of bytes required by the PID that describes the decoding order of the meta-stream data fragment.

Optionally, if the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, the decoding order of the meta-stream data fragments is the same as the transmission order;

If the decoding order indicator takes the second value, the decoding order of the meta-stream data fragments differs from the transmission order, and the RTP payload of the video bitstream data further includes an N-byte PID that indicates the decoding order of the meta-stream data fragment, where N equals the number of bytes indicated by the payload decoding order field decode_order_size plus 1.

The fragment extended payload header includes:

A picture header field pic_header, which indicates whether the meta-stream data fragment includes complete picture header data, or is reserved;

A full_patch_info field, which indicates whether the meta-stream data fragment is frame slice data of an independent picture.

Optionally, if the RTP payload of the video bitstream data includes an aggregate payload, the payload data header of the aggregate payload includes: an aggregate payload header and a meta-stream size field; or an aggregate payload header, an aggregate extended payload header and a meta-stream size field; or an aggregate extended payload header; or an aggregate extended payload header and a meta-stream size field; or a meta-stream size field; or an aggregate payload header. The aggregate payload header and/or the aggregate extended payload header indicate information related to the meta-stream data;

Optionally, the aggregate payload header includes:

A payload data type field payload_data_type, which indicates the data type of the meta-stream data;

A payload decoding order field decode_order_size, which indicates the number of bytes required by the PID that describes the decoding order of the meta-stream data.

If the decoding order indicator takes the second value, the decoding order of the meta-stream data differs from the transmission order, and the RTP payload of the video bitstream data further includes a PID that indicates the decoding order of the meta-stream data.

Optionally, the aggregate extended payload header includes:

A sequence header field sequence_header_index, which indicates whether the aggregate payload includes sequence header data;

A random access index field AU_index, which indicates whether the aggregate payload includes a random access frame.

Optionally, the general payload header includes:

A payload package type field payload_package_type, which indicates the RTP payload type, the RTP payload type being a single payload, a fragment payload, or an aggregate payload;

A temporal identifier field temporal_id, which indicates the identifier of the temporal layer to which the RTP payload belongs;

A library dependency field library_dependency, which indicates the type of elementary stream to which the RTP payload belongs;

An x field, which indicates whether an extension header is included, the extension header being a fragment extended payload header or an aggregate extended payload header.
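As an illustration of how a receiver might use these four fields, the following Python sketch unpacks them from a single byte. The one-byte layout, the bit widths (2, 3, 2 and 1 bits, most significant bit first) and the numeric values of `PackageType` are all hypothetical, chosen only to make the example runnable; this passage does not define them.

```python
from dataclasses import dataclass
from enum import IntEnum


class PackageType(IntEnum):
    # Numeric values are hypothetical placeholders.
    SINGLE = 0
    FRAGMENT = 1
    AGGREGATE = 2


@dataclass(frozen=True)
class GeneralPayloadHeader:
    payload_package_type: PackageType
    temporal_id: int
    library_dependency: int
    x: bool  # True when an extended payload header follows


def parse_general_header(byte0: int) -> GeneralPayloadHeader:
    """Unpack a hypothetical one-byte general payload header.

    Assumed layout, MSB first:
    payload_package_type(2) | temporal_id(3) | library_dependency(2) | x(1)
    """
    return GeneralPayloadHeader(
        payload_package_type=PackageType((byte0 >> 6) & 0x3),
        temporal_id=(byte0 >> 3) & 0x7,
        library_dependency=(byte0 >> 1) & 0x3,
        x=bool(byte0 & 0x1),
    )
```

A depacketizer would branch on `payload_package_type` to select the single, fragment or aggregate handling, and consult `x` before attempting to read an extended payload header. In this sketch an unassigned type value raises `ValueError`.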

Optionally, the data type of the meta-stream data, or of the meta-stream data to which the meta-stream data fragment belongs, includes:

A sequence header, video extension data, user data after the sequence header, an I-frame picture, an RL-frame picture, and an inter-frame picture; or

The data type of the meta-stream data, or of the meta-stream data to which the meta-stream data fragment belongs, includes:

A sequence header, video extension data, user data after the sequence header, an I-frame picture header, an RL-frame picture header, a P-frame picture header, a B-frame picture header, video extension data after the picture header, user data after the picture header, and frame slice data.

Optionally, the RTP header includes one or more of the following:

A marker field M, which indicates the boundary of a video frame;

A payload type field PT, which identifies the RTP session and indicates the payload format;

A timestamp field Timestamp, which indicates the sampling time of the RTP data packet;

A sequence number field Sequence number, which indicates the transmission order of RTP data packets.
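The four fields listed above sit in the 12-byte fixed RTP header defined by RFC 3550. The following Python sketch extracts them using only the standard `struct` module; the names `RtpFixedHeader` and `parse_rtp_header` are illustrative, and the sketch does not handle CSRC entries or RTP header extensions.

```python
import struct
from typing import NamedTuple


class RtpFixedHeader(NamedTuple):
    version: int
    marker: bool          # M: frame boundary, as described above
    payload_type: int     # PT
    sequence_number: int  # transmission order
    timestamp: int        # sampling time
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpFixedHeader:
    """Parse the 12-byte fixed RTP header (RFC 3550 layout)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the RTP fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpFixedHeader(
        version=b0 >> 6,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

Everything after the fixed header (and after any CSRC list or header extension) is the RTP payload of the video bitstream data that the remaining headers in this section describe.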

The apparatus provided in this embodiment of the present application can carry out the above method embodiments; its implementation principle and technical effects are similar and are not repeated here.

Referring to FIG. 20, FIG. 20 is a structural diagram of a video code stream processing apparatus provided in an embodiment of the present application, applied to a decoding end. As shown in FIG. 20, the video code stream processing apparatus includes:

A first receiving module 2001, configured to receive an RTP data packet, where the RTP data packet is obtained by performing RTP encapsulation on AVS3 meta-stream data or meta-stream data fragments; and a first processing module 2002, configured to decode the RTP data packet.

or

An acquisition module, configured to acquire indication information, where the indication information indicates how the meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is formed;

The meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is a bitstream segment beginning with a start code, and includes a sequence header, user data after the sequence header, extension data after the sequence header, an intra-predicted picture, and an inter-predicted picture; or

Optionally, the first processing module is further configured to:

Parse the RTP data packet to obtain an RTP header and an RTP payload of video bitstream data;

Decode based on the RTP payload of the video bitstream data.

If the general payload header indicates that the RTP payload of the video bitstream data is a single payload, and the decoding order indicator takes the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, obtain the meta-stream data in the payload data to form a meta-stream data packet, and decode the meta-stream data packet;

If the general payload header indicates that the RTP payload of the video bitstream data is a fragment payload, and the decoding order indicator takes the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, determine the starting fragment and the ending fragment according to the fragment start field fragment_start and the fragment end field fragment_end in the fragment payload header; form meta-stream data from the target fragment data determined by the starting fragment and the ending fragment, form a meta-stream data packet from the meta-stream data, and decode the meta-stream data packet;

If the general payload header indicates that the RTP payload of the video bitstream data is an aggregate payload, process the meta-stream data according to the aggregate payload header and the meta-stream size field that precede each piece of meta-stream data in the payload data header of the aggregate payload to form meta-stream data packets, and decode the meta-stream data packets; or

If the general payload header indicates that the RTP payload of the video bitstream data is an aggregate payload, process the meta-stream data according to the aggregate extended payload header in the payload data header of the aggregate payload, together with the aggregate payload header and the meta-stream size field that precede each piece of meta-stream data, to form meta-stream data packets, and decode the meta-stream data packets; or

If the general payload header indicates that the RTP payload of the video bitstream data is an aggregate payload, process the meta-stream data according to the aggregate extended payload header in the payload data header of the aggregate payload to form meta-stream data packets, and decode the meta-stream data packets; or

If the general payload header indicates that the RTP payload of the video bitstream data is an aggregate payload, process the meta-stream data according to the aggregate extended payload header and the meta-stream size field in the payload data header of the aggregate payload to form meta-stream data packets, and decode the meta-stream data packets; or

If the general payload header indicates that the RTP payload of the video bitstream data is an aggregate payload, process the meta-stream data according to the meta-stream size field in the payload data header of the aggregate payload to form meta-stream data packets, and decode the meta-stream data packets; or

If the general payload header indicates that the RTP payload of the video bitstream data is an aggregate payload, process the meta-stream data according to the aggregate payload header in the payload data header of the aggregate payload to form meta-stream data packets, and decode the meta-stream data packets.

If the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, obtain the meta-stream data according to the meta-stream size field to form meta-stream data packets;

If the decoding order indicator takes the second value, obtain the meta-stream data according to the meta-stream size field to form meta-stream data packets, obtain the PID of the RTP payload of the video bitstream data, and sort the meta-stream data packets according to the PID to obtain sorted meta-stream data packets.

If the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, obtain the meta-stream data according to the meta-stream size field and the extended payload header to form meta-stream data packets;

If the decoding order indicator takes the second value, obtain the meta-stream data according to the meta-stream size field and the extended payload header to form meta-stream data packets, obtain the PID of the RTP payload of the video bitstream data, and sort the meta-stream data packets according to the PID to obtain sorted meta-stream data packets.

If the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, obtain the meta-stream data according to the aggregate extended payload header to form meta-stream data packets;

If the decoding order indicator takes the second value, obtain the meta-stream data according to the aggregate extended payload header to form meta-stream data packets, obtain the PID of the RTP payload of the video bitstream data, and sort the meta-stream data packets according to the PID to obtain sorted meta-stream data packets.

If the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, obtain the meta-stream data according to the aggregate extended payload header and the meta-stream size field to form meta-stream data packets;

If the decoding order indicator takes the second value, obtain the meta-stream data according to the aggregate extended payload header and the meta-stream size field to form meta-stream data packets, obtain the PID of the RTP payload of the video bitstream data, and sort the meta-stream data packets according to the PID to obtain sorted meta-stream data packets.

If the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, obtain the meta-stream data according to the aggregate payload header to form meta-stream data packets;

If the decoding order indicator takes the second value, obtain the meta-stream data according to the aggregate payload header to form meta-stream data packets, obtain the PID of the RTP payload of the video bitstream data, and sort the meta-stream data packets according to the PID to obtain sorted meta-stream data packets.
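For the fragment-payload branch of the decoding end described above, reassembly driven by fragment_start and fragment_end can be sketched in Python as follows. The sketch covers only the case in which the decoding order matches the transmission order, and it assumes the fragments have already been parsed into `(fragment_start, fragment_end, data)` tuples; that tuple shape, the function name `reassemble`, and the policy of dropping fragments that arrive without a starting fragment are illustrative assumptions.

```python
from typing import Iterable, List, Optional, Tuple

Fragment = Tuple[bool, bool, bytes]  # (fragment_start, fragment_end, data)


def reassemble(fragments: Iterable[Fragment]) -> List[bytes]:
    """Rebuild meta-stream data from fragments given in transmission order."""
    units: List[bytes] = []
    current: Optional[bytearray] = None
    for start, end, data in fragments:
        if start:
            current = bytearray(data)   # starting fragment opens a new unit
        elif current is None:
            continue                    # no starting fragment seen: drop
        else:
            current.extend(data)        # middle or ending fragment
        if end and current is not None:
            units.append(bytes(current))  # ending fragment closes the unit
            current = None
    return units
```

A production receiver would also check RTP sequence numbers for gaps before appending, so that a lost middle fragment does not yield corrupted meta-stream data.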

It should be noted that the division into units in the embodiments of the present application is illustrative and is only a division by logical function; other divisions are possible in an actual implementation. In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.

If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a processor-readable storage medium. On this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

An embodiment of the present application provides a communication device, including: a memory, a processor, and a program stored in the memory and executable on the processor; the processor is configured to read the program in the memory to implement the steps of the video code stream processing method described above.

An embodiment of the present application further provides a readable storage medium storing a program which, when executed by a processor, implements each process of the above video code stream processing method embodiments and achieves the same technical effects; to avoid repetition, details are not repeated here. The readable storage medium may be any available medium or data storage device accessible to the processor, including but not limited to magnetic storage (such as floppy disks, hard disks, magnetic tape, and magneto-optical (MO) disks), optical storage (such as compact discs (CD), digital video discs (DVD), Blu-ray discs (BD), and high-definition versatile discs (HVD)), and semiconductor memory (such as ROM, erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), non-volatile memory (NAND flash), and solid-state drives (SSD)).

It should be noted that, in this document, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or apparatus. Without further limitation, an element introduced by the phrase "comprises a ..." does not exclude the presence of additional identical elements in the process, method, article or apparatus that includes the element.

From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. On this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, may be embodied as a software product that is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present application.

The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described, which are illustrative rather than restrictive. Guided by the present application, those of ordinary skill in the art may devise many other forms without departing from the purpose of the present application or the scope protected by the claims, all of which fall within the protection of the present application.

Claims

A video code stream processing method, applied to an encoding end, comprising:

performing RTP encapsulation on meta-stream data or a meta-stream data fragment to obtain an RTP data packet;

sending the RTP data packet to a decoding end;

wherein the RTP data packet includes: an RTP header and an RTP payload of video bitstream data;

the RTP payload of the video bitstream data includes: a general payload header, a decoding order indicator, payload data headers corresponding to different types of RTP payloads, and payload data;

or

the RTP payload of the video bitstream data includes: a general payload header, payload data headers corresponding to different types of RTP payloads, and payload data.

The method according to claim 1, wherein the method further comprises:

if the size of the meta-stream data transmitted at the IP layer is larger than the maximum transmission unit (MTU), segmenting the meta-stream data transmitted at the IP layer to obtain the meta-stream data fragments.

The method according to claim 1, wherein the method further comprises:

determining indication information, wherein the indication information indicates how the meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is formed;

the meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is a bitstream segment beginning with a start code, and includes a sequence header, user data after the sequence header, extension data after the sequence header, an intra-predicted picture, and an inter-predicted picture;

or

the meta-stream data, or the meta-stream data to which the meta-stream data fragment belongs, is a segment consisting of the data between every two adjacent start code prefixes in the video bitstream and including the start code prefix, and includes a sequence header, user data after the sequence header, extension data after the sequence header, an I-frame picture header, an RL-frame picture header, an inter-frame picture header, user data after the picture header, extension data after the picture header, and frame slice data.

The method according to claim 1, wherein, if the RTP payload of the video bitstream data includes a single payload:

the payload data header of the single payload includes: a single payload header, which indicates information related to the meta-stream data;

the payload data of the single payload includes: one piece of meta-stream data.

The method according to claim 4, wherein the single payload header comprises:

a payload data type field payload_data_type, which indicates the data type of the meta-stream data;

a payload decoding order field decode_order_size, which indicates the number of bytes required by the picture identifier (PID) that describes the decoding order of the meta-stream data.

The method according to claim 5, wherein

if the decoding order indicator takes the first value, or the RTP payload of the video bitstream data does not include the decoding order indicator, the decoding order of the meta-stream data is the same as the transmission order;

if the decoding order indicator takes the second value, the decoding order of the meta-stream data differs from the transmission order, and the RTP payload of the video bitstream data further includes: an N-byte PID that indicates the decoding order of the meta-stream data, where N equals the number of bytes indicated by the payload decoding order field decode_order_size plus 1.

The method according to claim 1, wherein, if the RTP payload of the video bitstream data includes a fragment payload:

the payload data header of the fragment payload includes: a fragment payload header, or a fragment payload header and a fragment extended payload header, wherein the fragment payload header and/or the fragment extended payload header indicate information related to the meta-stream data fragment;

the payload data of the fragment payload includes: one meta-stream data fragment.

The method according to claim 7, wherein the fragment payload header comprises:

a payload data type field payload_data_type, which indicates the data type of the meta-stream data to which the meta-stream data fragment belongs;

a fragment start field fragment_start, which indicates the starting fragment of the meta-stream data;

a fragment end field fragment_end, which indicates the ending fragment of the meta-stream data;

a payload decoding order field decode_order_size, which indicates the number of bytes required by the PID that describes the decoding order of the meta-stream data fragment.

The method according to claim 8, wherein

If the value of the decoding order indication identifier is the first value, or the RTP payload of the video bitstream data does not include the decoding order indication identifier, it indicates that the decoding order of the meta-codestream data fragments is consistent with the transmission order;

If the value of the decoding order indication identifier is the second value, it means that the decoding order of the meta-stream data fragments is different from the transmission order, and the RTP payload of the video bitstream data also includes: an N-byte PID, used to indicate the decoding order of the meta-stream data fragments, where N is equal to the sum of the number of bytes indicated by the payload decoding order field decode_order_size and 1.

The method according to claim 8, wherein the fragment extension payload header comprises:

The pic_header field is used to indicate whether the meta-stream data fragment includes the complete picture header data or is reserved;

The full_patch_info field is used to indicate whether the meta-stream data fragment is independent image frame data.

The method according to claim 1, wherein if the RTP payload of the video bitstream data includes an aggregation payload;

The payload data header of the aggregated payload includes: an aggregated payload header and a meta-codestream size field; or, the payload data header of the aggregated payload includes: an aggregated payload header, an aggregated extended payload header and a meta-codestream size field; or, the payload data header of the aggregated payload includes: an aggregated extended payload header and a meta-codestream size field; or, the payload data header of the aggregated payload includes: a meta-codestream size field; or, the payload data header of the aggregated payload includes: an aggregated payload header;

Wherein, the aggregate payload header and/or the aggregate extended payload header is used to indicate information related to the meta-stream data;

The payload data of the aggregate payload includes: at least two meta-stream data.

The method according to claim 11, wherein the aggregate payload header comprises:

The payload decoding order field decode_order_size is used to indicate the number of bytes required by the PID that describes the decoding order of the meta-codestream data.

The method according to claim 12, wherein

The method according to claim 11, wherein the aggregate extended payload header comprises:

A sequence header field sequence_header_index is used to indicate whether the aggregated payload includes sequence header data;

The random access index field AU_index is used to indicate whether the aggregated payload includes a random access frame.

The method according to any one of claims 1 to 14, wherein the universal payload header comprises:

The payload encapsulation type field payload_package_type is used to indicate the RTP payload type, which includes a single payload, a fragmented payload, or an aggregated payload;

The temporal_id field is used to indicate the time layer identifier to which the RTP payload belongs;

The library dependency field library_dependency is used to indicate the type of elementary stream to which the RTP payload belongs;

The x field is used to indicate whether an extension header is included, and the extension header includes a fragment extension payload header or an aggregate extension payload header.
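The fields of the universal payload header listed above could be modelled as in the following sketch. The field widths and numeric codes are not given in this section, so the enumeration values are hypothetical placeholders, and `extension_header_kind` is an illustrative helper rather than part of the claimed method.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PackageType(Enum):
    # Hypothetical numeric codes; the actual values are not stated here.
    SINGLE = 0
    FRAGMENT = 1
    AGGREGATE = 2


@dataclass(frozen=True)
class UniversalPayloadHeader:
    payload_package_type: PackageType  # single, fragment or aggregate payload
    temporal_id: int                   # time layer identifier
    library_dependency: int            # type of elementary stream
    x: bool                            # whether an extension header follows


def extension_header_kind(header: UniversalPayloadHeader) -> Optional[str]:
    """Name the extension header announced by the x field, if any."""
    if not header.x:
        return None
    if header.payload_package_type is PackageType.FRAGMENT:
        return "fragment extension payload header"
    if header.payload_package_type is PackageType.AGGREGATE:
        return "aggregate extension payload header"
    # The text names no extension header for a single payload.
    return None


frag = UniversalPayloadHeader(PackageType.FRAGMENT, temporal_id=0,
                              library_dependency=0, x=True)
kind = extension_header_kind(frag)
```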

The method according to any one of claims 1 to 14, wherein

The data type of the meta-stream data or the data type of the meta-stream data to which the meta-stream data fragment belongs includes:

Sequence header, video extension data, user data after the sequence header, I frame image, RL frame image, inter-frame image; or

Sequence header, video extension data, user data after the sequence header, I frame image header, RL frame image header, P frame image header, B frame image header, video extension data after the image header, user data after the image header, frame slice data.

The method according to claim 1, wherein the RTP header includes one or more of the following:

A marker field M is used to indicate the boundary of a video frame;

The payload type field PT is used to identify the RTP session and indicate the payload format;

The timestamp field Timestamp is used to indicate the sampling time of the RTP data packet;

The sequence number field, Sequence number, is used to indicate the transmission order of RTP data packets.
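For reference, the M, PT, Sequence number and Timestamp fields occupy fixed positions in the 12-byte RTP fixed header defined by RFC 3550. The sketch below reads them from a packet; the names `RtpHeader` and `parse_rtp_header` are illustrative, the SSRC field comes from RFC 3550 rather than from the claim, and the sample packet (dynamic payload type 96) is made up for the example.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    marker: bool          # M: boundary of a video frame
    payload_type: int     # PT: payload format
    sequence_number: int  # transmission order
    timestamp: int        # sampling time
    ssrc: int             # synchronization source (RFC 3550)


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header (CSRC list and extensions ignored)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# Sample packet: version 2, marker set, PT 96, seq 1234, timestamp 90000.
sample = (bytes([0x80, 0xE0]) + (1234).to_bytes(2, "big")
          + (90000).to_bytes(4, "big") + (0xDEADBEEF).to_bytes(4, "big"))
header = parse_rtp_header(sample)
```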

A video code stream processing method, applied to a decoding end, comprising:

Receiving an RTP data packet, wherein the RTP data packet is obtained by performing RTP encapsulation on the meta-stream data or the meta-stream data fragments;

Decoding the RTP data packet;

or

The method according to claim 18, wherein the method further comprises:

Acquire indication information, wherein the indication information is used to indicate a formation method of the meta-stream data or the meta-stream data to which the meta-stream data fragment belongs;

or

The method according to claim 18, wherein the decoding of the RTP data packet comprises:

Parsing the RTP data packet to obtain an RTP packet header and an RTP payload of video bit stream data;

Decoding is performed based on the RTP payload of the video bit stream data.

The method according to claim 20, wherein the decoding based on the RTP payload of the video bit stream data comprises:

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is a single payload and the value of the decoding order indicator is the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, obtaining the meta-codestream data in the payload data, obtaining a meta-codestream data packet, and decoding the meta-codestream data packet;

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is a single payload and the value of the decoding order indication identifier is the second value, obtain the meta-codestream data in the payload data to obtain a meta-codestream data packet; obtain the PID of the RTP payload of the video bitstream data, sort the meta-codestream data packets according to the PID, and decode the sorted meta-codestream data packets.
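The reordering step in the second-value case can be sketched as a sort keyed on the PID. This is a simplified illustration: it assumes that ascending PID corresponds to decoding order and that the PID does not wrap around; `MetaPacket` and `order_for_decoding` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class MetaPacket:
    pid: int     # picture identifier describing decoding order
    data: bytes  # meta-codestream data


def order_for_decoding(packets: Iterable[MetaPacket], reorder: bool) -> List[MetaPacket]:
    """Return packets in decoding order.

    reorder=False models the first value (or an absent indicator): the
    transmission order is already the decoding order.
    reorder=True models the second value: sort by PID.
    """
    if not reorder:
        return list(packets)
    return sorted(packets, key=lambda p: p.pid)


received = [MetaPacket(2, b"B"), MetaPacket(0, b"I"), MetaPacket(1, b"P")]
in_order = order_for_decoding(received, reorder=True)
```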

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is a fragment payload and the value of the decoding order indicator is the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, determine the starting fragment and the ending fragment according to the fragment start field fragment_start and the fragment end field fragment_end in the fragment payload header; form meta-stream data using the target code stream fragment data determined according to the starting fragment and the ending fragment, form a meta-stream data packet according to the meta-stream data, and decode the meta-stream data packet;

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is a fragment payload and the value of the decoding order indication identifier is the second value, determine the starting fragment and the ending fragment according to the fragment start field fragment_start and the fragment end field fragment_end in the fragment payload header; use the target code stream fragment data determined according to the starting fragment and the ending fragment to form meta-codestream data, and form a meta-codestream data packet according to the meta-codestream data; obtain the PID of the RTP payload of the video bitstream data, sort the meta-codestream data packets according to the PID, and decode the sorted meta-codestream data packets.
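The reassembly of meta-codestream data from fragments can be sketched as below. The sketch assumes that fragment_start and fragment_end behave as boolean flags and that fragments are supplied in transmission order without loss; `Fragment` and `reassemble` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Fragment:
    fragment_start: bool  # True for the starting fragment
    fragment_end: bool    # True for the ending fragment
    data: bytes           # meta-codestream data fragment


def reassemble(fragments: Iterable[Fragment]) -> List[bytes]:
    """Concatenate fragments from each start flag through the matching end flag."""
    units: List[bytes] = []
    current: Optional[bytearray] = None
    for frag in fragments:
        if frag.fragment_start:
            current = bytearray()
        if current is None:
            continue  # no starting fragment seen yet; skip this fragment
        current += frag.data
        if frag.fragment_end:
            units.append(bytes(current))
            current = None
    return units


frags = [
    Fragment(True, False, b"ab"),
    Fragment(False, False, b"cd"),
    Fragment(False, True, b"ef"),
    Fragment(False, True, b"zz"),  # orphan: its starting fragment is missing
]
rebuilt = reassemble(frags)
```

Fragments whose starting fragment was never seen are dropped here; how a decoder should handle such loss is outside the scope of this sketch.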

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the aggregate payload header and the meta-stream size field located before each meta-stream data in the payload data header of the aggregate payload to form a meta-stream data packet, and the meta-stream data packet is decoded; or

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the aggregate extended payload header in the payload data header of the aggregate payload, the aggregate payload header located before each meta-stream data, and the meta-stream size field to form a meta-stream data packet, and the meta-stream data packet is decoded; or

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the aggregate extended payload header in the payload data header of the aggregate payload to form a meta-stream data packet, and the meta-stream data packet is decoded; or

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the aggregate extended payload header and the meta-stream size field in the payload data header of the aggregate payload to form a meta-stream data packet, and the meta-stream data packet is decoded; or

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the meta-stream size field in the payload data header of the aggregate payload to form a meta-stream data packet, and the meta-stream data packet is decoded; or

If it is determined according to the universal payload header that the RTP payload of the video bitstream data is an aggregate payload, the meta-stream data is processed according to the aggregate payload header in the payload data header of the aggregate payload to form a meta-stream data packet, and the meta-stream data packet is decoded.
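For the variants in which a meta-stream size field precedes each meta-stream data, the splitting of an aggregate payload can be sketched as follows. The width of the size field is not stated in this section; the sketch assumes a 2-byte big-endian field, and `split_aggregate` is a hypothetical name.

```python
from typing import List


def split_aggregate(payload_data: bytes, size_field_bytes: int = 2) -> List[bytes]:
    """Split aggregate payload data into individual meta-stream data units.

    Assumption: each unit is preceded by a big-endian size field of
    `size_field_bytes` bytes giving the unit length in bytes.
    """
    units: List[bytes] = []
    offset = 0
    while offset < len(payload_data):
        if offset + size_field_bytes > len(payload_data):
            raise ValueError("truncated size field")
        size = int.from_bytes(payload_data[offset:offset + size_field_bytes], "big")
        offset += size_field_bytes
        if offset + size > len(payload_data):
            raise ValueError("truncated meta-stream data")
        units.append(payload_data[offset:offset + size])
        offset += size
    return units


# Two units of 2 and 3 bytes, each preceded by its size.
aggregated = split_aggregate(b"\x00\x02AB\x00\x03CDE")
```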

The method according to claim 23, wherein the meta-stream data is processed according to the aggregate payload header and the meta-stream size field located before each meta-stream data in the payload data header of the aggregate payload to form a meta-stream data packet, comprising:

If the value of the decoding order indicator is the first value or the RTP payload of the video bit stream data does not include the decoding order indicator, obtaining the meta-code stream data according to the meta-code stream size field to obtain a meta-code stream data packet;

If the value of the decoding order indication identifier is the second value, the meta-code stream data is obtained according to the meta-code stream size field to obtain a meta-code stream data packet, the PID of the RTP payload of the video bit stream data is obtained, and the meta-code stream data packets are sorted according to the PID to obtain sorted meta-code stream data packets.

The method according to claim 23, wherein the meta-codestream data is processed according to the aggregate extended payload header in the payload data header of the aggregate payload, the aggregate payload header located before each meta-codestream data, and the meta-codestream size field to form a meta-codestream data packet, comprising:

If the value of the decoding order indicator is the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, obtaining the meta-codestream data according to the meta-codestream size field and the extended payload header to obtain a meta-codestream data packet;

If the value of the decoding order indication identifier is the second value, the meta-codestream data is obtained according to the meta-codestream size field and the extended payload header to obtain a meta-codestream data packet, the PID of the RTP payload of the video bitstream data is obtained, and the meta-codestream data packets are sorted according to the PID to obtain sorted meta-codestream data packets.

The method according to claim 23, wherein the processing of the meta-stream data according to the aggregate extended payload header in the payload data header of the aggregate payload to form a meta-stream data packet comprises:

If the value of the decoding order indicator is the first value or the RTP payload of the video bit stream data does not include the decoding order indicator, obtaining the meta-stream data according to the aggregate extended payload header to obtain a meta-stream data packet;

If the value of the decoding order indication identifier is the second value, the meta-code stream data is obtained according to the aggregate extended payload header to obtain a meta-code stream data packet, the PID of the RTP payload of the video bit stream data is obtained, and the meta-code stream data packets are sorted according to the PID to obtain sorted meta-code stream data packets.

The method according to claim 23, wherein the processing of the meta-stream data according to the aggregate extended payload header and the meta-stream size field in the payload data header of the aggregate payload to form a meta-stream data packet comprises:

If the value of the decoding order indicator is the first value or the RTP payload of the video bitstream data does not include the decoding order indicator, obtaining the meta-stream data according to the aggregate extended payload header and the meta-stream size field to obtain a meta-stream data packet;

If the value of the decoding order indication identifier is the second value, the meta-code stream data is obtained according to the aggregate extended payload header and the meta-code stream size field to obtain a meta-code stream data packet, the PID of the RTP payload of the video bit stream data is obtained, and the meta-code stream data packets are sorted according to the PID to obtain sorted meta-code stream data packets.

The method according to claim 23, wherein the processing of the meta-stream data according to the meta-stream size field in the payload data header of the aggregate payload to form a meta-stream data packet comprises:

If the decoding order indication identifier is a second value, the meta-code stream data is obtained according to the meta-code stream size field to obtain a meta-code stream data packet, the PID of the RTP payload of the video bit stream data is obtained, and the meta-code stream data packets are sorted according to the PID to obtain sorted meta-code stream data packets.

The method according to claim 23, wherein the processing of the meta-stream data according to the aggregate payload header in the payload data header of the aggregate payload to form a meta-stream data packet comprises:

If the value of the decoding order indicator is the first value or the RTP payload of the video bit stream data does not include the decoding order indicator, obtaining the meta-stream data according to the aggregate payload header to obtain a meta-stream data packet;

If the value of the decoding order indication identifier is the second value, the meta-stream data is obtained according to the aggregate payload header to obtain a meta-stream data packet, the PID of the RTP payload of the video bit stream data is obtained, and the meta-stream data packets are sorted according to the PID to obtain sorted meta-stream data packets.

A video code stream processing device, applied to an encoding end, comprising:

The first processing module is used to perform RTP encapsulation on the meta-stream data or the meta-stream data fragments to obtain an RTP data packet;

A first sending module, used for sending the RTP data packet to the decoding end;

or

A video code stream processing device, applied to a decoding end, comprising:

A first receiving module is used to receive an RTP data packet, wherein the RTP data packet is obtained by performing RTP encapsulation on the meta-stream data or the meta-stream data fragments;

A first processing module, used for decoding the RTP data packet;

or

A communication device, comprising: a memory, a processor, and a program stored in the memory and executable on the processor; the processor is used to read the program in the memory to implement the steps in the video code stream processing method as described in any one of claims 1 to 29.

A readable storage medium for storing a program, wherein when the program is executed by a processor, the steps in the video code stream processing method as described in any one of claims 1 to 29 are implemented.