
WO2011113315A1 - Streaming media live broadcast service system and implementation method - Google Patents

Streaming media live broadcast service system and implementation method

Info

Publication number
WO2011113315A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
time
samples
video
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2011/071090
Other languages
English (en)
French (fr)
Inventor
陈晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2010-03-17
Filing date
2011-02-18
Publication date
2011-09-22
Application filed by ZTE Corp
Publication of WO2011113315A1


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/63 Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
    • H04N21/643 Communication protocols
    • H04N21/6437 Real-time Transport Protocol [RTP]

Definitions

  • The present invention relates to the fields of communications and multimedia applications, and in particular to a streaming media live broadcast service system and an implementation method thereof.
Background
  • Wireless live streaming is an emerging multimedia service in which users watch ongoing matches, concerts, news scenes and similar content over a wireless network on a handheld mobile terminal.
  • The live streaming service uses the Real Time Streaming Protocol (RTSP) for control (start, pause, end) between the streaming media client and the live streaming server, and the Real-Time Transport Protocol (RTP) to carry live content data from the capture end to the receiving end.
  • Live streaming is a typical real-time send/receive system capable of transmitting images, sound and text.
  • Live content is presented to the user at the receiving end at substantially the same time as it is produced at the capture end. Because the receiving terminal is portable and uses a wireless network as its physical bearer, users can conveniently follow the live scene as it happens. Streaming media live broadcast services therefore have broad prospects in the broadband wireless era.
  • Streaming media services use RTP to transmit image, sound and subtitle data. RTP stipulates that each RTP channel on the network carries only one type of multimedia data, so transmitting images, sound and subtitles requires three RTP channels as bearers. Because the different media types differ in their samples, sample sizes and network transmission speeds, the receiving terminal must synchronize images, sound and subtitles in time to reproduce the live content accurately. Otherwise images, sound and subtitles fall seriously out of step, degrading the user's visual and auditory experience.
Summary of the invention
  • The main object of the present invention is to provide a streaming media live broadcast service system and an implementation method that achieve synchronized playback of audio, video and subtitles for streaming media services using the Real-Time Transport Protocol.
  • To that end, one aspect of an embodiment of the present invention provides a method for implementing a streaming media live broadcast service, comprising the following steps:
  • the live streaming server obtains the sampling time of each captured sample while capturing audio samples, video samples and subtitle samples;
  • within the respective transmission channels for audio, video and subtitles, the audio, video and subtitle samples and their sampling times are encapsulated into network packets and sent to the streaming media client;
  • the streaming media client restores, from the network packets of each channel, the sequences of audio, video and subtitle samples together with their sampling times, and reproduces the live content in order of occurrence according to the sampling times.
  • Further, the streaming media client reproduces the live content in order of occurrence according to the sampling times as follows:
  • the client plays the audio, video and subtitle sample sequences using the playback time of the audio sample as the reference time on the playback time axis, calculates the standard playback time offset from the sampling time of the audio sample and the reference time, and, for each video or subtitle sample, first calculates the ideal playback time from the sampling time and the standard playback time offset and then calculates the waiting time from the ideal playback time;
  • the standard playback time offset equals the reference time minus the sampling time of the audio sample;
  • the ideal playback time of a video or subtitle sample equals the standard playback time offset plus the sampling time of that sample; the waiting time of a video or subtitle sample equals its ideal playback time minus the current system time; if the waiting time of a video or subtitle sample is less than zero, the sample is discarded.
  • Further, the live streaming server and the streaming media client use the Real-Time Transport Protocol (RTP) for network transmission of media data, and the sampling time of a sample is located in the header of the network packet that contains the sample content.
  • Based on the embodiments, the present invention further provides a streaming media live broadcast service system, comprising:
  • a live streaming server, configured to capture audio samples, video samples and subtitle samples and at the same time obtain their sampling times, to encapsulate the audio, video and subtitle samples together with their sampling times into network packets, and to send the packets to the streaming media client through their respective transmission channels;
  • a streaming media client, configured to restore, from the network packets of each channel, the sequences of audio, video and subtitle samples with their sampling times, and to reproduce the live content in order of occurrence according to the sampling times of the audio, video and subtitle samples.
  • Further, in the system, the streaming media client plays the audio, video and subtitle sample sequences using the playback time of the audio sample as the reference time on the playback time axis; the client calculates the standard playback time offset from the sampling time of the audio sample and the reference time; for video and subtitle samples, it first calculates the ideal playback time from the sampling time and the standard playback time offset, and then calculates the waiting time from the ideal playback time.
  • The standard playback time offset equals the reference time minus the sampling time of the audio sample;
  • the ideal playback time of a video or subtitle sample equals the standard playback time offset plus the sampling time of that sample, and the waiting time of a video or subtitle sample equals its ideal playback time minus the current system time;
  • when playing the video and subtitle sample sequences, the streaming media client discards a sample if it determines that the sample's waiting time is less than zero.
  • Further, in the system, the live streaming server and the streaming media client use RTP for network transmission of media data, and the sampling time of a sample is located in the header of the network packet that contains the sample content.
  • The present invention targets streaming media live broadcast services that use RTP as the data transmission protocol. On the live streaming server, sampling times are calculated for the input audio, video and subtitle content, accurate to each individual sample. Within the RTP transmission channel of each medium, every RTP packet containing a media sample carries the sampling time of that sample. On the streaming media client, the playback time of the audio sample serves as the reference, and the playback times of the video and subtitle samples are calculated from their sampling times, so that audio, video and subtitles are played under time-synchronized control and the live scene is accurately reproduced.
  • FIG. 1 is a schematic diagram of the streaming media live broadcast system architecture according to an embodiment of the present invention;
  • FIG. 2 is a schematic diagram of the form that data packets take in a channel, using video data packets in the video channel as the representative example, according to an embodiment of the present invention;
  • FIG. 3 shows the sequences of media samples output by the streaming media client after processing the RTP channels according to an embodiment of the present invention;
  • FIG. 4 is a schematic diagram of the offsets of video samples and subtitle samples relative to the time reference, with audio samples as the time reference, according to an embodiment of the present invention;
  • FIG. 5 is a flow chart of time-synchronized control of sample playback using sample sampling times according to an embodiment of the present invention.
Detailed description
  • The streaming media live broadcast system architecture consists mainly of the following network elements: a live streaming server, configured as the device that captures, produces and publishes the live content;
  • and a streaming media client, configured as the receiving device for the live content, which receives the content published by the live streaming server over the network and then processes and plays it.
  • FIG. 1 is a schematic diagram of the streaming media live broadcast system architecture according to an embodiment of the present invention.
  • The live streaming server takes audio and video capture devices and a subtitle generating device as input. Its data capture module converts the live scene into audio samples, video samples and subtitle samples; its network processing module then encapsulates the audio, video and subtitle samples together with their sampling times into RTP network packets for network transmission; finally, the RTP network packets are transmitted over the wireless network.
  • The present invention does not limit the real-time transport protocol used; those skilled in the art can extend the disclosed technical solution to other real-time transport protocols.
  • On the streaming media client, RTP network packets received from the network are processed by the network processing module and the playback control module and restored into audio, video and subtitle sample sequences; playing these sequences reproduces the live scene for the user.
  • RTP network packets are transmitted between the streaming media client and the live streaming server using the RTP protocol.
  • The control protocol between the streaming media client and the live streaming server is RTSP, which carries control information such as the start and end of the live broadcast.
  • FIG. 2 illustrates the form of the sequence of RTP network packets containing samples in an RTP transmission channel, and the structure of an RTP network packet. More importantly, it shows how a multimedia sample is packaged together with its sampling time in an RTP network packet.
  • With this approach, the streaming media client can readily obtain the sampling time of a sample for the subsequent synchronization of audio, video and subtitles.
  • Step S102: The data capture module samples the input video data, audio data and subtitles, and records the sampling time of every audio, video and subtitle sample recorded by the capture devices.
  • The sampling time must be accurate enough to reflect the order in which the samples occurred.
  • Step S104: The network processing module encapsulates the sample content and its sampling time into the same RTP network packet according to RFC 1889.
  • The sample content forms the payload of the RTP network packet, and the sampling time is written into the Time Stamp field of the RTP header.
  • On the server side, the processed media samples become the sequence of RTP network packets transmitted over the network, as shown in FIG. 2.
  • FIG. 3 depicts the audio, video and subtitle sample sequences that the streaming media client generates after processing the received RTP packets.
  • FIG. 4 depicts the offsets of the video and subtitle sample sequences from the time-axis reference time when the sampling time of the audio samples is used as that reference.
  • The streaming media client's handling of received RTP network packets is implemented mainly by step S106: cycle through the RTP audio channel, video channel and subtitle channel and read RTP network packets from them.
  • Each RTP network packet is parsed according to RFC 1889: the sampling time is read from the RTP header and the sample content from the RTP payload.
  • FIG. 5 is a flow chart of how the streaming media client synchronizes audio, video and subtitle samples by sampling time, that is, the flow of step S108, which is performed by the client's playback control module.
  • The flow chart shows how to control the synchronization of audio, video and subtitle samples by sampling time so as to reproduce the time sequence in which they were produced at the live scene. It mainly comprises: calculating the "playback time offset" of each sample; using the "playback time offset" of the audio sample as the time reference when playing samples; and handling video and subtitle samples whose sampling times are ahead of the audio.
  • The specific steps are as follows:
  • Step 501: The playback control module cyclically reads from the network processing module the media samples delivered over the RTP channels and checks the end-of-transmission flag. If transmission has ended, the flow ends; otherwise step 502 is performed.
  • On the first pass, one audio sample, one video sample and one subtitle sample are read in. On each later pass at least one audio sample must be read in; a video or subtitle sample that is still in the sleep-wait state (see step 4 and step 6) is not read in from the RTP channel.
  • Step 502: Process the audio sample: read its sampling time, play it, and take its playback time as the "reference time" on the time axis.
  • The "standard playback time offset" is calculated from the audio sample's sampling time and playback time: standard playback time offset = reference time - sampling time of the audio sample (1)
  • Step 503: Process the video sample: read its sampling time, calculate its ideal playback time from the current "standard playback time offset", and then calculate the video sample waiting time. If the waiting time is less than 0, perform step 504; otherwise perform step 505.
  • From equation (1), the ideal playback time of the video sample is: ideal playback time of the video sample = standard playback time offset + sampling time of the video sample (2)
  • The video sample waiting time is calculated as:
  • video sample waiting time = ideal playback time of the video sample - current system time (3)
  • Step 504: If the video sample waiting time is less than zero, the video sample lags behind the audio sample in time, so the video sample is discarded (skipped, not played); then perform step 506.
  • Step 505: If the video sample waiting time is greater than zero, the video sample is ahead of the audio sample in time, so its playback is postponed by a delay equal to the video sample waiting time; if the waiting time equals zero, the video sample and the audio sample are synchronized in time, so the video sample is played directly. Then perform step 506.
  • The invention requires the clock on the sampling device or sampling system to be sufficiently accurate. The sampling times of the audio, video and subtitle samples must truly reflect the order in which the samples were produced, and the sampling time should be a linearly increasing variable. For synchronized playback, the system time of the streaming media client should be sufficiently accurate relative to the playback rate of the audio device.
  • The present invention targets streaming media live broadcast services that use RTP as the data transmission protocol. On the live streaming server, sampling times are calculated for the input audio, video and subtitle content, accurate to each individual sample. Within the RTP transmission channel of each medium, every RTP packet containing a media sample carries the sampling time of that sample, which provides the conditions for accurately reproducing the live content on the streaming media client. Because the bytes occupied by the sampling-time field are negligible compared with the bytes occupied by the media sample, carrying it does not affect the real-time performance of the media data during network transmission.
  • On the streaming media client, the playback times of the audio, video and subtitle samples are calculated by the method of the invention, so that audio, video and subtitles are played under time-synchronized control and the live scene is accurately reproduced.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

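The present invention provides a streaming media live broadcast service system and an implementation method, used to achieve synchronized playback of audio, video and subtitles for streaming media services that use a real-time transport protocol. The invention targets streaming media live broadcast services that use RTP as the data transmission protocol. On the live streaming server, sampling times are calculated for the input audio, video and subtitle content, accurate to each individual sample. Within the RTP transmission channel of each medium, every RTP packet containing a media sample carries the sampling time of that sample. On the streaming media client, the playback time of the audio sample serves as the reference, and the playback times of the video and subtitle samples are calculated from their sampling times, so that audio, video and subtitles are played under time-synchronized control and the live scene is accurately reproduced.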

Description

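Streaming media live broadcast service system and implementation method
Technical field
The present invention relates to the fields of communications and multimedia applications, and in particular to a streaming media live broadcast service system and an implementation method.
Background
In recent years, wireless data network technology has developed rapidly. As wireless network bandwidth continues to grow, data-intensive services have become feasible in wireless environments. Wireless live streaming is an emerging multimedia service in which users watch ongoing matches, concerts, news scenes and similar content over a wireless network on a handheld portable terminal. The live streaming service uses the Real Time Streaming Protocol (RTSP) for control (start, pause, end) between the streaming media client and the live streaming server, and the Real-Time Transport Protocol (RTP) to transmit live content data from the capture end to the receiving end.
Live streaming is a typical real-time send/receive system capable of transmitting images, sound and text. Live content is presented to the user at the receiving end at substantially the same time as it is actually produced at the capture end. Because the receiving terminal is portable and uses a wireless network as its physical bearer, users can conveniently follow the live scene as it happens; streaming media live broadcast services therefore have broad prospects in the broadband wireless era.
Streaming media services use the Real-Time Transport Protocol (RTP) to transmit image, sound and subtitle data. RTP stipulates that each RTP channel on the network carries only one type of multimedia data, so transmitting image, sound and subtitle data requires three RTP channels as bearers. Because the different types of multimedia data differ in their samples, sample sizes and network transmission speeds, the receiving terminal must synchronize images, sound and subtitles in time to reproduce the live content accurately. Otherwise images, sound and subtitles fall seriously out of step, degrading the user's visual and auditory experience.
Summary of the invention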
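In view of this, the main object of the present invention is to provide a streaming media live broadcast service system and an implementation method that achieve synchronized playback of audio, video and subtitles for streaming media services using a real-time transport protocol.
To achieve the above object, one aspect of an embodiment of the present invention provides a method for implementing a streaming media live broadcast service, comprising the following steps:
the live streaming server obtains the sampling time of each captured sample while capturing audio samples, video samples and subtitle samples;
within the respective transmission channels for audio, video and subtitles, the audio samples, video samples and subtitle samples and their sampling times are encapsulated into network packets and sent to the streaming media client;
the streaming media client restores, from the network packets of each channel, the sequences of audio samples, video samples and subtitle samples with their sampling times, and reproduces the live content in order of occurrence according to the sampling times of the samples.
Further, the streaming media client reproduces the live content in order of occurrence according to the sampling times of the samples as follows:
the audio, video and subtitle sample sequences are played using the playback time of the audio sample as the reference time on the playback time axis; the streaming media client calculates the standard playback time offset from the sampling time of the audio sample and the reference time; for video samples and subtitle samples, it first calculates the ideal playback time from the sample's sampling time and the standard playback time offset, and then calculates the waiting time from the ideal playback time.
Further, the standard playback time offset equals the reference time minus the sampling time of the audio sample; the ideal playback time of a video or subtitle sample equals the standard playback time offset plus the sampling time of that sample; the waiting time of a video or subtitle sample equals its ideal playback time minus the current system time; if the waiting time of a video or subtitle sample is less than zero, the sample is discarded.
Further, the live streaming server and the streaming media client use the Real-Time Transport Protocol (RTP) for network transmission of media data, and the sampling time of a sample is located in the header of the network packet that contains the sample content.
Based on the embodiments, the present invention further provides a streaming media live broadcast service system, comprising:
a live streaming server, configured to capture audio samples, video samples and subtitle samples and at the same time obtain their sampling times, to encapsulate the audio samples, video samples and subtitle samples together with their sampling times into network packets, and to send the packets to the streaming media client through their respective transmission channels;
a streaming media client, configured to restore, from the network packets of each channel, the sequences of audio samples, video samples and subtitle samples with their sampling times, and to reproduce the live content in order of occurrence according to the sampling times of the audio, video and subtitle samples.
Further, in the system, the streaming media client plays the audio, video and subtitle sample sequences using the playback time of the audio sample as the reference time on the playback time axis; the client calculates the standard playback time offset from the sampling time of the audio sample and the reference time; for video samples and subtitle samples, it first calculates the ideal playback time from the sampling time and the standard playback time offset, and then calculates the waiting time from the ideal playback time.
The standard playback time offset equals the reference time minus the sampling time of the audio sample;
the ideal playback time of a video or subtitle sample equals the standard playback time offset plus the sampling time of that sample, and the waiting time of a video or subtitle sample equals its ideal playback time minus the current system time;
when playing the video and subtitle sample sequences, the streaming media client discards a sample if it determines that the sample's waiting time is less than zero.
Further, in the system, the live streaming server and the streaming media client use RTP for network transmission of media data, and the sampling time of a sample is located in the header of the network packet that contains the sample content.
The present invention targets streaming media live broadcast services that use RTP as the data transmission protocol. On the live streaming server, sampling times are calculated for the input audio, video and subtitle content, accurate to each individual sample. Within the RTP transmission channel of each medium, every RTP packet containing a media sample carries the sampling time of that sample. On the streaming media client, the playback time of the audio sample serves as the reference, and the playback times of the video and subtitle samples are calculated from their sampling times, so that audio, video and subtitles are played under time-synchronized control and the live scene is accurately reproduced.
Brief description of the drawings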
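FIG. 1 is a schematic diagram of the streaming media live broadcast system architecture according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the form that data packets take in a channel, using video data packets in the video channel as the representative example, according to an embodiment of the present invention;
FIG. 3 shows the sequences of media samples output by the streaming media client after processing the RTP channels according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the offsets of video samples and subtitle samples relative to the time reference, with audio samples as the time reference, according to an embodiment of the present invention;
FIG. 5 is a flow chart of time-synchronized control of sample playback using sample sampling times according to an embodiment of the present invention.
Detailed description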
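To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in further detail below through embodiments and with reference to the drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the invention and do not limit it.
In the embodiments, the streaming media live broadcast system architecture consists mainly of the following network elements: a live streaming server, configured as the device that captures, produces and publishes the live content; and a streaming media client, configured as the receiving device for the live content, which receives the content published by the live streaming server over the network and then processes and plays it.
FIG. 1 is a schematic diagram of the streaming media live broadcast system architecture according to an embodiment of the present invention. As shown in FIG. 1, the live streaming server takes audio and video capture devices and a subtitle generating device as input. Its data capture module converts the live scene into audio samples, video samples and subtitle samples; its network processing module then encapsulates the audio, video and subtitle samples together with their sampling times into RTP network packets for network transmission. Finally, the RTP network packets are transmitted over the wireless network. The present invention does not limit the real-time transport protocol used; those skilled in the art can extend the disclosed technical solution to other real-time transport protocols.
On the streaming media client, RTP network packets received from the network are processed by the network processing module and the playback control module and restored into audio, video and subtitle sample sequences; playing these sequences reproduces the live scene for the user. RTP network packets are transmitted between the streaming media client and the live streaming server using the RTP protocol. The control protocol between the streaming media client and the live streaming server is RTSP, which carries control information such as the start and end of the live broadcast.
FIG. 2 illustrates the form of the sequence of RTP network packets containing samples in an RTP transmission channel, and the structure of an RTP network packet. More importantly, it shows how a multimedia sample is packaged together with its sampling time in an RTP network packet. With this approach, the streaming media client can readily obtain the sampling time of a sample for the subsequent synchronization of audio, video and subtitles.
As shown in FIG. 1 and FIG. 2, the live streaming server processes the data captured on site as follows:
Step S102: The data capture module samples the input video data, audio data and subtitles, and records the sampling time of every audio, video and subtitle sample recorded by the capture devices. The sampling time must be accurate enough to reflect the order in which the samples occurred.
Step S104: The network processing module encapsulates the sample content and its sampling time into the same RTP network packet according to RFC 1889. The sample content forms the payload of the RTP network packet, and the sampling time is written into the Time Stamp field of the RTP header.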
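RTP network packet header format
[Figure imgf000008_0001: table of the RTP network packet header format; image not reproduced in this text]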
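The encapsulation in step S104 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it builds the 12-byte fixed RTP header and writes the sampling time into the 32-bit Time Stamp field. The function name `pack_rtp_packet`, the payload type 96, the SSRC value and the sample bytes are assumptions chosen for the example.

```python
import struct

RTP_VERSION = 2      # RTP version number
DYNAMIC_PT = 96      # hypothetical dynamic payload type for this sketch


def pack_rtp_packet(sample: bytes, sampling_time: int, seq: int,
                    ssrc: int, payload_type: int = DYNAMIC_PT) -> bytes:
    """Return one RTP packet: 12-byte fixed header plus the sample as payload."""
    byte0 = RTP_VERSION << 6         # V=2, P=0, X=0, CC=0
    byte1 = payload_type & 0x7F      # M=0, 7-bit payload type
    header = struct.pack(
        "!BBHII",                    # network byte order, 1+1+2+4+4 bytes
        byte0,
        byte1,
        seq & 0xFFFF,                # 16-bit sequence number
        sampling_time & 0xFFFFFFFF,  # 32-bit Time Stamp field
        ssrc & 0xFFFFFFFF,           # 32-bit synchronization source id
    )
    return header + sample


# Hypothetical video sample with sampling time 90000.
packet = pack_rtp_packet(b"frame-bytes", sampling_time=90000, seq=1, ssrc=0x1234)
```

Because the sampling time travels in the header of the same packet as the sample content, the receiver needs no side channel to recover it.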
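On the server side, the processed media samples become the sequence of RTP network packets transmitted over the network, as shown in FIG. 2.
The inventive concept and its implementation on the streaming media client are described in detail below with reference to FIG. 3, FIG. 4 and FIG. 5. FIG. 3 depicts the audio, video and subtitle sample sequences that the streaming media client generates after processing the received RTP packets. FIG. 4 depicts the offsets of the video and subtitle sample sequences from the time-axis reference time when the sampling time of the audio samples is used as that reference.
The streaming media client's handling of received RTP network packets is implemented mainly by step S106: cycle through the RTP audio channel, video channel and subtitle channel and read RTP network packets from them. Each RTP network packet is parsed according to RFC 1889: the sampling time is read from the RTP header and the sample content from the RTP payload.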
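The parsing in step S106 might look like the sketch below, which reads the sampling time from the Time Stamp field and treats the rest of the packet as the sample content. The names `MediaSample` and `parse_rtp_packet` are hypothetical, and the sketch assumes packets without padding or header extension; it rejects those rather than guessing.

```python
import struct
from typing import NamedTuple


class MediaSample(NamedTuple):
    sampling_time: int   # value of the RTP Time Stamp field
    content: bytes       # RTP payload, i.e. the sample content


def parse_rtp_packet(packet: bytes) -> MediaSample:
    """Extract the sampling time and the sample content from one RTP packet."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-byte fixed RTP header")
    byte0, _byte1, _seq, timestamp, _ssrc = struct.unpack("!BBHII", packet[:12])
    if byte0 >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    if byte0 & 0x30:
        # Padding (P) or header extension (X) bit set: outside this sketch.
        raise ValueError("padding/extension not handled in this sketch")
    payload_start = 12 + 4 * (byte0 & 0x0F)   # skip CSRC identifiers, if any
    if len(packet) < payload_start:
        raise ValueError("truncated CSRC list")
    return MediaSample(timestamp, packet[payload_start:])
```

A client would call this once per packet on each of the audio, video and subtitle channels and hand the resulting samples to the playback control module.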
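FIG. 5 is a flow chart of how the streaming media client synchronizes audio, video and subtitle samples by sampling time, that is, the flow of step S108, which is performed by the client's playback control module. The flow chart shows how to control the synchronization of audio, video and subtitle samples by sampling time so as to reproduce the time sequence in which they were produced at the live scene. It mainly comprises: calculating the "playback time offset" of each sample; using the "playback time offset" of the audio sample as the time reference when playing samples; and handling video and subtitle samples whose sampling times are ahead of the audio. The specific steps are as follows: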
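Step 501: The playback control module cyclically reads from the network processing module the media samples delivered over the RTP channels and checks the end-of-transmission flag. If transmission has ended, the flow ends; otherwise step 502 is performed.
On the first pass, one audio sample, one video sample and one subtitle sample are read in. On each later pass at least one audio sample must be read in; a video or subtitle sample that is still in the sleep-wait state (see step 4 and step 6) is not read in from the RTP channel.
Step 502: Process the audio sample: read its sampling time, play it, and take its playback time as the "reference time" on the time axis. The "standard playback time offset" is calculated from the audio sample's sampling time and playback time, as follows: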
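standard playback time offset = reference time - sampling time of the audio sample    (1)
Step 503: Process the video sample: read its sampling time, calculate its ideal playback time from the current "standard playback time offset", and then calculate the video sample waiting time. If the video sample waiting time is less than 0, perform step 504; otherwise perform step 505.
From equation (1), the ideal playback time of the video sample is calculated as:
ideal playback time of the video sample = standard playback time offset + sampling time of the video sample    (2)
From the ideal playback time of the video sample and the current system time, the video sample waiting time is calculated as:
video sample waiting time = ideal playback time of the video sample - current system time    (3)
Step 504: If the video sample waiting time is less than zero, the video sample lags behind the audio sample in time, so the video sample is discarded (skipped, not played); then perform step 506.
Step 505: If the video sample waiting time is greater than zero, the video sample is ahead of the audio sample in time, so its playback is postponed by a delay equal to the video sample waiting time; if the video sample waiting time equals zero, the video sample and the audio sample are synchronized in time, so the video sample is played directly. Then perform step 506.
Step 506: Process the subtitle sample: read its sampling time, calculate its ideal playback time from the current "standard playback time offset", and then calculate the subtitle sample waiting time. If the subtitle sample waiting time is less than 0, perform step 507; otherwise perform step 508.
ideal playback time of the subtitle sample = standard playback time offset + sampling time of the subtitle sample
subtitle sample waiting time = ideal playback time of the subtitle sample - current system time
Step 507: If the subtitle sample waiting time is less than zero, the subtitle sample lags behind the audio sample in time, so the subtitle sample is discarded (skipped, not played); then perform step 501.
Step 508: If the subtitle sample waiting time is greater than zero, the subtitle sample is ahead of the audio sample in time, so its playback is postponed by a delay equal to the subtitle sample waiting time; if the subtitle sample waiting time equals zero, the subtitle sample and the audio sample are synchronized in time, so the subtitle sample is played directly. Then perform step 501.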
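The decision logic of steps 502 to 508 can be condensed into a short sketch. It applies equations (1) to (3) and returns whether a video or subtitle sample is dropped, delayed or played. The names `Action`, `standard_offset` and `schedule` are hypothetical, and the sketch assumes that sampling times and system time are integers in the same unit (for example milliseconds), which the text does not specify.

```python
from enum import Enum
from typing import Tuple


class Action(Enum):
    DROP = "drop"      # steps 504 / 507: the sample lags behind the audio
    DELAY = "delay"    # steps 505 / 508: the sample is ahead of the audio
    PLAY = "play"      # steps 505 / 508: the sample is in sync


def standard_offset(reference_time: int, audio_sampling_time: int) -> int:
    """Equation (1): offset between the playback clock and the sampling clock."""
    return reference_time - audio_sampling_time


def schedule(sampling_time: int, offset: int, now: int) -> Tuple[Action, int]:
    """Decide what to do with one video or subtitle sample."""
    ideal_time = offset + sampling_time   # equation (2)
    waiting_time = ideal_time - now       # equation (3)
    if waiting_time < 0:
        return Action.DROP, waiting_time
    if waiting_time > 0:
        return Action.DELAY, waiting_time
    return Action.PLAY, 0


# An audio sample with sampling time 1000 starts playing at system time 50000.
offset = standard_offset(reference_time=50000, audio_sampling_time=1000)
on_time = schedule(sampling_time=1010, offset=offset, now=50010)  # PLAY, 0
early = schedule(sampling_time=1040, offset=offset, now=50010)    # DELAY, 30
late = schedule(sampling_time=960, offset=offset, now=50010)      # DROP, -50
```

The same function serves video and subtitles because both are scheduled against the single offset derived from the audio sample.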
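The invention requires the clock on the sampling device or sampling system to be sufficiently accurate. The sampling times of the audio, video and subtitle samples must truly reflect the order in which the samples were produced, and the sampling time should be a linearly increasing variable. For synchronized playback, the system time of the streaming media client should be sufficiently accurate relative to the playback rate of the audio device.
The present invention targets streaming media live broadcast services that use RTP as the data transmission protocol. On the live streaming server, sampling times are calculated for the input audio, video and subtitle content, accurate to each individual sample. Within the RTP transmission channel of each medium, every RTP packet containing a media sample carries the sampling time of that sample, which provides the conditions for accurately reproducing the live content on the streaming media client. Because the bytes occupied by the sampling-time field are negligible compared with the bytes occupied by the media sample, carrying it does not affect the real-time performance of the media data during network transmission. On the streaming media client, the playback times of the audio, video and subtitle samples are calculated by the method of the invention, so that audio, video and subtitles are played under time-synchronized control and the live scene is accurately reproduced.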
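To put a rough number on the overhead claim, the sketch below computes the share of one RTP packet taken up by the 32-bit Time Stamp field. The helper name `timestamp_share` and the 1000-byte payload size are assumptions chosen for illustration; the patent gives no sample sizes.

```python
RTP_FIXED_HEADER_BYTES = 12   # fixed RTP header without CSRC entries
RTP_TIMESTAMP_BYTES = 4       # the Time Stamp field is 32 bits wide


def timestamp_share(payload_bytes: int) -> float:
    """Fraction of one RTP packet occupied by the Time Stamp field."""
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    return RTP_TIMESTAMP_BYTES / (RTP_FIXED_HEADER_BYTES + payload_bytes)


share = timestamp_share(1000)   # hypothetical 1000-byte media sample: 4 / 1012
```

For a payload of that size the field accounts for roughly 0.4% of the packet, and the Time Stamp field is part of the fixed RTP header in any case.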
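The above are merely preferred embodiments of the present invention and do not limit it; those skilled in the art may make various modifications and changes to the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.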

Claims

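1. A method for implementing a streaming media live broadcast service, characterized in that the method comprises:
a live streaming server obtaining the sampling time of each captured sample while capturing audio samples, video samples and subtitle samples;
within the respective transmission channels for audio, video and subtitles, encapsulating the audio samples, video samples and subtitle samples together with their sampling times into network packets and sending them to a streaming media client;
the streaming media client restoring, from the network packets of each channel, the sequences of audio samples, video samples and subtitle samples with their sampling times, and reproducing the live content in order of occurrence according to the sampling times of the samples.
2. The method according to claim 1, characterized in that the streaming media client reproducing the live content in order of occurrence according to the sampling times of the samples specifically comprises:
playing the audio, video and subtitle sample sequences using the playback time of the audio sample as the reference time on the playback time axis; the streaming media client calculating the standard playback time offset from the sampling time of the audio sample and the reference time; and, for video samples and subtitle samples, first calculating the ideal playback time from the sample's sampling time and the standard playback time offset, and then calculating the waiting time from the ideal playback time.
3. The method according to claim 2, characterized in that
the standard playback time offset equals the reference time minus the sampling time of the audio sample;
the ideal playback time of a video or subtitle sample equals the standard playback time offset plus the sampling time of that sample, and the waiting time of a video or subtitle sample equals its ideal playback time minus the current system time;
if the waiting time of a video or subtitle sample is less than zero, the sample is discarded.
4. The method according to claim 1, characterized in that the live streaming server and the streaming media client use the Real-Time Transport Protocol (RTP) for network transmission of media data.
5. The method according to claim 4, characterized in that the sampling time of a sample is located in the header of the network packet that contains the sample content.
6. A streaming media live broadcast service system, characterized in that the system comprises:
a live streaming server, configured to capture audio samples, video samples and subtitle samples and at the same time obtain their sampling times, to encapsulate the audio samples, video samples and subtitle samples together with their sampling times into network packets, and to send the packets to a streaming media client through their respective transmission channels;
the streaming media client, configured to restore, from the network packets of each channel, the sequences of audio samples, video samples and subtitle samples with their sampling times, and to reproduce the live content in order of occurrence according to the sampling times of the audio, video and subtitle samples.
7. The system according to claim 6, characterized in that the streaming media client plays the audio, video and subtitle sample sequences using the playback time of the audio sample as the reference time on the playback time axis; the streaming media client calculates the standard playback time offset from the sampling time of the audio sample and the reference time; and, for video samples and subtitle samples, it first calculates the ideal playback time from the sampling time and the standard playback time offset, and then calculates the waiting time from the ideal playback time.
8. The system according to claim 7, characterized in that
the standard playback time offset equals the reference time minus the sampling time of the audio sample;
the ideal playback time of a video or subtitle sample equals the standard playback time offset plus the sampling time of that sample, and the waiting time of a video or subtitle sample equals its ideal playback time minus the current system time;
when playing the video and subtitle sample sequences, the streaming media client discards a sample if it determines that the sample's waiting time is less than zero.
9. The system according to claim 6, characterized in that the live streaming server and the streaming media client use RTP for network transmission of media data.
10. The system according to claim 9, characterized in that the sampling time of a sample is located in the header of the network packet that contains the sample content.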
PCT/CN2011/071090 2010-03-17 2011-02-18 Streaming media live broadcast service system and implementation method Ceased WO2011113315A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010139670.3 2010-03-17
CN2010101396703A CN102196319A (zh) 2010-03-17 2010-03-17 Streaming media live broadcast service system and implementation method

Publications (1)

Publication Number Publication Date
WO2011113315A1 (zh) 2011-09-22

Family

ID=44603589

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/071090 Ceased WO2011113315A1 (zh) 2010-03-17 2011-02-18 Streaming media live broadcast service system and implementation method

Country Status (2)

Country Link
CN (1) CN102196319A (zh)
WO (1) WO2011113315A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104185081B (zh) * 2013-05-22 2019-04-19 韩华泰科株式会社 Method for displaying time in images played using Real-time Transport Protocol packets

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
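CN102412877B (zh) * 2011-12-23 2014-05-28 上海山景集成电路股份有限公司 Non-audio data transmission method based on the A2DP protocol
CN102630017B (zh) * 2012-04-10 2014-03-19 中兴通讯股份有限公司 Method and system for subtitle synchronization in mobile multimedia broadcasting
CN103856828A (zh) * 2012-11-29 2014-06-11 北京千橡网景科技发展有限公司 Video data transmission method and device
JP2015023575A (ja) * 2013-07-19 2015-02-02 Panasonic Intellectual Property Corporation of America Transmission method, reception method, transmission device and reception device
CN104918064B (zh) * 2015-05-27 2019-07-05 努比亚技术有限公司 Method and device for fast video playback on a mobile terminal
CN105187688B (zh) * 2015-09-01 2018-03-23 福建富士通信息软件有限公司 Method and system for synchronizing real-time video and audio captured by a mobile phone
CN105959772B (zh) * 2015-12-22 2019-04-23 合一网络技术(北京)有限公司 Method, device and system for real-time synchronized display and matching of streaming media and subtitles
CN107872678B (zh) * 2016-09-26 2019-08-27 腾讯科技(深圳)有限公司 Live-broadcast-based text display method and device, and live broadcast method and device
CN108076349B (zh) * 2016-11-11 2021-03-19 铂渊信息技术(上海)有限公司 Interactive network live broadcast method, system and electronic device
CN108174264B (zh) * 2018-01-09 2020-12-15 武汉斗鱼网络科技有限公司 Synchronized lyrics display method, system, device, medium and equipment
CN111654672A (zh) * 2020-06-01 2020-09-11 赛特斯信息科技股份有限公司 Method and system for generating alarm pre-trigger recording files
CN113490007A (zh) * 2021-07-02 2021-10-08 广州博冠信息科技有限公司 Live broadcast processing system and method, storage medium and electronic device
CN119562114A (zh) * 2023-09-04 2025-03-04 北京字跳网络技术有限公司 Audio-based text processing method, apparatus, device and storage medium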

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0965303A (ja) * 1995-08-28 1997-03-07 Canon Inc 映像音声同期方法及び装置
CN1933594A (zh) * 2005-09-14 2007-03-21 王世刚 多路音视频数据网络传输与同步播放的方法
US20070180137A1 (en) * 2006-01-28 2007-08-02 Ravi Rajapakse Streaming Media System and Method
CN101282482A (zh) * 2008-05-04 2008-10-08 中兴通讯股份有限公司 视频数据与音频数据同步播放的装置、系统和方法

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101127917B (zh) * 2007-09-06 2010-07-14 中兴通讯股份有限公司 一种互联网流媒体格式音视频同步的方法及其系统
CN101123611B (zh) * 2007-09-25 2012-05-23 中兴通讯股份有限公司 一种流媒体数据的发送方法
CN101635848B (zh) * 2008-07-22 2013-08-07 北大方正集团有限公司 一种视频文件的编辑方法和装置

Also Published As

Publication number Publication date
CN102196319A (zh) 2011-09-21

Similar Documents

Publication Publication Date Title
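WO2011113315A1 (zh) Streaming media live broadcast service system and implementation method
CN101282482B (zh) Apparatus, system and method for synchronized playback of video data and audio data
CN104079870B (zh) Video surveillance method and system with single-channel video and multi-channel audio
WO2008055420A1 (en) A synchronizing method between different medium streams and a system
WO2008061416A1 (en) A method and a system for supporting media data of various coding formats
CN103546662A (zh) Audio and video synchronization method in a network monitoring system
CN105308974A (zh) Transmission apparatus, transmission method, reproduction apparatus, reproduction method and reception apparatus
EP2667625A2 (en) Apparatus and method for transmitting multimedia data in a broadcast system
CN101938606A (zh) Multimedia data push method, system and device
CN101616060B (zh) Method and system for switching an IPTV terminal from multicast to unicast
CN100450163C (zh) Method for synchronized video and audio playback in mobile multimedia broadcasting
CN111447396B (zh) Audio and video transmission method, apparatus, electronic device and storage medium
CN101202613B (zh) Terminal for clock synchronization
CN1960485B (zh) Method for synchronized video and audio playback in mobile multimedia broadcasting
WO2008028367A1 (fr) Method for implementing multimedia audio tracks for a mobile multimedia broadcasting system
CN103269448A (zh) Method for audio and video synchronization based on an RTP/RTCP feedback early-warning algorithm
CN1972441A (zh) Method for streaming media storage and service
CN101267572B (zh) Method and device for program stream conversion
JP5092493B2 (ja) Reception program, reception device, communication system and communication method
JP7517389B2 (ja) Transmission method and reception device
CN1988667A (zh) Method for clock synchronization in a broadcast network
CN101022558A (zh) SAF-based source adaptation
JP4561240B2 (ja) Data processing device, data processing method and data transmission/reception system
CN1960509B (zh) Method for implementing error isolation when transmitting mobile multimedia broadcast media data
KR101112454B1 (ko) Digital multimedia control device over a wireless network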

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11755632

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11755632

Country of ref document: EP

Kind code of ref document: A1