CN108810575B

CN108810575B - A method and device for sending target video

Info

Publication number: CN108810575B
Application number: CN201710309537.XA
Authority: CN
Inventors: 张龙; 辛安民; 余翔; 黄凡夫
Original assignee: Hangzhou Hikvision Digital Technology Co Ltd
Current assignee: Guangdong Gaohang Intellectual Property Operation Co ltd
Priority date: 2017-05-04
Filing date: 2017-05-04
Publication date: 2021-10-29
Anticipated expiration: 2037-05-04
Also published as: CN108810575A

Abstract

The present disclosure relates to a method and device for sending target video, belonging to the field of electronic technology. The method includes: acquiring a multi-track video file of a target video; converting the multi-track video file into a video file and multiple audio-track audio files, wherein the video file includes the multi-track video file The video data in, each audio track audio file respectively contains the audio data corresponding to the different audio track types contained in the multi-audio track video file; When receiving the acquisition request to the target video sent by the terminal, the acquisition request When the first audio track type is carried in the video file, the video file and the audio track audio file corresponding to the first audio track type are sent to the terminal. With the present disclosure, traffic waste can be prevented.

Description

Method and device for sending target video

Technical Field

The present disclosure relates to the field of electronic technologies, and in particular, to a method and an apparatus for transmitting a target video.

Background

With the development of electronic technology, various terminals are widely used, and the types and functions of application programs on the corresponding terminals are more and more abundant. A video playing application is a very common application.

Many video files stored in servers today are multi-track video files (where a multi-track video file includes audio data for multiple track types). When a user wants to play a certain video, a playing request corresponding to the video can be sent to the server, and correspondingly, after the server receives the playing request corresponding to the video, a multi-track video file containing video data and various audio track audio data corresponding to the video can be obtained and sent to the terminal. After receiving a multi-track video file containing video data and a plurality of audio track audio data, the terminal can play the video data and the audio data corresponding to the target audio track type according to the target audio track type set by a user.

In carrying out the present disclosure, the inventors found that at least the following problems exist:

for a multi-track video file of a certain video, a user often only plays audio data corresponding to a selected track type, but if the processing method is adopted, the server sends audio data corresponding to all track types to the terminal when the server receives a play request, that is, audio data of track types not selected by the user are also sent to the terminal, thereby causing traffic waste.

Disclosure of Invention

In order to overcome the problem of traffic waste in the related art, the present disclosure provides a method and apparatus for transmitting a target video.

According to a first aspect of embodiments of the present disclosure, there is provided a method of transmitting a target video, the method including:

acquiring a multi-track video file of a target video;

converting the multi-track video file into a video file and a plurality of track audio files, wherein the video file comprises video data in the multi-track video file, and each track audio file comprises audio data corresponding to different track types contained in the multi-track video file;

and when an acquisition request for the target video sent by a terminal is received, wherein the acquisition request carries a first audio track type, sending the video file and an audio track file corresponding to the first audio track type to the terminal.

Optionally, the converting the multi-track video file into a video file and a plurality of track audio files includes:

and acquiring each video frame to form a video file and acquiring each audio data packet corresponding to each audio track to form a plurality of audio track audio files according to the offset information of each video frame and each audio data packet corresponding to each audio track in the multi-audio track video file.

Optionally, when an acquisition request for the target video sent by a terminal is received, where the acquisition request carries a first audio track type, sending the video file and an audio track file corresponding to the first audio track type to the terminal includes:

when an acquisition request for the target video sent by a terminal is received, wherein the acquisition request carries a first audio track type, the video frames contained in the video file and the audio data packets contained in the audio track audio file corresponding to the first audio track type are sent to the terminal in sequence according to the playing sequence of each video frame contained in the video file and each audio data packet contained in the first audio track audio file.

Optionally, when an acquisition request for the target video sent by a terminal is received, where the acquisition request carries a first audio track type, according to a playing sequence of each video frame included in the video file and each audio data packet included in the first audio track audio file, sequentially sending the video frame included in the video file and the audio data packet included in the audio track audio file corresponding to the first audio track type to the terminal, includes:

when an acquisition request for the target video sent by a terminal is received, wherein the acquisition request carries a first audio track type, acquiring the playing sequence of each video frame contained in the video file and each audio data packet contained in the first audio track audio file recorded in a file header of the video file;

and sequentially sending video frames contained in the video file and audio data packets contained in the audio track file corresponding to the first audio track type to the terminal according to the playing sequence.

Optionally, the method further includes:

in the process of sending the video file and the audio track file corresponding to the first audio track type to the terminal, when receiving an audio track switching notification carrying a second audio track type and a first playing position sent by the terminal, stopping sending the audio track file corresponding to the first audio track type to the terminal, and sending audio data contained in the audio track file corresponding to the second audio track type with a playing position behind the first playing position to the terminal;

the first playing position is used for representing the playing position of the audio data to be played when the terminal receives an audio track switching instruction input by a user.

Optionally, the method further includes:

in the process of sending the video file and the audio file of the audio track corresponding to the first audio track type to the terminal, when a positioning playing request carrying a second playing position and sent by the terminal is received, sending video data contained in the video file and audio data contained in the audio file of the audio track corresponding to the first audio track type, the playing position of which is behind the second playing position, to the terminal;

the second playing position is used for representing the playing position of the video data to be played when the terminal receives a positioning playing instruction input by a user.

According to a first aspect of embodiments of the present disclosure, there is provided an apparatus for transmitting a target video, the apparatus including:

the acquisition module is used for acquiring a multi-track video file of a target video;

the conversion module is used for converting the multi-track video file into a video file and a plurality of track audio files, wherein the video file comprises video data in the multi-track video file, and each track audio file comprises audio data corresponding to different track types contained in the multi-track video file;

and the sending module is used for sending the video file and the audio track file corresponding to the first audio track type to the terminal when receiving an acquisition request of the target video sent by the terminal, wherein the acquisition request carries the first audio track type.

Optionally, the conversion module is configured to:

Optionally, the sending module is configured to:

Optionally, the sending module is further configured to:

The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:

in the embodiment of the disclosure, the server may convert a multi-track video file of the target video into a video file only containing video data and a plurality of track audio files respectively containing audio data corresponding to different track types contained in the multi-track video file, and further, when an acquisition request for the target video, which is sent by the terminal and carries the first track type, is received, the server may send the video file obtained in advance and the track audio file corresponding to the first track type to the terminal. Therefore, when a user wants to play a certain video, the server can only send the video file corresponding to the video and the audio data corresponding to the specific audio track type to the terminal of the user, and the audio data corresponding to all the audio track types do not need to be sent to the terminal, so that the waste of flow can be prevented.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. In the drawings:

FIG. 1 is a flow diagram illustrating a method of transmitting a target video in accordance with an exemplary embodiment;

FIG. 2 is a system architecture diagram illustrating an exemplary embodiment;

FIG. 3 is a schematic diagram illustrating an apparatus for transmitting a target video in accordance with an illustrative embodiment;

fig. 4 is a schematic diagram illustrating a server architecture according to an example embodiment.

With the foregoing drawings in mind, certain embodiments of the disclosure have been shown and described in more detail below. These drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

An exemplary embodiment of the present disclosure provides a method for transmitting a target video, which may be used in a server, wherein the server may be a background server of a video playing application. The server may have a processor for converting a multi-track video file to a video file and associated processing of a plurality of audio track audio files, a memory for storing data required and generated during the processing described below, and a transceiver for receiving and transmitting data.

The process flow shown in fig. 1 will be described in detail below with reference to the embodiments, and the contents may be as follows:

in step 101, a multi-track video file of a target video is obtained.

The target video can be any video pre-stored by the server, and the multi-track video file can be a file containing both video data of the target video and audio data corresponding to multiple track types.

In an alternative embodiment, a multi-track video file includes audio data for multiple track types, e.g., one track type for each language, e.g., a multi-track video file includes chinese audio data, english audio data, german audio data; as another example, a source sound is of a soundtrack type, e.g., a multi-track video file includes female treble audio data, male bass audio data, piano audio data.

In an alternative embodiment, the server may have pre-stored therein multi-track video files of multiple videos. The server may retrieve a multi-track video file of any video stored therein (which may be referred to as a target video) for subsequent processing by the server of the multi-track video of the target video. Specifically, the server may store a conversion period in advance, and each time the preset conversion period passes, the server may obtain a multi-track video file of a video received in the current conversion period (the video received in the current conversion period is the target video).

In step 102, a multi-track video file is converted into a video file and a plurality of audio track audio files, wherein the video file contains video data in the multi-track video file, and each audio track audio file contains audio data corresponding to different audio track types contained in the multi-track video file.

In an optional embodiment, after a multi-track video file of a target video is acquired, a server may convert the acquired multi-track video file into a video file and multiple track files, where the video file includes video data in the multi-track video file and does not include audio data corresponding to different track types included in the multi-track video file, that is, the video file only includes video data and does not include any audio data included in the multi-track video file, and each track audio file includes audio data corresponding to one track type included in the multi-track video file. That is, after the multi-track video file is acquired, the server may divide the video data in the multi-track video file and the audio data corresponding to different track types, and store the divided video data and audio data in different files. Each track audio file may be an audio file of an encoding type g.7xx (such as g.711, g.722, etc.), and the properties of each track audio file may be the same, such as the encoding type, sample rate, bit rate, number of channels, sample bit, frame time stamp, packing frame length, etc. of each track audio file. The video file may be an MP4 (where MP4 is a video format name) video file.

Specifically, the multi-track video file may store each video frame and each audio data packet according to a playing sequence of each video frame and each audio data packet, and the video file and the audio file may also be stored according to the playing sequence of the video frames and the playing sequence of the audio data packets, respectively, where the video frame is video data of a minimum unit and the audio data packet is audio data of the minimum unit.

The Data in the multi-track video file and the Data in the converted video file may be payload Data in an MDAT (Media Data) encapsulation structure, wherein the MDAT includes a header part and a body part (the body part may be payload Data of the MDAT), and the Data in the multi-track video file and the Data in the converted video file may be Data in the body part. According to different storage modes of data in the MDAT, the video data in the converted video file can be stored in a plurality of storage modes, and two feasible storage modes are provided as follows:

the first method is as follows: the video file only contains video data and does not retain the position information of the audio data packets, for example, the multi-track video file contains two track types of audio data (track 1 and track 2, respectively), and the data of the MDAT in the multi-track video file may be: video frame 1, audio data packet 1 of track 2, video frame 2, audio data packet 2 of track 1, audio data packet 2 … of track 2, video frame n, audio data packet n of track 1, audio data packet n … of track 2, wherein the playing sequence is video frame 1, audio data packet 1 (audio data packet 1 of track 1 or audio data packet 1 of track 2, i.e. the playing sequence corresponding to audio data packet 1 of track 1, audio data packet 1 of track 2 is the same), video frame 2, audio data packet 2 …, video frame n, audio data packet n. The data of the MDAT in the converted video file may be sequentially: video frame 1, video frame 2 … video frame n, the data stored in the audio file of track 1 may be in turn: audio data package 1 of track 1, audio data package 2 … of track 1, audio data package n of track 1, the data stored in the audio file of track 2 may be in turn: audio packet 1 of track 2, audio packet 2 of track 2 … audio packet n of track 2.

The second method is as follows: in addition to the video data, the video file may also keep a position corresponding to the audio data packet (the position may be recorded as null data IDLE, indicating that the position is null), and does not contain the audio data itself, in this case, the video file may be called a single track shadow file, for example, the data stored in the multi-track video file may be: video frame 1, audio data packet 1 of track 2, video frame 2, audio data packet 2 of track 1, audio data packet 2 … video frame n of track 2, audio data packet n of track 1, audio data packet n … of track 2, wherein the playing sequence is video frame 1, audio data packet 1, video frame 2, audio data packet 2 … video frame n, audio data packet n. The data stored in the converted video file may be in turn: video frame 1, null data, video frame 2, null data …, video frame n, and null data, where the null data may be data indicating that the data at the location is not audio data itself, and the location recorded as null data may correspond to a corresponding audio data packet in the audio file of track 1 or a corresponding audio data packet in the audio file of track 2.

Alternatively, the server may convert a multi-track video file into one video file and a plurality of audio track audio files as follows: and acquiring each video frame to form a video file and acquiring each audio data packet corresponding to each audio track to form a plurality of audio track audio files according to the offset information of each video frame and each audio data packet corresponding to each audio track in the multi-audio track video file.

In an alternative embodiment, the multi-track video file may include location information of each video frame and each audio packet corresponding to each audio track, and the location information is represented by a location offset. After the server acquires the multi-track video file, the video file can be obtained according to the position offset of each video frame and each audio data packet corresponding to each audio track, and all the video frames and all the audio data packets of each audio track can be acquired to obtain the audio file of each audio track.

In step 103, when an acquisition request for a target video sent by a terminal is received, where the acquisition request carries a first audio track type, a video file and an audio track file corresponding to the first audio track type are sent to the terminal.

In an alternative embodiment, the user may view a video through a video playing application installed on the terminal. When a user wants to watch or download a target video online, an acquisition request for the target video can be sent to a server by operating a trigger terminal, wherein the acquisition request can carry a user-selected or default audio track type (which can be called as a first audio track type). Correspondingly, the server may receive an acquisition request carrying a first audio track type sent by the terminal, further may acquire a video file corresponding to the target video and an audio track file corresponding to the first audio track type, and further send the video file and the audio track file corresponding to the first audio track type to the terminal, as shown in fig. 2.

Optionally, the server may sequentially send each video frame and each audio data packet to the terminal according to the playing sequence, and correspondingly, the processing procedure in step 103 may be as follows: when an acquisition request for a target video sent by a terminal is received, wherein the acquisition request carries a first audio track type, video frames contained in the video file and audio data packets contained in an audio track audio file corresponding to the first audio track type are sent to the terminal in sequence according to the playing sequence of each video frame contained in the video file and each audio data packet contained in the first audio track audio file.

In an alternative embodiment, after the server converts a multi-track video file into a video file and a plurality of audio track audio files, the playing sequence of each video frame and each audio data packet can be recorded. In this case, when receiving an acquisition request carrying a first audio track type sent by the terminal, the server may sequentially acquire a video frame from the video file, acquire an audio data packet from the audio track file corresponding to the first audio track type, and send the acquired video frame or audio data packet to the terminal according to the playing sequence of the video frame and audio data packet. For example, after the server receives an acquisition request sent by the terminal, the server may first acquire a video frame 1 from a video file and send the video frame 1 to the terminal, then acquire an audio data packet 1 from a track audio file corresponding to a first track type and send the audio data packet to the terminal, then acquire a video frame 2 from the video file and send the video frame 2 to the terminal, and so on until an audio data packet n included in a track audio file corresponding to the first track type is sent to the terminal.

Optionally, the playing sequence of each video frame and each audio data packet may be recorded in a file header of the video file, and accordingly, the processing procedure may be as follows: when an acquisition request for a target video sent by a terminal is received, wherein the acquisition request carries a first audio track type, acquiring the playing sequence of each video frame contained in a video file recorded in a file header of the video file and each audio data packet contained in an audio file of the first audio track; and according to the playing sequence, sequentially sending video frames contained in the video file and audio data packets contained in the audio track file corresponding to the first audio track type to the terminal.

In an alternative embodiment, after the server converts a multi-track video file into a video file and a plurality of audio track audio files, the playing sequence of each video frame and each audio data packet can be recorded in the file header of the video file. In this way, when the server receives an acquisition request for a target video sent by the terminal, the server may acquire a playing sequence of each video frame contained in the video file recorded in the file header of the video file and each audio data packet contained in the first audio track audio file, and further may sequentially send the video frame contained in the video file and the audio data packet contained in the audio track audio file corresponding to the first audio track type to the terminal according to the playing sequence. Specifically, index information may be recorded in a header of the video file, that is, a playing sequence of each video frame and each audio data packet and corresponding position information may be recorded, where a data type (video data type or audio data type) and a corresponding relationship between the playing sequence and the position information may be stored in the header, and a corresponding relationship between the data type and the position information may also be sequentially stored in the header from front to back according to the playing sequence, for example, as shown in table 1, the position information is represented by a position Offset, and the index information is stored in the header by an STCO (Chunk Offset of sample) encapsulation structure, where the STCO encapsulation structure may include a header portion and a body portion, and the index information may be stored in the body portion of the STCO encapsulation structure.

TABLE 1

After receiving an acquisition request sent by a terminal, a server can acquire the playing sequence of each video frame and each audio data packet stored in a file header of a video file, and correspondingly acquire the corresponding video frame or audio data packet from the video file or audio track audio file according to the data type of data in each playing sequence, wherein the position offsets of the audio data packets in the same playing sequence in each audio track audio file are the same because the attributes of each audio track audio file are the same.

For the first case of the storage manner of the video file, the position information corresponding to the video data type recorded in the header of the video file may be a position offset of the corresponding video frame in the video file, and the position information corresponding to the audio data type may be a position offset of the corresponding audio data packet in each audio track and audio file. When the server receives an acquisition request carrying a first audio track type sent by the terminal, the server can judge the data type of data at the position corresponding to each position offset according to the corresponding relation between the data type and the position offset stored in the file header of the video file, if the data type is the video data type, the server can acquire a corresponding video frame from the position corresponding to the position offset in the video file, and if the data type is the audio data type, the server can acquire a corresponding audio data packet from the position corresponding to the position offset in the audio track audio file corresponding to the first audio track type. For example, data stored in a file of a video file is shown in table 1, after receiving an acquisition request carrying a first audio track type sent by a terminal, a server may first acquire a video frame from a position corresponding to a position offset 1 in the video file and send the video frame to the terminal, then may acquire an audio data packet from a position corresponding to a position offset 2 in an audio track file corresponding to the first audio track type and send the audio data packet to the terminal, then may acquire a video frame from a position corresponding to a position offset 3 in the video file and send the video frame to the terminal, and finally may acquire an audio data packet from a position offset 4 in an audio track file corresponding to the first audio track type and send the audio data packet to the terminal.

For the second case of the video file storage mode, the position information recorded in the header of the video file may be a position offset of the video frame or the audio data packet in the video file. Thus, when the server receives an acquisition request carrying a first audio track type sent by the terminal, the server can judge the data type of the data at the position corresponding to each position offset according to the corresponding relation between the data type stored in the file header of the video file and the position offset, if the data type is the video data type, the server can acquire a corresponding video frame from the position corresponding to the position offset in the video file, and if the data type is the audio data type, the server can acquire a corresponding audio data packet in the first audio track audio file according to the data volume of the sent audio data packet. For example, the data stored in a video file may be, in order: after receiving an acquisition request carrying a first audio track type, the server obtains, from table 1, video frames 1 (corresponding position information is position offset 1), null data (corresponding position information is position offset 2), video frames 2 (corresponding position information is position offset 3), and null data (corresponding position information is position offset 4), where, as can be seen from table 1, data at a position corresponding to position offset 1 is a video data type, the video frames at a position corresponding to position offset 1 can be directly acquired, and then, data at a position corresponding to position offset 2 can be determined to be an audio data type, the server can acquire audio data of a preset data amount from an audio file of an audio track corresponding to the first audio track type, and then, data at a position corresponding to position offset 3 can be determined to be a video data type, and then, video frames at a position corresponding to position offset 3 can be directly acquired, finally, it may be determined that the data at the position corresponding to the position offset 4 is of the audio data type, and the server may obtain the audio data of the preset data amount after the position where the audio data was obtained last time in the audio file of the audio track corresponding to the first audio track type.

Optionally, the user may perform an audio track switching operation during the process of watching the target video, and accordingly, the processing procedure may be as follows: in the process of sending a video file and an audio track file corresponding to a first audio track type to a terminal, when receiving an audio track switching notification carrying a second audio track type and a first playing position sent by the terminal, stopping sending the audio track file corresponding to the first audio track type to the terminal, and sending audio data contained in the audio track file corresponding to the second audio track type with the playing position behind the first playing position to the terminal; the first playing position is used for representing the playing position of the audio data to be played when the terminal receives an audio track switching instruction input by a user.

In an alternative embodiment, during the process of watching the target video, the user may switch the previously selected first track type to the second track type (for example, may click on an option of the second track type displayed on the playing page), at this time, the terminal receives a track switching instruction input by the user, and may further determine the first playing position of the audio data to be played, and send a track switching notification to the server. Specifically, after receiving the acquisition request sent by the terminal, the server may first send data recorded in a file header of the video file to the terminal (i.e., may send the playing of each video frame and audio data packet to the terminal in sequence). In this case, when the terminal receives the audio track switching instruction, it may obtain a first playing position used for representing the audio data to be played (where the first playing position may be a playing position of the video data or the audio data currently being played, or a playing position of the video data or the audio data after the audio data currently being played). After the first playing position is acquired, an audio track switching notification carrying the second audio track type and the first playing position may be sent to the server, and after the server receives the audio track switching notification carrying the second audio track type and the first playing position sent by the terminal, the server may stop sending the audio track audio file corresponding to the first audio track type to the terminal, and may send audio data of the playing position in the audio track audio file corresponding to the second audio track type after the first playing position to the terminal.

Specifically, in the process of sending the video file and the audio track file corresponding to the first audio track type to the terminal, when receiving an audio track switching notification carrying a second audio track type and a first playing position sent by the terminal, the server may stop sending the audio track file corresponding to the first audio track type to the terminal, may determine a third playing position of video data sent to the terminal last time, and may suspend sending the video file to the terminal, and may further send target audio data included in the audio track file corresponding to the second audio track type having a playing position between the first playing position and the third playing position to the terminal, and after sending the target audio data is completed, the server may send video data included in the video file having a playing position after the third playing position and audio data included in the audio track file corresponding to the second audio track type to the terminal, that is, the server may sequentially send, to the terminal, video data of which the playing position in the video file is after the third playing position and audio data of which the playing position in the audio track file corresponding to the second audio track type is after the third playing position according to the playing sequence of each video data after the third playing position and the audio data in the audio track file corresponding to the second audio track type.

Optionally, the user may also fast forward or rewind while watching the target video, and accordingly, the processing procedure may be as follows: in the process of sending a video file and an audio track file corresponding to a first audio track type to a terminal, when a positioning playing request which is sent by the terminal and carries a second playing position is received, sending video data contained in the video file with the playing position behind the second playing position and audio data contained in the audio track file corresponding to the first audio track type to the terminal; the second playing position is used for representing the playing position of the video data to be played when the terminal receives a positioning playing instruction input by a user.

In an optional embodiment, a user may perform positioning playing by dragging the play bar during the process of watching a target video, at this time, the terminal is triggered to receive a positioning playing instruction input by the user, and then, a playing position of video data to be played may be determined, and a positioning playing request is sent to the server. Specifically, when the terminal receives the positioning play instruction, a second play position used for representing video data to be played may be obtained (where the second play position may be a play position of video data or audio data currently being played, or a play position of video data before or after the video data or audio data currently being played). After the second playing position is obtained, a positioning playing request carrying the second playing position may be sent to the server, and after the server receives the positioning playing request carrying the second playing position sent by the terminal, video data included in a video file with the playing position behind the second playing position and audio data included in an audio track audio file corresponding to the first audio track type may be sent to the terminal.

Still another exemplary embodiment of the present disclosure provides an apparatus for transmitting a target video, as shown in fig. 3, the apparatus including:

an obtaining module 310, configured to obtain a multi-track video file of a target video;

a conversion module 320, configured to convert the multi-track video file into a video file and multiple track audio files, where the video file includes video data in the multi-track video file, and each track audio file includes audio data corresponding to different track types included in the multi-track video file;

a sending module 330, configured to send the video file and the audio file of the audio track corresponding to the first audio track type to the terminal when an acquisition request for the target video sent by the terminal is received, where the acquisition request carries the first audio track type.

Optionally, the converting module 320 is configured to:

Optionally, the sending module 330 is configured to:

Optionally, the sending module 330 is further configured to:

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

It should be noted that: in the apparatus for transmitting a target video according to the above embodiment, when transmitting a target video, only the division of the above functional modules is taken as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the server is divided into different functional modules to complete all or part of the above described functions. In addition, the apparatus for sending a target video and the method embodiment for sending a target video provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiment and are not described herein again.

Yet another exemplary embodiment of the present disclosure shows a schematic structural diagram of a server. Fig. 4 is a block diagram illustrating an apparatus 1900 for transmitting a target video according to an example embodiment. For example, the apparatus 1900 may be provided as a server. Referring to fig. 4, the device 1900 includes a processing component 1922 further including one or more processors and memory resources, represented by memory 1932, for storing instructions, e.g., applications, executable by the processing component 1922. The application programs stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processing component 1922 is configured to execute instructions to perform the method of transmitting the target video described above.

The device 1900 may also include a power component 1926 configured to perform power management of the device 1900, a wired or wireless network interface 1950 configured to connect the device 1900 to a network, and an input/output (I/O) interface 1958. The device 1900 may operate based on an operating system stored in memory 1932, such as Windows Server, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, or the like.

Device 1900 may include memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors include instructions for:

acquiring a multi-track video file of a target video;

Optionally, the method further includes:

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. a method for sending target video, is characterized in that, described method comprises:

Get the multi-track video file of the target video;

The multi-track video file is converted into a video file and a plurality of audio-track audio files, wherein the video file includes the video frames in the multi-track video file, and each audio-track audio file contains the multi-track audio files respectively. The audio data packets corresponding to the different audio track types contained in the audio track video file, the attributes of each audio track audio file are the same, wherein the attributes include the package frame length;

When an acquisition request for the target video sent by the terminal is received, and the acquisition request carries the first audio track type, send the audio track corresponding to the video file and the first audio track type to the terminal audio files;

In the process of sending the video file and the audio file of the audio track corresponding to the first audio track type to the terminal, when the audio file carrying the second audio track type and the first playback position sent by the terminal is received At the time of the track switching notification, stop sending the audio file of the audio track corresponding to the first audio track type to the terminal, determine the third playback position of the video file sent to the terminal most recently, and suspend sending the video to the terminal file, send the target audio data packet contained in the audio track audio file corresponding to the second audio track type whose playback position is between the first playback position and the third playback position to the terminal, when the target audio data packet is After sending, send to the terminal the video frame contained in the video file whose playback position is after the third playback position and the audio data packet contained in the audio track audio file corresponding to the second audio track type, wherein the The first playback position is used to indicate the playback position of the audio data packet to be played when the terminal receives the audio track switching instruction input by the user.

2. The method according to claim 1, wherein the described multi-track video file is converted into a video file and a plurality of track audio files, comprising:

According to each video frame contained in the multi-audio-track video file and the offset information of each audio data packet corresponding to each audio track in the multi-audio-track video file, obtain the video frames composed of the video files, and obtain the corresponding video files. Each audio data packet of a track constitutes a plurality of track audio files.

3. The method according to claim 1, characterized in that, when receiving an acquisition request for the target video sent by a terminal, when the acquisition request carries the first audio track type, send the request to the terminal. Send the audio file of the audio track corresponding to the video file and the first audio track type, including:

When an acquisition request for the target video sent by the terminal is received, and the acquisition request carries the first audio track type, each video frame contained in the video file and the first audio track audio file contain The playback sequence of each audio data packet is sent to the terminal in sequence, the video frame contained in the video file and the audio data packet contained in the audio track audio file corresponding to the first audio track type.

4. The method according to claim 3, wherein when an acquisition request for the target video sent by the terminal is received, and the acquisition request carries the first audio track type, according to the video The playback sequence of each video frame contained in the file and each audio data packet contained in the first audio track audio file, sending the video frames contained in the video file to the terminal in turn corresponding to the type of the first audio track An audio track audio file contains audio packets, including:

When an acquisition request for the target video sent by the terminal is received, and the acquisition request carries the first audio track type, acquire each video frame included in the video file recorded in the file header of the video file and the playback order of each audio data packet included in the first track audio file;

According to the playing sequence, the video frames contained in the video file and the audio data packets contained in the audio track audio file corresponding to the first audio track type are sequentially sent to the terminal.

5. The method according to claim 1, wherein the method further comprises:

In the process of sending the video file and the audio file of the audio track corresponding to the first audio track type to the terminal, when receiving the positioning playback request that carries the second playback position sent by the terminal, send the audio file to the terminal. Sending the audio data packets contained in the video frame contained in the video file whose playback position is after the second playing position and the audio track audio file corresponding to the first audio track type;

The second playback position is used to indicate the playback position of the video frame to be played when the terminal receives the positioning playback instruction input by the user.

6. A device for sending a target video, wherein the device comprises:

The acquisition module is used to acquire the multi-track video file of the target video;

The conversion module is used to convert the multi-track video file into a video file and a plurality of audio-track audio files, wherein the video file comprises the video frame in the multi-track video file, and each audio-track audio file The audio data packets corresponding to the different audio track types contained in the multi-audio-track video files are respectively included, and the attributes of each audio-track audio file are the same, wherein the attributes include the package frame length;

A sending module, configured to send the video file and the first audio track to the terminal when receiving an acquisition request for the target video sent by the terminal, and the acquisition request carries the first audio track type The audio file of the audio track corresponding to the type;

The sending module is further configured to, in the process of sending the video file and the audio track audio file corresponding to the first audio track type to the terminal, when receiving the second audio track sent by the terminal When the audio track switching notification of the type and the first playback position, stop sending the audio track audio file corresponding to the first audio track type to the terminal, and determine the third playback position of the video file sent to the terminal most recently, and suspend sending the video file to the terminal, and send the target audio data contained in the audio track audio file corresponding to the second audio track type whose playback position is between the first playback position and the third playback position to the terminal package, after sending the target audio data packet, send to the terminal the video frame contained in the video file whose playback position is after the third playback position and the audio track audio file corresponding to the second audio track type. The audio data package, wherein the first playback position is used to indicate the playback position of the audio data package to be played when the terminal receives the audio track switching instruction input by the user.

7. The device according to claim 6, wherein the conversion module is used for:

8. The apparatus according to claim 6, wherein the sending module is configured to:

9. The apparatus according to claim 8, wherein the sending module is configured to:

10. The device according to claim 6, wherein the sending module is further configured to: