
CN113891108A - Subtitle optimization method, device, electronic device and storage medium - Google Patents


Info

Publication number
CN113891108A
CN113891108A (application number CN202111214119.5A)
Authority
CN
China
Prior art keywords
subtitle
video stream
live video
player
subtitles
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111214119.5A
Other languages
Chinese (zh)
Inventor
刘坚
李秋平
何心怡
王明轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Youzhuju Network Technology Co Ltd
Original Assignee
Beijing Youzhuju Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Youzhuju Network Technology Co Ltd filed Critical Beijing Youzhuju Network Technology Co Ltd
Priority to CN202111214119.5A
Publication of CN113891108A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435 Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/488 Data services, e.g. news ticker
    • H04N21/4884 Data services, e.g. news ticker for displaying subtitles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/81 Monomedia components thereof
    • H04N21/8126 Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts
    • H04N21/8133 Monomedia components thereof involving additional data, e.g. news, sports, stocks, weather forecasts specifically related to the content, e.g. biography of the actors in a movie, detailed information about an article seen in a video program

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract


Embodiments of the present disclosure disclose a subtitle optimization method, apparatus, electronic device, and storage medium. The method includes: displaying a user interface, where the user interface includes a player and one or more first subtitles corresponding to an audio stream in a live video stream; and, in response to a trigger operation on a target subtitle, playing the live video stream segment corresponding to the target subtitle in the player, so that the user can proofread the target subtitle by watching the segment, where the target subtitle is one of the one or more first subtitles. The subtitle optimization solution provided by the embodiments of the present disclosure improves the accuracy and efficiency of subtitle proofreading.

Figure 202111214119

Description

Subtitle optimization method, device, electronic device and storage medium
Technical Field
The present disclosure relates to the field of information technology, and in particular, to a method and an apparatus for optimizing subtitles, an electronic device, and a storage medium.
Background
With the continuous development of live video technology, user demand for live video streaming keeps growing. To improve the user experience, subtitles can be added to the live video stream, and the subtitled stream is then sent to a user terminal for playback.
In the prior art, subtitles are proofread manually, but manual proofreading is slow and its accuracy is limited.
Disclosure of Invention
In order to solve the technical problem or at least partially solve the technical problem, embodiments of the present disclosure provide a method, an apparatus, an electronic device, and a storage medium for optimizing subtitles, which are helpful for improving the efficiency and quality of correcting subtitles.
In a first aspect, an embodiment of the present disclosure provides a subtitle optimization method, where the method includes:
displaying a user interface, wherein the user interface comprises a player and one or more first subtitles corresponding to an audio stream in a live video stream;
in response to a trigger operation on a target subtitle, playing a live video stream segment corresponding to the target subtitle in the player, so that a user proofreads the target subtitle by watching the live video stream segment;
wherein the target caption is one of the one or more first captions.
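The claimed interaction can be sketched as a small executable model. This is an illustrative sketch only; the class and method names (`Subtitle`, `Player`, `ProofingUI`, `on_subtitle_triggered`) are hypothetical and not from the disclosure, and real playback is replaced by recording which segment was requested:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Subtitle:
    text: str
    start_ms: int   # start of the matching audio in the live stream
    end_ms: int     # end of the matching audio

@dataclass
class Player:
    # Records the last requested segment, standing in for real playback.
    playing: Optional[Tuple[int, int]] = None

    def play_segment(self, start_ms: int, end_ms: int) -> None:
        self.playing = (start_ms, end_ms)

@dataclass
class ProofingUI:
    """User interface holding a player and the first subtitles."""
    player: Player
    first_subtitles: List[Subtitle] = field(default_factory=list)

    def on_subtitle_triggered(self, index: int) -> Subtitle:
        # Play the live-stream segment behind the target subtitle so the
        # proofreader can check the text against the audio and video.
        target = self.first_subtitles[index]
        self.player.play_segment(target.start_ms, target.end_ms)
        return target
```

For example, triggering the second subtitle would seek the player to that subtitle's time range, leaving the subtitle list itself unchanged.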
In a second aspect, an embodiment of the present disclosure further provides a subtitle optimizing apparatus, where the apparatus includes:
the first display module is used for displaying a user interface, and the user interface comprises a player and one or more first subtitles corresponding to an audio stream in a live video stream;
the playing module is used for playing, in response to the trigger operation on the target subtitle, a live video stream segment corresponding to the target subtitle in the player, so that a user proofreads the target subtitle by watching the live video stream segment;
wherein the target caption is one of the one or more first captions.
In a third aspect, an embodiment of the present disclosure further provides an electronic device, where the electronic device includes:
one or more processors;
storage means for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the subtitle optimization method described above.
In a fourth aspect, the disclosed embodiments also provide a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the subtitle optimization method as described above.
Compared with the prior art, the technical scheme provided by the embodiment of the disclosure has at least the following advantages:
according to the subtitle optimization method provided by the embodiments of the present disclosure, a user interface displays a player and one or more first subtitles corresponding to the audio stream in a live video stream. When a trigger operation on a target subtitle is received, the live video stream segment corresponding to that subtitle is played in the player, so that the user proofreads the target subtitle by watching the segment; the target subtitle is one of the one or more first subtitles. With this scheme, any frame of the live video stream can be conveniently played back, and the original text proofreader can watch the same segment repeatedly and proofread while watching, which improves both the quality and the efficiency of proofreading the first subtitles.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. Throughout the drawings, the same or similar reference numbers refer to the same or similar elements. It should be understood that the drawings are schematic and that elements and features are not necessarily drawn to scale.
Fig. 1 is a schematic structural diagram of a live broadcast simultaneous transmission hardware device in an embodiment of the present disclosure;
fig. 2 is a schematic structural diagram of another live broadcast simultaneous transmission hardware device in the embodiment of the present disclosure;
fig. 3 is a flowchart of a subtitle optimization method in an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a user interface in an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a user interface in an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a user interface in an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a user interface in an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a user interface in an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of a user interface in an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of a user interface in an embodiment of the present disclosure;
fig. 11 is a schematic structural diagram of a subtitle optimizing apparatus in an embodiment of the present disclosure;
fig. 12 is a schematic structural diagram of an electronic device in an embodiment of the disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the disclosure are for illustration purposes only and are not intended to limit the scope of the disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
Before explaining the subtitle optimization scheme provided by the embodiment of the present disclosure, hardware devices and application scenarios related to the subtitle optimization scheme are briefly introduced, so as to facilitate better understanding of the subtitle optimization scheme provided by the embodiment of the present disclosure.
Live broadcast simultaneous transmission proofreading refers to adding subtitles to a host's live content before sending it to the viewing end, so that viewers see a subtitled live picture. During subtitle generation, a machine first performs speech recognition on the live audio to obtain a first subtitle to be proofread, and then machine-translates it to obtain a second subtitle to be proofread (for example, the first subtitle is Chinese and the second subtitle is the corresponding English). The original text proofreader checks the first subtitle and manually corrects any errors found; the translation proofreader does the same for the second subtitle. The two proofreaders may be the same person or different people; usually, to reduce workload and improve efficiency and accuracy, they are different people.
The live simultaneous transmission proofreading flow is as follows: the live simultaneous transmission hardware device pulls the host's live video stream from a server or from the host end, then records and processes it. Processing includes collecting the audio in the live video stream, performing speech recognition on it to obtain a first subtitle to be proofread, and translating that subtitle to obtain a second subtitle to be proofread. The recorded live video stream is then played through the audio-video device, and the first and second subtitles are shown on the display interface; the original text proofreader checks the first subtitle and the translation proofreader checks the second subtitle, each making manual corrections when errors are found.
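The two-stage pipeline above (speech recognition, then machine translation) can be sketched with the engines injected as callables. This is a minimal illustration; `simul_caption_pipeline` is a hypothetical name, and real ASR/MT engines would replace the injected functions:

```python
from typing import Callable, List, Tuple

def simul_caption_pipeline(
    audio_chunks: List[bytes],
    asr: Callable[[bytes], str],
    mt: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """For each audio chunk, produce the (first subtitle, second subtitle)
    pair checked by the original text and translation proofreaders."""
    pairs: List[Tuple[str, str]] = []
    for chunk in audio_chunks:
        first = asr(chunk)   # machine speech recognition -> first subtitle
        second = mt(first)   # machine translation -> second subtitle
        pairs.append((first, second))
    return pairs
```

Because both stages are automatic, errors in `first` propagate into `second`, which is why the flow routes both outputs to human proofreaders.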
Optionally, referring to the schematic structural diagram of a live simultaneous transmission hardware device shown in fig. 1, the live simultaneous transmission hardware device and the audio-video device are the same device, corresponding to device 24 in fig. 1. The original text proofreader and the translation proofreader each use a live simultaneous transmission hardware device: for example, the original text proofreader proofreads the first subtitles on device 24 (which may be regarded as a first live simultaneous transmission hardware device), and the translation proofreader proofreads the second subtitles on device 25 (a backup of device 24, which may be regarded as a second live simultaneous transmission hardware device). Terminal 21 is the host's terminal and uploads the live video stream to the server 22. Device 24 pulls the live video stream from terminal 21 or server 22, for example from server 22 according to the URL (Uniform Resource Locator) of the live video stream. Terminal 27 is the viewing user's terminal, and device 26 is a server.
Device 24 can start pulling the live video stream at any time. Optionally, it starts after the original text proofreader issues a "start instruction". For example, the proofreader clicks a button or icon in the user interface of device 24 at 9:50, issuing the "start instruction", and device 24 pulls the live video stream from 9:50 onward. Further, if the proofreader clicks "start live" on the user interface of device 24 at 10:00, device 24 records the pulled live video stream from 10:00 and processes it synchronously from 10:00. Processing includes: collecting the audio in the live video stream; performing speech recognition on the collected audio to obtain a first subtitle, which is displayed on the display interface of device 24 so that the original text proofreader can check it; and translating the speech recognition result (for example, Chinese text) to obtain a translation (for example, English), that is, a second subtitle, which is displayed on the display interface of device 25 so that the translation proofreader can check it.
Optionally, referring to the schematic structural diagram of another live simultaneous transmission hardware device shown in fig. 2, the live simultaneous transmission hardware device and the audio-video device are two different devices: the live simultaneous transmission hardware device corresponds to device 24 in fig. 2, and the audio-video device corresponds to the second server 23 in fig. 2. The original text proofreader and the translation proofreader proofread subtitles on different live simultaneous transmission hardware devices: for example, the original text proofreader works on device 24 (which may be regarded as a first live simultaneous transmission hardware device), and the translation proofreader works on device 25 (a backup of device 24, which may be regarded as a second live simultaneous transmission hardware device). Terminal 21 is the host's terminal and uploads the live video stream to the first server 22; the second server 23 pulls the live video stream from the first server 22 or from terminal 21. Terminal 27 is the viewing user's terminal, and device 26 is a server.
The second server 23 can start pulling the live video stream from the first server 22 or terminal 21 at any time. Optionally, it starts after the original text proofreader issues a "start instruction". For example, the proofreader clicks a button or icon in the user interface of device 24 at 9:50; device 24 sends the "start instruction" to the second server 23, which then begins pulling the live video stream from the first server 22 or terminal 21. At 10:00, the original text proofreader clicks the "start live" button on the user interface of device 24, and device 24 sends a recording instruction to the second server 23. Assuming the instruction arrives promptly, i.e. at 10:00, the second server 23 records the pulled live video stream from 10:00 and processes it synchronously from 10:00; that is, recording and processing of the live video stream proceed in parallel.
The processing operation on the live video stream includes: collecting the audio in the live video stream; performing speech recognition on the collected audio to obtain a first subtitle, which is displayed on the display interface of device 24 so that the original text proofreader can check it; and translating the speech recognition result (for example, Chinese text) to obtain a translation (for example, English text), that is, a second subtitle, which is displayed on the display interface of device 25 so that the translation proofreader can check it.
Taking fig. 1 as an example, if the original text proofreader modifies the first subtitle (for example, Chinese text) while proofreading on device 24, device 24 synchronizes the modified first subtitle to device 25, so that the translation proofreader can modify the corresponding second subtitle (for example, English text) accordingly. Device 25 then sends the modified second subtitle back to device 24.
Taking fig. 2 as an example, if the original text proofreader modifies the first subtitle (for example, Chinese text) while proofreading on device 24, device 24 synchronizes the modified first subtitle to the second server 23, which forwards it to device 25, so that the translation proofreader can modify the corresponding second subtitle (for example, English text) accordingly. Device 25 then sends the modified second subtitle to the second server 23, which synchronizes it to device 24.
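The synchronization path in the fig. 2 setup (device 24 to the second server 23 to device 25, and back) can be sketched as a toy relay. All class and method names here are illustrative, not from the disclosure, and network transport is elided:

```python
from typing import Dict, List

class Device:
    """A proofreading workstation holding its local view of the subtitles."""
    def __init__(self, name: str) -> None:
        self.name = name
        self.subtitles: Dict[int, str] = {}

    def receive(self, index: int, text: str) -> None:
        self.subtitles[index] = text

class RelayServer:
    """Stands in for the second server 23: forwards each edit to the
    other registered devices so all proofreaders see the same text."""
    def __init__(self) -> None:
        self.devices: List[Device] = []

    def register(self, device: Device) -> None:
        self.devices.append(device)

    def sync(self, sender: Device, index: int, text: str) -> None:
        sender.receive(index, text)          # sender keeps its own edit
        for device in self.devices:
            if device is not sender:
                device.receive(index, text)  # relay to the other devices
```

The same model covers both directions: a first-subtitle edit from device 24 reaches device 25, and a second-subtitle edit from device 25 reaches device 24.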
Fig. 3 is a flowchart of a subtitle optimization method in an embodiment of the present disclosure. The method is applied to live simultaneous transmission hardware devices and aims to improve the accuracy and efficiency of proofreading subtitles. It can be executed by a subtitle optimization apparatus, which can be implemented in software and/or hardware and configured in a live simultaneous transmission hardware device such as an electronic terminal, including but not limited to a smartphone, a handheld computer, a tablet, a wearable device with a display, a desktop computer, a laptop, an all-in-one machine, or a smart home device. As shown in fig. 3, the method may include the following steps:
step 301, displaying a user interface, where the user interface includes a player and one or more first subtitles corresponding to an audio stream in a live video stream.
Specifically, a user interface such as the one shown in fig. 4 may be displayed on the display; it includes a player 410 and a plurality of first subtitles 420 corresponding to the audio stream in a live video stream. The number of first subtitles 420 may also be one; fig. 4 shows an example with three (a plurality is usually at least two).
The first subtitle is generally text obtained by extracting the audio from the live video stream and performing speech recognition on it. Because audio extraction and speech recognition are usually performed automatically by a machine, accuracy is limited; for example, a name in the audio may be transcribed as a homophone written with different characters. Therefore, to improve the accuracy of the first subtitle, it is usually checked manually after it is obtained, so that errors can be corrected in time. During proofreading, the recorded live video stream is usually played in the player 410, and the original text proofreader can check the first subtitle while watching the video, which improves both proofreading efficiency and accuracy.
Illustratively, a plurality of first subtitles are displayed in context in a first area of the user interface, as shown in fig. 4. Displaying them this way lets the original text proofreader check each first subtitle against the surrounding lines and quickly locate and retrieve content, which improves both proofreading precision and proofreading efficiency.
In one embodiment, the language corresponding to the first subtitle is the same as the language corresponding to the audio stream. For example, if the language corresponding to the audio stream is chinese, the first subtitle is a chinese text, and if the language corresponding to the audio stream is english, the first subtitle is an english text.
In one embodiment, the language corresponding to the first subtitle is different from the language corresponding to the audio stream. For example, if the language corresponding to the audio stream is chinese, the first subtitle is an english text, and if the language corresponding to the audio stream is english, the first subtitle is a chinese text.
Step 302, responding to a trigger operation for a target subtitle, playing a live video stream segment corresponding to the target subtitle in the player, so that a user proofs the target subtitle by watching the live video stream segment.
The target subtitle is one of the one or more first subtitles; for example, the target subtitle is a subtitle 420 shown in fig. 4. The trigger operation on the target subtitle may be clicking the target subtitle, sliding on it, clicking a related control associated with it, pressing a shortcut key while the target subtitle is in a specific state, or the like.
Specifically, playing the live video stream segment corresponding to the target subtitle in the player in response to a trigger operation on the target subtitle includes:
in response to a trigger operation on a play control associated with the target subtitle, playing the live video stream segment corresponding to the target subtitle in the player, where the play control is displayed at a position associated with the target subtitle when the target subtitle is in an editing state. For example, when the mouse hovers over the target subtitle, the subtitle enters the editing state, and the original text proofreader can edit it by deleting, modifying, or adding characters; alternatively, the target subtitle enters the editing state when it is selected, or when a related control is clicked. When the target subtitle is in the editing state, a play control is displayed at its associated position. As shown in fig. 5, the target subtitle 420 is in the editing state and a play control 510 is displayed at its associated position; when the original text proofreader clicks the play control 510, the live video stream segment corresponding to the target subtitle 420 is played in the player 410. The live video stream segment corresponding to the target subtitle 420 is the segment whose audio, when speech-recognized, yields the target subtitle 420, i.e., the segment whose audio expresses the semantics of the target subtitle.
Optionally, the playing, in response to the trigger operation for the target subtitle, a live video stream segment corresponding to the target subtitle in the player includes:
when the target subtitle is in an editing state, playing the live video stream segment corresponding to the target subtitle in the player in response to a trigger operation on a preset shortcut key.
According to the above technical solution, displaying the plurality of first subtitles in context on the user interface makes it convenient for the original text proofreader to check each first subtitle against its context, improving proofreading efficiency and accuracy. When the proofreader wants to watch the video picture corresponding to a target subtitle, a trigger operation on that subtitle plays the corresponding picture in the player, so any frame of the video can be played back and the first subtitle can be checked against the video.
In some embodiments, to further facilitate proofreading of the first subtitles, the recorded live video stream may be played in one player while a video segment is played back in another. Specifically, as shown in fig. 6, the player includes a first player 610 and a second player 620. Playing the live video stream segment corresponding to the target subtitle in the player includes playing that segment in the first player 610. In response to a live video stream playing instruction, the recorded live video stream is played in the second player 620; while it plays, a first subtitle modification instruction modifies the first subtitle it points to. For example, when the original text proofreader clicks (by touch, mouse, or another click method) a "start live" icon or button on the user interface of the live simultaneous transmission hardware device (such as the "start live" icon 630 shown in fig. 6), a live video stream playing instruction is triggered, and the recorded live video stream is played in the second player 620; the proofreader can also pause playback at any time according to the progress of proofreading. The proofreader can thus control playback of the live video stream in real time, proofreading while watching the live picture and listening to the audio, and can use the heard audio and the host's mouth shape in the live picture to check the first subtitle, improving proofreading accuracy and efficiency.
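The division of labor between the two players can be sketched as follows. This is a hypothetical model (the names `SegmentPlayer`, `LivePlayer`, `Workspace` are illustrative): replaying a subtitle's segment in the review player must not interrupt the recorded live stream in the other player.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

@dataclass
class SegmentPlayer:
    """First player (610): plays back the segment behind a target subtitle."""
    segment: Optional[Tuple[int, int]] = None

    def play_segment(self, start_ms: int, end_ms: int) -> None:
        self.segment = (start_ms, end_ms)

@dataclass
class LivePlayer:
    """Second player (620): plays the recorded live stream; can be paused."""
    playing: bool = False

    def play(self) -> None:
        self.playing = True

    def pause(self) -> None:
        self.playing = False

@dataclass
class Workspace:
    live: LivePlayer = field(default_factory=LivePlayer)
    review: SegmentPlayer = field(default_factory=SegmentPlayer)

    def start_live(self) -> None:
        self.live.play()

    def replay_subtitle(self, start_ms: int, end_ms: int) -> None:
        # Replay in the review player only; the live player keeps running.
        self.review.play_segment(start_ms, end_ms)
```

Keeping the two players independent is what lets the proofreader rewatch one segment while the recorded stream continues in parallel.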
During proofreading, if the original text proofreader finds that a first subtitle does not match the text he or she hears and sees in the live video stream, the proofreader modifies that first subtitle, thereby achieving the purpose of proofreading the first subtitle.
The positions of the first player 610 and the second player 620 in the user interface are not limited; they may be arranged one above the other, or side by side as shown in fig. 6. The second player 620, which plays the recorded live video stream, may be made larger, and the first player 610, which plays back video segments, smaller. Alternatively, the original text proofreader may be allowed to manually resize the two players, enlarging the first player 610 for easier viewing when playback of a video segment is required.
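Purely as an illustrative sketch (not part of the claimed embodiments), the mapping from a target subtitle to the live video stream segment played back in the first player can be expressed as follows; the `Subtitle` structure, its field names, and the padding value are assumptions introduced here for illustration.

```python
from dataclasses import dataclass

@dataclass
class Subtitle:
    index: int      # screen-casting progress sequence number shown in the UI
    start_ms: int   # timestamp of the first audio frame of the sentence
    end_ms: int     # timestamp of the last audio frame of the sentence
    text: str       # the first subtitle (original-language text)

def segment_for(subtitle: Subtitle, padding_ms: int = 500) -> tuple[int, int]:
    """Window (in ms) to seek to in the recorded stream when the proofreader
    triggers playback for this subtitle; a little padding supplies context."""
    start = max(0, subtitle.start_ms - padding_ms)
    return start, subtitle.end_ms + padding_ms
```

The point of the sketch is only that segment boundaries derive from the audio-frame timestamps underlying the subtitle, which is what lets any corresponding video picture be played back on demand.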
On the basis of the foregoing embodiment, before playing the recorded live video stream in the second player in response to a live video stream playing instruction, the method further includes:
acquiring the live video stream according to the address information of the live video stream; and recording the live video stream in response to a live broadcast starting instruction. The address information of the live video stream is, for example, a source stream URL, such as the anchor's stream address. The relevant staff may fill in the source stream URL manually, or it may be filled in automatically: for example, when a user selects a target live video stream through a selection interface, the source stream URL of that stream is automatically filled into the source stream URL entry box.
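As a hypothetical sketch only (the embodiments do not prescribe any particular tool), acquiring and recording the stream at the source stream URL could be done with an ffmpeg stream copy; the helper below merely assembles such a command, and the function name and file paths are assumptions.

```python
def build_record_command(source_url: str, out_path: str) -> list[str]:
    """Command that pulls the live stream at `source_url` and writes it to
    disk without re-encoding; the recorded file is what is later played in
    the second player and pushed after the configured delay."""
    return ["ffmpeg", "-i", source_url, "-c", "copy", out_path]

# e.g. subprocess.run(build_record_command("rtmp://host/app/stream", "recorded.flv"))
```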
Further, the method further comprises:
in response to a second subtitle display instruction, displaying, in the user interface, second subtitles respectively corresponding to the one or more first subtitles, where the language of the second subtitles differs from the language of the audio stream, and the second subtitles are displayed in a second area of the user interface in context form; in response to a second subtitle hiding instruction, hiding the second area from display in the user interface; and in response to a second subtitle modification instruction, modifying the second subtitle pointed to by that instruction. Any first subtitle of the one or more first subtitles and its corresponding second subtitle are displayed side by side (in horizontal correspondence) in the user interface.
Optionally, the second subtitle display instruction may be triggered by clicking a preset icon or button in the user interface. For example, the user interface shown in fig. 7 includes a preset icon 710; when the icon 710 is clicked, the second subtitle display instruction is triggered and the second subtitles corresponding to the one or more first subtitles are displayed in the user interface, as in fig. 8, where a second subtitle 820 is displayed for each first subtitle 810. When the preset icon 830 in fig. 8 is clicked, the second subtitle hiding instruction is triggered and the displayed second subtitles are hidden, returning to the user interface of fig. 7, in which only the first subtitles 720 are displayed.
With continued reference to fig. 8, in one embodiment the second subtitles 820 are displayed in the second area of the user interface in context form. Displaying the plurality of second subtitles in context allows the translation proofreader to proofread them in combination with the vertical context, improving both proofreading precision and efficiency.
In an embodiment, any first subtitle of the one or more first subtitles and its corresponding second subtitle are displayed side by side in the user interface, as shown in fig. 8. This horizontal correspondence makes it convenient for the original text proofreader to check a first subtitle against its second subtitle, and for the translation proofreader to check a second subtitle against its first subtitle, improving proofreading efficiency and accuracy.
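By way of illustration only (function and variable names are assumptions, not part of the embodiments), the horizontal correspondence between first and second subtitles, together with the show/hide behaviour of the second area, can be sketched as a row-building step:

```python
def build_rows(first_subtitles, second_subtitles, show_second=True):
    """Pair each first subtitle with its translation so the UI can render
    the two in one horizontal row; hiding the second area simply drops the
    right-hand column without disturbing the first-subtitle context."""
    if not show_second:
        return [(first, None) for first in first_subtitles]
    return list(zip(first_subtitles, second_subtitles))
```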
Further, the method further comprises: marking, in the user interface, the pushed first subtitles, the identification information corresponding to them, and the second subtitles corresponding to them, according to the push progress of the live video stream.
By marking the pushed first subtitles and their corresponding identification information and second subtitles in the user interface according to the push progress, the original text proofreader can track the push progress of the live video stream at any time and adjust the proofreading rhythm and rate accordingly.
By marking the already-proofread first subtitles in the user interface, the translation proofreader can proofread the second subtitles on the basis of first subtitles that the original text proofreader has already checked; at the same time, the proofreading speeds of the two proofreaders can be kept balanced, jointly improving their efficiency.
As shown in fig. 9, the user interface further includes identification information corresponding to the one or more first subtitles, for example the sequence numbers "1, 2, 3, 4, 5, 6, 7, 8, 9" to the left of each first subtitle. Each of these is a screen-casting progress sequence number, and each corresponds to the Chinese text of one sentence in the audio. The pushed state of a subtitle may be represented by color: a green part (such as the subtitle 910 in fig. 9) represents subtitles that have already been pushed (i.e., the live video stream has been pushed to the viewer side, such as the device 27 shown in fig. 1), and the remaining non-green part (such as the subtitles indicated by reference numeral 920) represents subtitles not yet pushed. The blue part (the subtitle 930 in fig. 9, i.e., the 6th line) represents the subtitle the proofreader is currently checking; the subtitles above it (reference numeral 940, lines 1-5) have already been proofread, and the subtitles below it (reference numeral 1050, lines 7-9) have not. By displaying this identification information on the user interface, the original text proofreader can track the live push progress at any time and adjust the proofreading rhythm and rate. For example, it can be determined from fig. 9 that the 2nd, 3rd, 4th, and 5th subtitles have been proofread but not yet pushed, and that the 7th and 8th subtitles have not yet been proofread. Correspondingly, taking fig. 1 as an example, the device 24 may send the screen-casting progress sequence numbers of the first subtitles already checked by the original text proofreader to the device 25, and the device 25 may identify those first subtitles in its local user interface, for example by a particular color, so that the translation proofreader can proofread the second subtitles on the basis of first subtitles already checked by the original text proofreader; at the same time, the proofreading speeds of the two proofreaders can be kept balanced, jointly improving their efficiency.
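The colour marking described above amounts to classifying each subtitle row against two counters: the last sequence number already pushed to viewers and the row currently being proofread. A hypothetical sketch (state names and the colour mapping are illustrative assumptions):

```python
def mark(index: int, pushed_up_to: int, current: int) -> str:
    """Highlight state for one subtitle row, matching the fig. 9 colouring:
    pushed rows green, the row under proofreading blue, the rest plain."""
    if index <= pushed_up_to:
        return "pushed"        # e.g. rendered green
    if index == current:
        return "proofreading"  # e.g. rendered blue
    if index < current:
        return "proofread"     # checked but not yet pushed
    return "pending"           # not yet proofread
```

With the fig. 9 example (row 1 pushed, row 6 under proofreading), rows 2-5 come out as proofread-but-unpushed and rows 7-9 as pending.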
Further, the method further comprises:
in response to a live broadcast delay setting instruction, pushing a live video stream that includes at least the first subtitles according to the live broadcast delay set by the instruction; or, in response to a live broadcast delay setting instruction, pushing a subtitle file formed from the first subtitles together with the recorded live video stream according to the set delay. A live video stream including the first subtitles refers to the live video stream with subtitles added to it. For example, if the delay set by the instruction is 30 minutes and the original text proofreader triggers the live broadcast starting instruction at 10:00, the device 24 in fig. 1 or the server 23 in fig. 2 starts recording the pulled live video stream and pushes the subtitled live video stream to the terminal 27 of the watching user at 10:30 of the same day. Alternatively, the device 24 or the server 23 pushes the subtitle file and the recorded live video stream to the terminal 27 at 10:30.
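The delay arithmetic itself is simple; a sketch (the function name is an assumption) of the scheduled push time derived from the live start time and the configured delay:

```python
from datetime import datetime, timedelta

def push_time(live_start: datetime, delay_minutes: int) -> datetime:
    """Moment at which the subtitled stream (or the subtitle file together
    with the recorded stream) begins to be pushed to the watching terminals."""
    return live_start + timedelta(minutes=delay_minutes)
```

A 30-minute delay on a 10:00 start yields a 10:30 push, as in the example above.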
Taking fig. 1 as an example, in one embodiment the device 24 may generate a subtitle image from each proofread first subtitle, its second subtitle, and the time information corresponding to the first subtitle. The time information of the subtitle image is the time information of the first subtitle, namely the timestamp of the audio frame corresponding to that subtitle. The device 24 then determines, from this time information, the corresponding video frame in the live video stream and combines the subtitle image with the picture of that frame, obtaining a live video stream with subtitles added. When a preset delay time (for example, 30 minutes) is reached, the device 24 may send the subtitled live video stream to the server 26, and the server 26 sends it to the terminal 27, so that the user at the viewing end can watch the live video stream with the first and/or second subtitles added.
In another embodiment, the device 24 may generate a subtitle file from each proofread first subtitle, its second subtitle, and the corresponding time information. When the preset delay time is reached, the device 24 sends the subtitle file and the recorded live video stream to the server 26, which forwards both to the terminal 27. The terminal 27 plays them synchronously in its player; the time information in the subtitle file must be aligned with the time information in the live video stream, so that the user at the viewing end can watch the live video stream with the first and/or second subtitles added.
Taking fig. 2 as an example, in one embodiment the server 23 may generate a subtitle image from each proofread first subtitle, its second subtitle, and the time information corresponding to the first subtitle. The time information of the subtitle image is the time information of the first subtitle, namely the timestamp of the audio frame corresponding to that subtitle. The server 23 then determines, from this time information, the corresponding video frame in the live video stream and combines the subtitle image with the picture of that frame, obtaining a live video stream with subtitles added. When a preset delay time (for example, 30 minutes) is reached, the server 23 may send the subtitled live video stream to the server 26, which sends it on to the terminal 27 of the watching user.
In another embodiment, the server 23 may generate a subtitle file from each proofread first subtitle, its second subtitle, and the corresponding time information. When the preset delay time is reached, the server 23 sends the subtitle file and the recorded live video stream to the server 26, which forwards both to the terminal 27 of the watching user. The terminal 27 plays them synchronously in its player; the time information in the subtitle file must be aligned with the time information in the live video stream.
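As an illustrative sketch of the subtitle-file path (the embodiments do not name a file format; SubRip/SRT is assumed here), the proofread first and second subtitles plus their audio-frame timestamps can be serialized so that the viewer's player can align the file with the recorded stream:

```python
def to_srt(entries) -> str:
    """entries: list of (start_ms, end_ms, first_text, second_text) tuples,
    with timestamps taken from the audio frames of the recorded stream so
    the terminal can align the file with the video on playback."""
    def ts(ms: int) -> str:
        hours, ms = divmod(ms, 3_600_000)
        minutes, ms = divmod(ms, 60_000)
        seconds, ms = divmod(ms, 1_000)
        return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"
    blocks = []
    for number, (start, end, first, second) in enumerate(entries, 1):
        blocks.append(f"{number}\n{ts(start)} --> {ts(end)}\n{first}\n{second}")
    return "\n\n".join(blocks) + "\n"
```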
An optional processing flow of the subtitle optimization scheme provided by the embodiments of the disclosure is as follows. First, the live video stream is pulled; it is then recorded, and processed while recording to obtain the first subtitles and the second subtitles. Both are displayed on the user interfaces of the original text proofreader and the translation proofreader; the original text proofreader proofreads the first subtitles and the translation proofreader proofreads the second subtitles. A subtitle image or a subtitle file is then generated from the proofread first and second subtitles. If a subtitle image is generated, it is fused into the live video stream to obtain a subtitled live video stream, which is pushed to the terminal of the watching user; if a subtitle file is generated, the file and the recorded live video stream are pushed to the terminal together.
Further, referring to the user interface shown in fig. 10, the player 420 further includes a third player 630, and the method further includes: playing the live video stream including at least the first subtitles in the third player 630, so that the original text proofreader can conveniently check the video picture after the subtitles are added and confirm that the proofread subtitles are accurate and consistent with the picture.
According to the subtitle optimization method provided by the embodiments of the disclosure, a player is displayed on the user interface, and when a live video stream playing instruction is received, the recorded live video stream is played in the player. The original text proofreader can thus watch the live picture corresponding to the live video stream while proofreading the first subtitles corresponding to its audio stream, that is, listen, watch, and proofread at the same time. When the proofreader wants to watch the video picture corresponding to a target subtitle, he or she can trigger an operation on that subtitle so that the corresponding video picture is played in the player; any video frame can thus be played back, and the first subtitle proofread in combination with the video.
Fig. 11 is a schematic structural diagram of a subtitle optimizing apparatus in an embodiment of the present disclosure. The device provided by the embodiment of the disclosure can be configured in live broadcast simultaneous transmission hardware equipment. As shown in fig. 11, the apparatus specifically includes: a first display module 1110 and a play module 1120.
The first display module 1110 is configured to display a user interface, where the user interface includes a player and one or more first subtitles corresponding to an audio stream in a live video stream; a playing module 1120, configured to, in response to a trigger operation for a target subtitle, play a live video stream segment corresponding to the target subtitle in the player, so that a user checks the target subtitle by watching the live video stream segment; and the target caption is a caption in the one or more first captions.
Optionally, the playing module 1120 is specifically configured to: responding to a triggering operation acted on a playing control piece associated with the target subtitle, and playing a live video stream segment corresponding to the target subtitle in the player; and when the target subtitle is in an editing state, displaying the playing control at the associated position of the target subtitle.
Optionally, the playing module 1120 is specifically configured to: and when the target subtitle is in an editing state, responding to the triggering operation of a preset shortcut key to play a live video stream segment corresponding to the target subtitle in the player.
Optionally, the player includes a first player and a second player; the playing module 1120 is configured to play a live video stream segment corresponding to the target subtitle in the first player, and in response to a live video stream playing instruction, play a recorded live video stream in the second player.
Optionally, the apparatus further includes a modification module, configured to modify, in response to a first subtitle modification instruction received during playing of the recorded live video stream, the first subtitle pointed to by that instruction.
Optionally, the apparatus further includes an obtaining module, configured to obtain the live video stream according to the address information of the live video stream before the recorded live video stream is played in the second player, and a recording module, configured to record the live video stream in response to a live broadcast starting instruction.
Optionally, the method further includes: the pushing module is used for responding to a live broadcast delay setting instruction and pushing a live broadcast video stream at least comprising the first caption according to the live broadcast delay set by the live broadcast delay setting instruction; or responding to a live broadcast delay setting instruction, and pushing a subtitle file formed by the first subtitle and the recorded live broadcast video stream according to the live broadcast delay set by the live broadcast delay setting instruction.
Optionally, the player further includes a third player, and the playing module 1120 is configured to play a live video stream including at least the first subtitle in the third player.
Optionally, the language corresponding to the one or more first subtitles is the same as the language corresponding to the audio stream.
Optionally, the method further includes: and the second display module is used for responding to a second caption display instruction, displaying second captions corresponding to the one or more first captions in the user interface respectively, wherein the languages corresponding to the second captions are different from the languages corresponding to the audio streams, and the second captions are displayed in a second area of the user interface in a contextual manner.
Optionally, the apparatus further includes a hiding module, configured to hide the second area from display in the user interface in response to a second subtitle hiding instruction; the modification module is further configured to modify, in response to a second subtitle modification instruction, the second subtitle pointed to by that instruction.
Optionally, the plurality of first subtitles are displayed in a first area of the user interface in context form; any first subtitle of the one or more first subtitles and its corresponding second subtitle are displayed side by side in the user interface.
Optionally, the user interface further includes identification information corresponding to the one or more first subtitles.
Optionally, the apparatus further comprises: and the marking module is used for marking the pushed first caption, the identification information corresponding to the pushed first caption and the second caption corresponding to the pushed first caption in the user interface according to the pushing progress of the live video stream.
According to the subtitle optimization apparatus provided by the embodiments of the disclosure, the plurality of first subtitles are displayed on the user interface in context form, so that the original text proofreader can proofread the first subtitles in combination with the context, improving proofreading efficiency and accuracy. When the proofreader wants to watch the video picture corresponding to a target subtitle, he or she can trigger an operation on that subtitle so that the corresponding video picture is played in the player; any video frame can thus be played back, and the first subtitle proofread in combination with the video.
The apparatus provided in the embodiment of the present disclosure may perform the method steps provided in the embodiment of the method of the present disclosure, and the advantageous effects are not described herein again.
Fig. 12 is a schematic structural diagram of an electronic device in an embodiment of the disclosure. Referring now specifically to fig. 12, a schematic diagram of an electronic device 500 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device 500 in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet), a PMP (portable multimedia player), a vehicle-mounted terminal (e.g., a car navigation terminal), a wearable electronic device, and the like, and fixed terminals such as a digital TV, a desktop computer, a smart home device, and the like. The electronic device shown in fig. 12 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 12, an electronic device 500 may include a processing means (e.g., central processing unit, graphics processor, etc.) 501 that may perform various appropriate actions and processes to implement the … method of embodiments as described in this disclosure, according to a program stored in a Read Only Memory (ROM)502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data necessary for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
Generally, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 507 including, for example, a Liquid Crystal Display (LCD), speakers, vibrators, and the like; storage devices 508 including, for example, magnetic tape, hard disk, etc.; and a communication device 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 12 illustrates an electronic device 500 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated by the flow chart, thereby implementing the method as described above. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or installed from the storage means 508, or installed from the ROM 502. The computer program performs the above-described functions defined in the methods of the embodiments of the present disclosure when executed by the processing device 501.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to:
displaying a user interface, wherein the user interface comprises a player and one or more first subtitles corresponding to an audio stream in a live video stream; responding to a trigger operation aiming at a target subtitle, playing a live video stream segment corresponding to the target subtitle in the player, so that a user proofreads the target subtitle by watching the live video stream segment; and the target caption is a caption in the one or more first captions.
Optionally, when the one or more programs are executed by the electronic device, the electronic device may further perform other steps described in the above embodiments.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. The name of a unit does not, in some cases, constitute a limitation on the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided a subtitle optimization method including: displaying a user interface, wherein the user interface comprises a player and one or more first subtitles corresponding to an audio stream in a live video stream; and in response to a trigger operation for a target subtitle, playing a live video stream segment corresponding to the target subtitle in the player, so that a user proofreads the target subtitle by watching the live video stream segment; wherein the target subtitle is a subtitle among the one or more first subtitles.
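As a rough illustration of the mechanism described above, the lookup from a selected subtitle to the stream segment the player replays might be sketched as follows. The field names (`start_ms`, `end_ms`) and the padding value are assumptions for illustration, not details taken from the disclosure:

```python
# Hypothetical sketch: map a selected subtitle to the stream segment to replay.
# Field names (start_ms / end_ms) and the padding value are assumptions.
from dataclasses import dataclass

@dataclass
class Subtitle:
    index: int
    text: str
    start_ms: int  # time the utterance starts in the live stream
    end_ms: int    # time the utterance ends

def segment_for(subtitle: Subtitle, pad_ms: int = 500) -> tuple[int, int]:
    """Return (seek_position, duration) for the player, with a little
    padding so the proofreader hears the full utterance in context."""
    start = max(0, subtitle.start_ms - pad_ms)
    end = subtitle.end_ms + pad_ms
    return start, end - start

sub = Subtitle(index=3, text="hello world", start_ms=12_000, end_ms=14_200)
print(segment_for(sub))  # (11500, 3200)
```

Padding the segment on both sides is one plausible way to let the proofreader judge the subtitle against its surrounding speech rather than the utterance alone.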
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the playing, in the player, a live video stream segment corresponding to a target subtitle in response to a trigger operation for the target subtitle includes: in response to a trigger operation acting on a play control associated with the target subtitle, playing the live video stream segment corresponding to the target subtitle in the player; wherein, when the target subtitle is in an editing state, the play control is displayed at a position associated with the target subtitle.
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the playing, in the player, a live video stream segment corresponding to a target subtitle in response to a trigger operation for the target subtitle includes: when the target subtitle is in an editing state, playing the live video stream segment corresponding to the target subtitle in the player in response to a trigger operation of a preset shortcut key.
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the player includes a first player and a second player; and the playing of the live video stream segment corresponding to the target subtitle in the player includes: playing the live video stream segment corresponding to the target subtitle in the first player.
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the method further includes: responding to a live video stream playing instruction, and playing the recorded live video stream in the second player; and in the process of playing the recorded live video stream, responding to a first subtitle modification instruction, and modifying the first subtitle pointed by the first subtitle modification instruction.
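A minimal sketch of how such a "first subtitle modification instruction" might be applied to an in-memory subtitle list while the recorded stream plays back. The dictionary fields and the `edited` flag are illustrative assumptions, not part of the disclosure:

```python
# Hypothetical sketch: apply a subtitle modification instruction (index +
# replacement text) to the in-memory subtitle list during playback review.
def modify_subtitle(subtitles: list[dict], index: int, new_text: str) -> None:
    for sub in subtitles:
        if sub["index"] == index:
            sub["text"] = new_text
            sub["edited"] = True  # flag so the UI can highlight the change
            return
    raise KeyError(f"no subtitle with index {index}")

subs = [{"index": 1, "text": "helo"}, {"index": 2, "text": "world"}]
modify_subtitle(subs, 1, "hello")
print(subs[0])  # {'index': 1, 'text': 'hello', 'edited': True}
```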
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, before playing the recorded live video stream in the second player in response to the live video stream playing instruction, the method further includes: acquiring the live video stream according to the address information of the live video stream; and responding to a live broadcast starting instruction, and recording the live broadcast video stream.
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the method further includes: in response to a live broadcast delay setting instruction, pushing a live video stream including at least the first subtitle according to the live broadcast delay set by the instruction; or, in response to a live broadcast delay setting instruction, pushing a subtitle file composed of the first subtitle together with the recorded live video stream according to the live broadcast delay set by the instruction.
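The delay-based pushing described above can be sketched as a simple hold-back queue: each segment waits out the configured delay before it is released, leaving a window in which its subtitles can still be corrected. The class name, timestamp units (seconds), and queue design are assumptions for illustration:

```python
# Hypothetical sketch of delay-based pushing: segments are held for a
# configurable delay before release, so subtitles remain editable in
# that window. Timestamps are in seconds.
import collections

class DelayedPusher:
    def __init__(self, delay_s: float):
        self.delay_s = delay_s
        self.queue = collections.deque()  # (ready_time, segment), in arrival order

    def submit(self, now: float, segment: str) -> None:
        self.queue.append((now + self.delay_s, segment))

    def due(self, now: float) -> list[str]:
        """Pop every segment whose delay window has elapsed."""
        out = []
        while self.queue and self.queue[0][0] <= now:
            out.append(self.queue.popleft()[1])
        return out

p = DelayedPusher(delay_s=30.0)
p.submit(0.0, "seg-1")
p.submit(5.0, "seg-2")
print(p.due(29.0))  # [] - nothing has waited 30 s yet
print(p.due(31.0))  # ['seg-1']
print(p.due(40.0))  # ['seg-2']
```

Because segments arrive in stream order and share one fixed delay, a FIFO deque is sufficient; no priority queue is needed.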
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the player further includes a third player, and the method further includes: and playing a live video stream at least comprising the first subtitle in the third player.
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the language corresponding to the one or more first subtitles is the same as the language corresponding to the audio stream; and the method further includes: in response to a second subtitle display instruction, displaying, in the user interface, second subtitles respectively corresponding to the one or more first subtitles, wherein the language corresponding to the second subtitles is different from the language corresponding to the audio stream, and the second subtitles are displayed in a contextual manner in a second area of the user interface.
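One way such a correspondence between first (source-language) and second (translated) subtitles might be assembled into display rows is sketched below. The list-based pairing and the empty-cell padding are assumptions for illustration, not details from the disclosure:

```python
# Hypothetical sketch: pair each first subtitle with its second (translated)
# subtitle for side-by-side display rows; pad the shorter list so every row
# has both columns even when translation lags behind transcription.
def build_rows(first_subs: list[str], second_subs: list[str]) -> list[tuple[str, str]]:
    rows = []
    for i in range(max(len(first_subs), len(second_subs))):
        left = first_subs[i] if i < len(first_subs) else ""
        right = second_subs[i] if i < len(second_subs) else ""
        rows.append((left, right))
    return rows

print(build_rows(["你好", "世界"], ["hello"]))  # [('你好', 'hello'), ('世界', '')]
```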
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the method further includes: in response to a second subtitle hiding instruction, hiding the second area in the user interface; and in response to a second subtitle modification instruction, modifying the second subtitle indicated by the second subtitle modification instruction.
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the plurality of first subtitles are displayed in a contextual manner in a first area of the user interface; and any first subtitle among the one or more first subtitles and the second subtitle corresponding to that first subtitle are presented side by side in a horizontal comparison in the user interface.
According to one or more embodiments of the present disclosure, in the method provided by the present disclosure, optionally, the user interface further includes identification information corresponding to the one or more first subtitles, respectively; the method further comprises the following steps: and marking the pushed first subtitle, the identification information corresponding to the pushed first subtitle and the second subtitle corresponding to the pushed first subtitle in the user interface according to the pushing progress of the live video stream.
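Marking subtitles according to the push progress can be illustrated as a simple timestamp comparison: a subtitle counts as pushed once the stream's push position has passed its end time. The field names and millisecond units are assumptions for illustration:

```python
# Hypothetical sketch: mark every subtitle whose segment has already been
# pushed, given the current push progress of the live stream (in ms).
def mark_pushed(subtitles: list[dict], progress_ms: int) -> list[dict]:
    for sub in subtitles:
        sub["pushed"] = sub["end_ms"] <= progress_ms
    return subtitles

subs = [
    {"id": "s1", "end_ms": 4_000},
    {"id": "s2", "end_ms": 9_500},
]
marked = mark_pushed(subs, progress_ms=5_000)
print([s["id"] for s in marked if s["pushed"]])  # ['s1']
```

The same flag could then drive the marking of the subtitle's identification information and its corresponding second subtitle in the user interface.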
According to one or more embodiments of the present disclosure, there is provided a subtitle optimizing apparatus including: a first display module configured to display a user interface, the user interface comprising a player and one or more first subtitles corresponding to an audio stream in a live video stream; and a playing module configured to, in response to a trigger operation for a target subtitle, play a live video stream segment corresponding to the target subtitle in the player, so that a user proofreads the target subtitle by watching the live video stream segment; wherein the target subtitle is one of the one or more first subtitles.
According to one or more embodiments of the present disclosure, in the subtitle optimizing apparatus provided by the present disclosure, optionally, the playing module is specifically configured to: in response to a trigger operation acting on a play control associated with the target subtitle, play the live video stream segment corresponding to the target subtitle in the player; wherein, when the target subtitle is in an editing state, the play control is displayed at a position associated with the target subtitle.
According to one or more embodiments of the present disclosure, in the subtitle optimizing apparatus provided by the present disclosure, optionally, the playing module is specifically configured to: and when the target subtitle is in an editing state, responding to the triggering operation of a preset shortcut key to play a live video stream segment corresponding to the target subtitle in the player.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, the player includes a first player and a second player; and the playing module is configured to play the live video stream segment corresponding to the target subtitle in the first player, and to play a recorded live video stream in the second player in response to a live video stream playing instruction.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, a modification module is further included, configured to, in a process of playing the recorded live video stream, respond to a first subtitle modification instruction, and modify a first subtitle pointed by the first subtitle modification instruction.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, the apparatus further includes: an obtaining module configured to obtain the live video stream according to address information of the live video stream before the recorded live video stream is played in the second player; and a recording module configured to record the live video stream in response to a live broadcast start instruction.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, the apparatus further includes: the pushing module is used for responding to a live broadcast delay setting instruction and pushing a live broadcast video stream at least comprising the first caption according to the live broadcast delay set by the live broadcast delay setting instruction; or responding to a live broadcast delay setting instruction, and pushing a subtitle file formed by the first subtitle and the recorded live broadcast video stream according to the live broadcast delay set by the live broadcast delay setting instruction.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, the player further includes a third player, and the playing module is configured to play a live video stream including at least the first subtitle in the third player.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, the language corresponding to the one or more first subtitles is the same as the language corresponding to the audio stream.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, the apparatus further includes: and the second display module is used for responding to a second caption display instruction, displaying second captions corresponding to the one or more first captions in the user interface respectively, wherein the languages corresponding to the second captions are different from the languages corresponding to the audio streams, and the second captions are displayed in a second area of the user interface in a contextual manner.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, the apparatus further includes a hiding module configured to hide the second area in the user interface in response to a second subtitle hiding instruction; and the modification module is further configured to modify, in response to a second subtitle modification instruction, the second subtitle indicated by the second subtitle modification instruction.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, the plurality of first subtitles are displayed in a contextual manner in a first area of the user interface; and any first subtitle among the one or more first subtitles and the second subtitle corresponding to that first subtitle are presented side by side in a horizontal comparison in the user interface.
According to one or more embodiments of the present disclosure, in the subtitle optimization apparatus provided by the present disclosure, optionally, the user interface further includes identification information corresponding to each of the one or more first subtitles.
According to one or more embodiments of the present disclosure, in the subtitle optimizing apparatus provided by the present disclosure, optionally, the apparatus further includes: and the marking module is used for marking the pushed first caption, the identification information corresponding to the pushed first caption and the second caption corresponding to the pushed first caption in the user interface according to the pushing progress of the live video stream.
In accordance with one or more embodiments of the present disclosure, there is provided an electronic device including:
one or more processors;
a memory for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement any of the methods provided by the present disclosure.
According to one or more embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the methods provided by the present disclosure.
Embodiments of the present disclosure also provide a computer program product comprising a computer program or instructions which, when executed by a processor, implement the method as described above.
The foregoing description is merely an explanation of the preferred embodiments of the present disclosure and of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure is not limited to technical solutions formed by the particular combinations of features described above, and also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, a technical solution formed by replacing the above features with (but not limited to) features having similar functions disclosed in this disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (15)

1. A subtitle optimization method, characterized in that the method comprises: displaying a user interface, the user interface comprising a player and one or more first subtitles corresponding to an audio stream in a live video stream; and in response to a trigger operation for a target subtitle, playing a live video stream segment corresponding to the target subtitle in the player, so that a user proofreads the target subtitle by watching the live video stream segment; wherein the target subtitle is a subtitle among the one or more first subtitles.

2. The method according to claim 1, wherein the playing, in the player, a live video stream segment corresponding to the target subtitle in response to a trigger operation for the target subtitle comprises: in response to a trigger operation acting on a play control associated with the target subtitle, playing the live video stream segment corresponding to the target subtitle in the player; wherein, when the target subtitle is in an editing state, the play control is displayed at a position associated with the target subtitle.

3. The method according to claim 1, wherein the playing, in the player, a live video stream segment corresponding to the target subtitle in response to a trigger operation for the target subtitle comprises: when the target subtitle is in an editing state, playing the live video stream segment corresponding to the target subtitle in the player in response to a trigger operation of a preset shortcut key.

4. The method according to claim 1, wherein the player comprises a first player and a second player; and the playing the live video stream segment corresponding to the target subtitle in the player comprises: playing the live video stream segment corresponding to the target subtitle in the first player.

5. The method according to claim 4, further comprising: in response to a live video stream playing instruction, playing a recorded live video stream in the second player; and during playback of the recorded live video stream, in response to a first subtitle modification instruction, modifying the first subtitle indicated by the first subtitle modification instruction.

6. The method according to claim 5, wherein before the playing, in response to the live video stream playing instruction, the recorded live video stream in the second player, the method further comprises: acquiring the live video stream according to address information of the live video stream; and recording the live video stream in response to a live broadcast start instruction.

7. The method according to claim 6, further comprising: in response to a live broadcast delay setting instruction, pushing a live video stream comprising at least the first subtitle according to the live broadcast delay set by the instruction; or, in response to a live broadcast delay setting instruction, pushing a subtitle file composed of the first subtitle together with the recorded live video stream according to the live broadcast delay set by the instruction.

8. The method according to claim 7, wherein the player further comprises a third player, and the method further comprises: playing, in the third player, a live video stream comprising at least the first subtitle.

9. The method according to any one of claims 1-8, wherein the language corresponding to the one or more first subtitles is the same as the language corresponding to the audio stream; and the method further comprises: in response to a second subtitle display instruction, displaying, in the user interface, second subtitles respectively corresponding to the one or more first subtitles, wherein the language corresponding to the second subtitles is different from the language corresponding to the audio stream, and the second subtitles are displayed in a contextual manner in a second area of the user interface.

10. The method according to claim 9, further comprising: in response to a second subtitle hiding instruction, hiding the second area in the user interface; and in response to a second subtitle modification instruction, modifying the second subtitle indicated by the second subtitle modification instruction.

11. The method according to claim 9, wherein the plurality of first subtitles are displayed in a contextual manner in a first area of the user interface; and any first subtitle among the one or more first subtitles and the second subtitle corresponding to that first subtitle are presented side by side in a horizontal comparison in the user interface.

12. The method according to claim 9, wherein the user interface further comprises identification information respectively corresponding to the one or more first subtitles; and the method further comprises: according to the push progress of the live video stream, marking, in the user interface, the pushed first subtitle, the identification information corresponding to the pushed first subtitle, and the second subtitle corresponding to the pushed first subtitle.

13. A subtitle optimization apparatus, comprising: a first display module configured to display a user interface, the user interface comprising a player and one or more first subtitles corresponding to an audio stream in a live video stream; and a playing module configured to, in response to a trigger operation for a target subtitle, play a live video stream segment corresponding to the target subtitle in the player, so that a user proofreads the target subtitle by watching the live video stream segment; wherein the target subtitle is one subtitle among the one or more first subtitles.

14. An electronic device, comprising: one or more processors; and a storage device configured to store one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-12.

15. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-12.
CN202111214119.5A 2021-10-19 2021-10-19 Subtitle optimization method, device, electronic device and storage medium Pending CN113891108A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111214119.5A CN113891108A (en) 2021-10-19 2021-10-19 Subtitle optimization method, device, electronic device and storage medium


Publications (1)

Publication Number Publication Date
CN113891108A true CN113891108A (en) 2022-01-04

Family

ID=79003332


Country Status (1)

Country Link
CN (1) CN113891108A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114760518A (en) * 2022-04-19 2022-07-15 高途教育科技集团有限公司 Video subtitle processing method and device, electronic equipment and readable storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108259971A (en) * 2018-01-31 2018-07-06 百度在线网络技术(北京)有限公司 Subtitle adding method, device, server and storage medium
CN108401192A (en) * 2018-04-25 2018-08-14 腾讯科技(深圳)有限公司 Video stream processing method, device, computer equipment and storage medium
CN108600773A (en) * 2018-04-25 2018-09-28 腾讯科技(深圳)有限公司 Caption data method for pushing, subtitle methods of exhibiting, device, equipment and medium
CN111901615A (en) * 2020-06-28 2020-11-06 北京百度网讯科技有限公司 Live video playing method and device
CN111970577A (en) * 2020-08-25 2020-11-20 北京字节跳动网络技术有限公司 Subtitle editing method and device and electronic equipment
US20210014575A1 (en) * 2017-12-20 2021-01-14 Flickray, Inc. Event-driven streaming media interactivity
CN112601102A (en) * 2020-12-11 2021-04-02 北京有竹居网络技术有限公司 Method and device for determining simultaneous interpretation of subtitles, electronic equipment and storage medium
CN112601101A (en) * 2020-12-11 2021-04-02 北京有竹居网络技术有限公司 Subtitle display method and device, electronic equipment and storage medium
CN112616062A (en) * 2020-12-11 2021-04-06 北京有竹居网络技术有限公司 Subtitle display method and device, electronic equipment and storage medium
EP3886444A1 (en) * 2018-11-27 2021-09-29 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Video processing method and apparatus, and electronic device and computer-readable medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220104