
CN109743589B - Article generation method and device - Google Patents


Info

Publication number
CN109743589B
CN109743589B (application CN201811600339.XA; earlier publication CN109743589A)
Authority
CN
China
Prior art keywords
paragraph
sentence
adjacent
threshold value
sentences
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811600339.XA
Other languages
Chinese (zh)
Other versions
CN109743589A (en)
Inventor
陈杰
张玉东
杨宏生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Original Assignee
Baidu Online Network Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu Online Network Technology Beijing Co Ltd filed Critical Baidu Online Network Technology Beijing Co Ltd
Priority to CN201811600339.XA priority Critical patent/CN109743589B/en
Publication of CN109743589A publication Critical patent/CN109743589A/en
Application granted granted Critical
Publication of CN109743589B publication Critical patent/CN109743589B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • User Interface Of Digital Computer (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an article generation method and device, wherein the method comprises the following steps: acquiring a video and the corresponding voice, and recognizing the voice to obtain each sentence; acquiring feature information of each sentence, and dividing the sentences into paragraphs according to the feature information to obtain a paragraph sequence; for each paragraph in the paragraph sequence, obtaining a key sentence in the paragraph; acquiring the time period corresponding to the key sentence, and selecting a key video frame from the video segment corresponding to that time period as the picture corresponding to the paragraph; and generating an article according to each paragraph in the paragraph sequence and its corresponding picture. Because the article comprises each paragraph together with its corresponding picture, the video content can be effectively embodied, a user can easily select a video to watch, and video playing efficiency is improved.

Description

Article generation method and device
Technical Field
The invention relates to the technical field of video processing, in particular to an article generation method and device.
Background
At present, before a video is published, the video is analyzed and one frame is selected as its thumbnail, so that after the video is published a user can learn about the video content from the thumbnail and decide whether to watch the video. However, a thumbnail displays little content and can hardly embody the video content effectively. As a result, it is difficult for users to select the videos they want to watch, and a user may quit midway after selecting a video he or she does not actually want to watch, which reduces video playing efficiency.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, a first object of the present invention is to provide an article generating method, which is used to solve the problems in the prior art that video thumbnails are difficult to effectively represent video content and video playing efficiency is poor.
A second object of the present invention is to provide an article generation apparatus.
A third object of the present invention is to propose another article generation apparatus.
A fourth object of the invention is to propose a non-transitory computer-readable storage medium.
A fifth object of the invention is to propose a computer program product.
In order to achieve the above object, an embodiment of a first aspect of the present invention provides an article generating method, including:
acquiring a video and corresponding voice, and identifying the voice to obtain each sentence;
acquiring feature information of each sentence, and performing paragraph division on each sentence according to the feature information to obtain a paragraph sequence;
for each paragraph in the paragraph sequence, obtaining a key sentence in the paragraph;
acquiring a time period corresponding to the key sentence, and selecting a key video frame from a video segment corresponding to the time period in the video as a picture corresponding to the paragraph;
and generating an article according to each paragraph in the paragraph sequence and the corresponding picture.
Further, the recognizing the speech to obtain each sentence includes:
recognizing the voice to obtain each word and a timestamp corresponding to each word;
aiming at any two adjacent words, calculating the time difference value of the two adjacent words according to the timestamps corresponding to the two adjacent words;
judging whether the time difference is greater than or equal to a first difference threshold value or not;
if the time difference is smaller than a first difference threshold value, dividing the two adjacent words into the same sentence;
and if the time difference is greater than or equal to a first difference threshold value, dividing the two adjacent words into different sentences to obtain each sentence.
Further, the feature information includes: a middle timestamp corresponding to the sentence, and whether the sentence contains a connective word;
the paragraph segmentation is performed on each sentence according to the feature information to obtain a paragraph sequence, and the paragraph sequence comprises:
aiming at any two adjacent sentences, calculating the time difference value of the two adjacent sentences according to the middle time stamps corresponding to the two adjacent sentences;
judging whether the time difference is greater than or equal to a second difference threshold value or not, and whether a later sentence in the two adjacent sentences has a connecting word or not;
if the time difference is smaller than a second difference threshold value or a later sentence in the two adjacent sentences has a connecting word, dividing the two adjacent sentences into the same paragraph;
and if the time difference is greater than or equal to a second difference threshold value and the later sentence in the two adjacent sentences has no connecting word, dividing the two adjacent sentences into different paragraphs to obtain a paragraph sequence.
Further, the second difference threshold is determined by,
generating a time difference set according to the time difference of any two adjacent sentences;
calculating and determining the standard deviation of the time difference value set according to the time difference value set;
and determining the product of the standard deviation and a preset coefficient as the second difference threshold value.
Further, after obtaining the feature information of each sentence and performing paragraph segmentation on each sentence according to the feature information to obtain a paragraph sequence, the method further includes:
for each paragraph in the paragraph sequence, obtaining a word count for the paragraph;
judging whether the word number of the paragraph is smaller than a preset word number threshold value or not;
and if the number of words of the paragraph is less than the preset number of words threshold, combining the paragraph with the next adjacent paragraph until the number of words of the combined paragraph is greater than or equal to the preset number of words threshold.
Further, the obtaining, for each paragraph in the paragraph sequence, a key sentence in the paragraph includes:
acquiring a title of the video;
inputting all sentences and the titles in the paragraph sequence into a preset keyword model to obtain each keyword and corresponding weight, and generating a keyword set;
for each paragraph in the paragraph sequence, querying a keyword set according to each sentence in the paragraph to obtain keywords included in each sentence;
determining the weight of each sentence according to the keywords contained in each sentence and the weight corresponding to the keywords;
and determining the sentence with the maximum weight in the paragraph as a key sentence in the paragraph.
Further, the obtaining of the time period corresponding to the key sentence includes:
acquiring a middle timestamp corresponding to the key sentence;
determining a time period corresponding to the key sentence according to the middle timestamp corresponding to the key sentence and a preset threshold; the starting time point of the time period is the difference value between the middle timestamp and the preset threshold value, and the ending time point of the time period is the sum value between the middle timestamp and the preset threshold value.
According to the article generation method of the embodiment of the invention, a video and the corresponding voice are acquired, and the voice is recognized to obtain each sentence; feature information of each sentence is acquired, and the sentences are divided into paragraphs according to the feature information to obtain a paragraph sequence; for each paragraph in the paragraph sequence, a key sentence in the paragraph is obtained; the time period corresponding to the key sentence is acquired, and a key video frame is selected from the video segment corresponding to that time period as the picture corresponding to the paragraph; and an article is generated according to each paragraph in the paragraph sequence and its corresponding picture. Because the article comprises each paragraph together with its corresponding picture, the video content can be effectively embodied, a user can easily select a video to watch, and video playing efficiency is improved.
To achieve the above object, a second embodiment of the present invention provides an article generation apparatus, including:
the acquisition module is used for acquiring videos and corresponding voices and identifying the voices to obtain sentences;
the segmentation module is used for acquiring the characteristic information of each sentence and segmenting each sentence according to the characteristic information to obtain a paragraph sequence;
the obtaining module is further configured to obtain, for each paragraph in the paragraph sequence, a key sentence in the paragraph;
a selection module, configured to obtain a time period corresponding to the key sentence, and select a key video frame from a video segment corresponding to the time period in the video as a picture corresponding to the paragraph;
and the generation module is used for generating an article according to each paragraph in the paragraph sequence and the corresponding picture.
Further, the obtaining module is specifically configured to,
recognizing the voice to obtain each word and a timestamp corresponding to each word;
aiming at any two adjacent words, calculating the time difference value of the two adjacent words according to the timestamps corresponding to the two adjacent words;
judging whether the time difference is greater than or equal to a first difference threshold value or not;
if the time difference is smaller than a first difference threshold value, dividing the two adjacent words into the same sentence;
and if the time difference is greater than or equal to a first difference threshold value, dividing the two adjacent words into different sentences to obtain each sentence.
Further, the feature information includes: a middle timestamp corresponding to the sentence, and whether the sentence contains a connective word;
the dividing module is specifically configured to,
aiming at any two adjacent sentences, calculating the time difference value of the two adjacent sentences according to the middle time stamps corresponding to the two adjacent sentences;
judging whether the time difference is greater than or equal to a second difference threshold value or not, and whether a later sentence in the two adjacent sentences has a connecting word or not;
if the time difference is smaller than a second difference threshold value or a later sentence in the two adjacent sentences has a connecting word, dividing the two adjacent sentences into the same paragraph;
and if the time difference is greater than or equal to a second difference threshold value and the later sentence in the two adjacent sentences has no connecting word, dividing the two adjacent sentences into different paragraphs to obtain a paragraph sequence.
Further, the second difference threshold is determined by,
generating a time difference set according to the time difference of any two adjacent sentences;
calculating and determining the standard deviation of the time difference value set according to the time difference value set;
and determining the product of the standard deviation and a preset coefficient as the second difference threshold value.
Further, the device further comprises: a judging module and a merging module;
the obtaining module is further configured to obtain, for each paragraph in the paragraph sequence, a word count of the paragraph;
the judging module is used for judging whether the word number of the paragraph is smaller than a preset word number threshold value;
and the merging module is used for merging the paragraph with a next adjacent paragraph when the number of words of the paragraph is less than a preset word number threshold value until the number of words of the merged paragraph is more than or equal to the preset word number threshold value.
Further, the obtaining module is specifically configured to,
acquiring a title of the video;
inputting all sentences and the titles in the paragraph sequence into a preset keyword model to obtain each keyword and corresponding weight, and generating a keyword set;
for each paragraph in the paragraph sequence, querying a keyword set according to each sentence in the paragraph to obtain keywords included in each sentence;
determining the weight of each sentence according to the keywords contained in each sentence and the weight corresponding to the keywords;
and determining the sentence with the maximum weight in the paragraph as a key sentence in the paragraph.
Further, the selection module is specifically configured to,
acquiring a middle timestamp corresponding to the key sentence;
determining a time period corresponding to the key sentence according to the middle timestamp corresponding to the key sentence and a preset threshold; the starting time point of the time period is the difference value between the middle timestamp and the preset threshold value, and the ending time point of the time period is the sum value between the middle timestamp and the preset threshold value.
The article generation device of the embodiment of the invention acquires a video and the corresponding voice, and recognizes the voice to obtain each sentence; acquires feature information of each sentence, and divides the sentences into paragraphs according to the feature information to obtain a paragraph sequence; for each paragraph in the paragraph sequence, obtains a key sentence in the paragraph; acquires the time period corresponding to the key sentence, and selects a key video frame from the video segment corresponding to that time period as the picture corresponding to the paragraph; and generates an article according to each paragraph in the paragraph sequence and its corresponding picture. Because the article comprises each paragraph together with its corresponding picture, the video content can be effectively embodied, a user can easily select a video to watch, and video playing efficiency is improved.
To achieve the above object, a third embodiment of the present invention provides another article generation apparatus, including: a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the article generation method as described above when executing the program.
In order to achieve the above object, a fourth aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the article generation method as described above.
In order to achieve the above object, a fifth aspect of the present invention provides a computer program product, wherein when instructions in the computer program product are executed by a processor, the article generation method as described above is implemented.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flow chart of an article generation method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an article generating apparatus according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of another article generating apparatus according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of another article generating apparatus according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
An article generation method and apparatus according to an embodiment of the present invention are described below with reference to the drawings.
Fig. 1 is a schematic flow chart of an article generation method according to an embodiment of the present invention. As shown in fig. 1, the article generation method includes the following steps:
s101, acquiring videos and corresponding voices, and identifying the voices to obtain sentences.
The execution subject of the article generation method provided by the invention is an article generation device, which may be a hardware device such as a terminal device or a server, or software installed on a hardware device. In this embodiment, the video may be, for example, a video to be published.
In this embodiment, because human speech contains pauses, especially between sentences, each sentence in the voice corresponding to the video can be determined from the timestamp corresponding to each word. Specifically, the article generation apparatus may recognize the voice to obtain each word and its corresponding timestamp; for any two adjacent words, calculate the time difference of the two adjacent words according to their corresponding timestamps; judge whether the time difference is greater than or equal to a first difference threshold; if the time difference is smaller than the first difference threshold, divide the two adjacent words into the same sentence; and if the time difference is greater than or equal to the first difference threshold, divide the two adjacent words into different sentences, thereby obtaining each sentence. The timestamp may be a start timestamp, a middle timestamp, or an end timestamp of the word. The first difference threshold may be, for example, 0.2 seconds or 0.3 seconds.
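As a rough illustration only (not the patented implementation), the pause-based sentence splitting described above can be sketched in Python; the word/timestamp tuple layout and the threshold value are assumptions:

```python
def split_sentences(words, first_threshold=0.3):
    """Split recognized words into sentences at long pauses.

    words: list of (word, timestamp) pairs ordered by time;
    first_threshold: hypothetical first difference threshold in seconds.
    """
    if not words:
        return []
    sentences = [[words[0][0]]]
    for (_, prev_ts), (word, ts) in zip(words, words[1:]):
        if ts - prev_ts >= first_threshold:
            sentences.append([word])    # pause long enough: start a new sentence
        else:
            sentences[-1].append(word)  # short gap: same sentence
    return [" ".join(s) for s in sentences]
```

Here a single per-word timestamp is used; the description notes it could equally be the word's start, middle, or end timestamp.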
S102, obtaining characteristic information of each sentence, and carrying out paragraph division on each sentence according to the characteristic information to obtain a paragraph sequence.
In this embodiment, the feature information may include: the corresponding middle time stamp of the sentence, whether the sentence has a connective word, etc. Where, conjunctive words such as "and", "but", etc. Correspondingly, the process of the article generating device executing step 102 may specifically be that, for any two adjacent sentences, a time difference value of the two adjacent sentences is calculated according to the intermediate time stamps corresponding to the two adjacent sentences; judging whether the time difference is greater than or equal to a second difference threshold value and whether a later sentence in two adjacent sentences has a connecting word; if the time difference is smaller than a second difference threshold value or the later sentence in the two adjacent sentences has a connecting word, dividing the two adjacent sentences into the same paragraph; and if the time difference is greater than or equal to the second difference threshold value and the later sentence in the two adjacent sentences has no connecting word, dividing the two adjacent sentences into different paragraphs to obtain a paragraph sequence.
The second difference threshold may be determined by generating a time difference set according to a time difference between any two adjacent sentences; calculating and determining the standard deviation of the time difference value set according to the time difference value set; and determining the product of the standard deviation and a preset coefficient as a second difference threshold value. Wherein the second difference threshold is greater than the first difference threshold. The preset coefficient may be N, and the value of N may be 2, for example.
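A minimal sketch of the paragraph division just described, including the standard-deviation-based second difference threshold; the connective list, the tokenization, and the coefficient N=2 are illustrative assumptions:

```python
import statistics

CONNECTIVES = {"and", "but", "so", "however", "moreover"}  # illustrative list

def split_paragraphs(sentences, coeff=2.0):
    """Divide sentences into paragraphs by time gap and connectives.

    sentences: list of (text, middle_timestamp) pairs.
    The second difference threshold is coeff (the preset coefficient N)
    times the standard deviation of the set of adjacent time differences.
    """
    if len(sentences) < 2:
        return [[text for text, _ in sentences]] if sentences else []
    gaps = [b[1] - a[1] for a, b in zip(sentences, sentences[1:])]
    threshold = coeff * statistics.pstdev(gaps)
    paragraphs = [[sentences[0][0]]]
    for (text, _), gap in zip(sentences[1:], gaps):
        has_connective = text.split()[0].lower() in CONNECTIVES
        if gap >= threshold and not has_connective:
            paragraphs.append([text])    # large gap, no connective: new paragraph
        else:
            paragraphs[-1].append(text)  # same paragraph
    return paragraphs
```

Note how a sentence beginning with a connective is kept in the current paragraph even when the time gap exceeds the threshold, matching the OR condition in the description.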
Further, on the basis of the above embodiment, since the paragraph generally has a certain number of words, for more accurately dividing the paragraph, after step 102, the above method may further include the following steps: for each paragraph in the paragraph sequence, obtaining a word number of the paragraph; judging whether the number of words of the paragraph is smaller than a preset word number threshold value or not; if the number of words of the paragraph is less than the preset number of words threshold, the paragraph is merged with the next adjacent paragraph until the number of words of the merged paragraph is greater than or equal to the preset number of words threshold.
For example, if the number of words of the first paragraph is smaller than the preset number of words threshold, the first paragraph and the second paragraph are merged into one paragraph; and judging whether the number of words of the combined paragraph is less than a preset number threshold, if so, combining the combined paragraph and a third paragraph to obtain a combined paragraph. At this time, if the number of words of the recombined paragraph is greater than or equal to the preset number of words threshold, the operation on the recombined paragraph is stopped; and then acquiring a fourth paragraph, and judging whether the word number of the fourth paragraph is less than a preset word number threshold value.
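The merging procedure in the example above might be sketched as follows; the word-number threshold and the handling of a short final paragraph are assumptions not specified in the description:

```python
def merge_short_paragraphs(paragraphs, min_words=5):
    """Merge each too-short paragraph with the next until it reaches min_words.

    paragraphs: list of paragraph strings;
    min_words: hypothetical preset word-number threshold.
    """
    merged, buf = [], ""
    for para in paragraphs:
        buf = f"{buf} {para}".strip()
        if len(buf.split()) >= min_words:  # long enough: emit and reset
            merged.append(buf)
            buf = ""
    if buf:  # trailing paragraph still short: keep as-is (assumed policy)
        merged.append(buf)
    return merged
```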
S103, acquiring a key sentence in the paragraph aiming at each paragraph in the paragraph sequence.
In this embodiment, the key sentence in a paragraph is the sentence that best embodies the central idea of the paragraph. In step 103, the article generation apparatus may acquire the title of the video; input all sentences in the paragraph sequence together with the title into a preset keyword model to obtain each keyword and its corresponding weight, generating a keyword set; for each paragraph in the paragraph sequence, query the keyword set according to each sentence in the paragraph to obtain the keywords included in each sentence; determine the weight of each sentence according to the keywords it contains and their corresponding weights; and determine the sentence with the greatest weight in the paragraph as the key sentence of the paragraph.
The keywords may be words with a large number of occurrences in all sentences or words that embody the central ideas of all paragraphs. The keyword model can be a neural network model and the like, and the keyword model can be trained according to the training text and a keyword set corresponding to the training text.
In this embodiment, the process of determining the weight of each sentence according to the keywords included in each sentence and the weights corresponding to the keywords may specifically be that, for each sentence, the keywords included in the sentence, the occurrence times of the keywords, and the weights corresponding to the keywords are obtained; and calculating the product of the occurrence times and the weight of each keyword to obtain a numerical value, and adding the numerical values of the included keywords to obtain the weight of the sentence.
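The weighting computation can be sketched as below; the whitespace tokenization and the keyword weights are illustrative, and a real system would obtain the weights from the trained keyword model mentioned above:

```python
def key_sentence(paragraph, keyword_weights):
    """Return the sentence with the greatest keyword weight.

    paragraph: list of sentence strings;
    keyword_weights: {keyword: weight} mapping from the keyword model.
    A sentence's weight is the sum, over the keywords it contains,
    of (occurrence count of the keyword * keyword weight).
    """
    def sentence_weight(sentence):
        tokens = sentence.lower().split()
        return sum(tokens.count(kw) * w for kw, w in keyword_weights.items())
    return max(paragraph, key=sentence_weight)
```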
And S104, acquiring a time period corresponding to the key sentence, and selecting a key video frame from a video segment corresponding to the time period in the video as a picture corresponding to the paragraph.
In this embodiment, the process of the article generation apparatus acquiring the time period corresponding to the key sentence may specifically be to acquire an intermediate timestamp corresponding to the key sentence; determining a time period corresponding to the key sentence according to the middle timestamp corresponding to the key sentence and a preset threshold; the starting time point of the time period is the difference value between the middle timestamp and the preset threshold value, and the ending time point of the time period is the sum value between the middle timestamp and the preset threshold value.
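The time-period computation above reduces to a symmetric window around the key sentence's middle timestamp; the half-width value (the preset threshold) is an assumed example:

```python
def key_time_period(mid_timestamp, preset_threshold=2.0):
    """Start = middle timestamp minus the preset threshold;
    end = middle timestamp plus the preset threshold (both in seconds)."""
    return (mid_timestamp - preset_threshold, mid_timestamp + preset_threshold)
```

The key video frame is then picked from the video segment that this (start, end) window delimits.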
Alternatively, the time period corresponding to the key sentence may lie within the time period determined by the start time point and the end time point of the key sentence. In this embodiment, the article generation apparatus may select the most complete video frame from the video segment as the key video frame.
And S105, generating an article according to each paragraph in the paragraph sequence and the corresponding picture.
In this embodiment, taking an example that the paragraph sequence includes 3 paragraphs, the article may include content of a first paragraph, a picture corresponding to the first paragraph, content of a second paragraph, a picture corresponding to the second paragraph, content of a third paragraph, and a picture corresponding to the third paragraph.
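As a trivial sketch of the assembly in step S105 (function and argument names are assumed), paragraphs and their pictures are simply interleaved in sequence:

```python
def assemble_article(paragraphs, pictures):
    """Interleave each paragraph's text with its corresponding key-frame picture."""
    article = []
    for text, pic in zip(paragraphs, pictures):
        article.append(text)  # paragraph content
        article.append(pic)   # picture selected for that paragraph
    return article
```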
In this embodiment, after the article is generated, the article generation apparatus may generate a link address corresponding to the article. When the video is published, the link address corresponding to the article is displayed on the publishing page of the video, so that a user can browse the article through the link address before watching the video, determine whether the video is the video which the user wants to watch according to the article, and further determine whether to watch the video and the like.
Further, the article generation means may generate a link address corresponding to the video after the article is generated. The link address corresponding to the video is displayed on the page where the article is located, so that a user can directly click the link address to watch the video after watching the article under the condition that the content of the article is interested.
According to the article generation method of the embodiment of the invention, a video and the corresponding voice are acquired, and the voice is recognized to obtain each sentence; feature information of each sentence is acquired, and the sentences are divided into paragraphs according to the feature information to obtain a paragraph sequence; for each paragraph in the paragraph sequence, a key sentence in the paragraph is obtained; the time period corresponding to the key sentence is acquired, and a key video frame is selected from the video segment corresponding to that time period as the picture corresponding to the paragraph; and an article is generated according to each paragraph in the paragraph sequence and its corresponding picture. Because the article comprises each paragraph together with its corresponding picture, the video content can be effectively embodied, a user can easily select a video to watch, and video playing efficiency is improved.
Fig. 2 is a schematic structural diagram of an article generating apparatus according to an embodiment of the present invention. As shown in fig. 2, includes: an acquisition module 21, a dividing module 22, a selection module 23 and a generation module 24.
The acquiring module 21 is configured to acquire a video and a corresponding voice, and recognize the voice to obtain each sentence;
the dividing module 22 is configured to obtain feature information of each sentence, and perform paragraph division on each sentence according to the feature information to obtain a paragraph sequence;
the obtaining module 21 is further configured to obtain, for each paragraph in the paragraph sequence, a key sentence in the paragraph;
a selecting module 23, configured to obtain a time period corresponding to the key sentence, and select a key video frame from a video segment corresponding to the time period in the video as a picture corresponding to the paragraph;
and the generating module 24 is configured to generate an article according to each paragraph in the paragraph sequence and the corresponding picture.
The article generation device provided by the invention can be hardware equipment such as terminal equipment and a server, or software installed on the hardware equipment. In this embodiment, the video may be, for example, a video to be published.
In this embodiment, because human speech contains pauses, especially between sentences, each sentence in the voice corresponding to the video can be determined from the timestamp corresponding to each word. Correspondingly, the obtaining module 21 may be specifically configured to recognize the voice to obtain each word and its corresponding timestamp; for any two adjacent words, calculate the time difference of the two adjacent words according to their corresponding timestamps; judge whether the time difference is greater than or equal to a first difference threshold; if the time difference is smaller than the first difference threshold, divide the two adjacent words into the same sentence; and if the time difference is greater than or equal to the first difference threshold, divide the two adjacent words into different sentences, thereby obtaining each sentence. The timestamp may be a start timestamp, a middle timestamp, or an end timestamp of the word. The first difference threshold may be, for example, 0.2 seconds or 0.3 seconds.
In this embodiment, the feature information may include the intermediate timestamp corresponding to the sentence, whether the sentence has a connective word, and the like, where connective words are words such as "and" and "but". Correspondingly, the dividing module 22 may be specifically configured to, for any two adjacent sentences, calculate the time difference between the two adjacent sentences according to their corresponding intermediate timestamps; judge whether the time difference is greater than or equal to a second difference threshold and whether the later sentence of the two adjacent sentences has a connective word; if the time difference is smaller than the second difference threshold or the later sentence has a connective word, divide the two adjacent sentences into the same paragraph; and if the time difference is greater than or equal to the second difference threshold and the later sentence has no connective word, divide the two adjacent sentences into different paragraphs, thereby obtaining the paragraph sequence.
The second difference threshold may be determined by generating a set of time differences from the time difference between each pair of adjacent sentences, calculating the standard deviation of that set, and determining the product of the standard deviation and a preset coefficient as the second difference threshold. The second difference threshold is greater than the first difference threshold.
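The threshold derivation and the paragraph-division rule above can be sketched together in Python. The coefficient value 1.5, the connective list, and the choice of population standard deviation are illustrative assumptions not fixed by the embodiment.

```python
import statistics

def second_difference_threshold(mid_timestamps, coefficient=1.5):
    """Second difference threshold = std. dev. of adjacent-sentence
    time differences, multiplied by a preset coefficient."""
    diffs = [b - a for a, b in zip(mid_timestamps, mid_timestamps[1:])]
    return statistics.pstdev(diffs) * coefficient

def split_into_paragraphs(sentences, threshold,
                          connectives=("and", "but", "so")):
    """Group sentences into paragraphs.

    `sentences` is a list of (text, mid_timestamp) pairs. Adjacent
    sentences stay in the same paragraph when their gap is below the
    threshold OR the later sentence opens with a connective word.
    """
    paragraphs = [[sentences[0][0]]]
    for (_, prev_ts), (text, ts) in zip(sentences, sentences[1:]):
        has_connective = text.split()[0].lower() in connectives
        if ts - prev_ts < threshold or has_connective:
            paragraphs[-1].append(text)
        else:
            paragraphs.append([text])
    return paragraphs
```

A long pause with no opening connective starts a new paragraph; a long pause bridged by "but" or "and" does not.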
Further, on the basis of the above embodiment, since a paragraph generally has a certain number of words, in order to divide paragraphs more accurately, and with reference to Fig. 3, the apparatus may further include: a judging module 25 and a merging module 26;
wherein, the obtaining module 21 is further configured to obtain, for each paragraph in the paragraph sequence, the word number of the paragraph;
the judging module 25 is configured to judge whether the word number of the paragraph is smaller than a preset word number threshold;
the merging module 26 is configured to merge the paragraph with the next adjacent paragraph when the number of words of the paragraph is smaller than the preset word number threshold, until the number of words of the merged paragraph is greater than or equal to the preset word number threshold.
For example, if the number of words of the first paragraph is smaller than the preset word number threshold, the first paragraph and the second paragraph are merged into one paragraph; it is then judged whether the number of words of the merged paragraph is smaller than the preset word number threshold, and if so, the merged paragraph and the third paragraph are merged to obtain a re-merged paragraph. At this point, if the number of words of the re-merged paragraph is greater than or equal to the preset word number threshold, the operation on the re-merged paragraph stops; the fourth paragraph is then acquired, and it is judged whether the number of words of the fourth paragraph is smaller than the preset word number threshold.
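The merging procedure in the example above can be sketched in Python as a single pass that accumulates short paragraphs into the next one. The 50-word threshold and whitespace-based word counting are illustrative assumptions; the trailing-short-paragraph behavior is also an assumption the embodiment leaves open.

```python
def merge_short_paragraphs(paragraphs, min_words=50):
    """Merge each too-short paragraph into the next adjacent paragraph.

    `paragraphs` is a list of strings. A paragraph with fewer than
    `min_words` words is concatenated with the following paragraph,
    repeating until the merged paragraph reaches the threshold. A short
    trailing paragraph with no successor is kept as-is (assumption).
    """
    merged = []
    buffer = ""
    for para in paragraphs:
        buffer = (buffer + " " + para).strip() if buffer else para
        if len(buffer.split()) >= min_words:
            merged.append(buffer)
            buffer = ""
    if buffer:
        merged.append(buffer)
    return merged
```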
In this embodiment, the key sentence in a paragraph is the sentence that best embodies the central idea of the paragraph. Correspondingly, the obtaining module 21 may be specifically configured to obtain the title of the video; input all sentences in the paragraph sequence together with the title into a preset keyword model to obtain each keyword and its corresponding weight, generating a keyword set; for each paragraph in the paragraph sequence, query the keyword set according to each sentence in the paragraph to obtain the keywords included in each sentence; determine the weight of each sentence according to the keywords it includes and the weights corresponding to those keywords; and determine the sentence with the largest weight in the paragraph as the key sentence of the paragraph.
The keywords may be words that occur frequently across all sentences, or words that embody the central ideas of all paragraphs. The keyword model may be, for example, a neural network model, and may be trained on training texts and the keyword sets corresponding to those texts.
In this embodiment, the process of determining the weight of each sentence according to the keywords included in the sentence and the weights corresponding to those keywords may specifically be: for each sentence, obtain the keywords included in the sentence, the number of occurrences of each keyword, and the weight corresponding to each keyword; multiply the number of occurrences of each keyword by its weight to obtain a value; and add the values of all included keywords to obtain the weight of the sentence.
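The weighting described above — sum over keywords of occurrence count times keyword weight, then pick the largest — can be sketched in Python. The keyword-to-weight mapping stands in for the keyword model's output and is hypothetical, as is the whitespace tokenization.

```python
def sentence_weight(sentence, keyword_weights):
    """Weight of one sentence: sum of (occurrence count * keyword weight)
    over all keywords in `keyword_weights` (keyword -> weight)."""
    tokens = sentence.lower().split()
    return sum(tokens.count(kw) * w for kw, w in keyword_weights.items())

def key_sentence(paragraph, keyword_weights):
    """Return the sentence with the largest weight in the paragraph."""
    return max(paragraph, key=lambda s: sentence_weight(s, keyword_weights))
```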
In this embodiment, the process of acquiring the time period corresponding to the key sentence by the obtaining module 21 may specifically be: acquire the intermediate timestamp corresponding to the key sentence, and determine the time period corresponding to the key sentence according to that intermediate timestamp and a preset threshold, where the start time point of the time period is the difference between the intermediate timestamp and the preset threshold, and the end time point of the time period is the sum of the intermediate timestamp and the preset threshold.
The time period corresponding to the key sentence may lie within the time period determined by the start time point and the end time point of the key sentence. In this embodiment, the article generation apparatus may select the most complete video frame in the video segment as the key video frame.
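The time-period computation above reduces to a symmetric window around the key sentence's intermediate timestamp. In this sketch, the 2.0-second preset threshold is illustrative, and clamping the start at zero for sentences near the opening of the video is an assumption not stated in the embodiment.

```python
def key_sentence_time_period(mid_timestamp, preset_threshold=2.0):
    """Time period for key-frame selection around a key sentence.

    Start = intermediate timestamp minus the preset threshold;
    end = intermediate timestamp plus the preset threshold.
    The start is clamped at 0.0 (assumption for early sentences).
    """
    start = max(0.0, mid_timestamp - preset_threshold)
    end = mid_timestamp + preset_threshold
    return start, end
```

The video segment covering this period is then scanned for the frame used as the paragraph's picture.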
In this embodiment, after the article is generated, the article generation apparatus may generate a link address corresponding to the article. When the video is published, the link address corresponding to the article is displayed on the publishing page of the video, so that a user can browse the article through the link address before watching the video, determine whether the video is the video which the user wants to watch according to the article, and further determine whether to watch the video and the like.
Further, the article generation means may generate a link address corresponding to the video after the article is generated. The link address corresponding to the video is displayed on the page where the article is located, so that a user can directly click the link address to watch the video after watching the article under the condition that the content of the article is interested.
The article generation device of the embodiment of the invention acquires a video and its corresponding speech, and recognizes the speech to obtain each sentence; acquires feature information of each sentence and divides the sentences into paragraphs according to the feature information to obtain a paragraph sequence; obtains, for each paragraph in the paragraph sequence, a key sentence in the paragraph; acquires a time period corresponding to the key sentence and selects a key video frame from the video segment corresponding to that time period as the picture corresponding to the paragraph; and generates the article according to each paragraph in the paragraph sequence and its corresponding picture. Because the article comprises each paragraph and its corresponding picture, it can effectively embody the video content, so that a user can easily select a video to be watched, improving video playing efficiency.
Fig. 4 is a schematic structural diagram of another article generating apparatus according to an embodiment of the present invention. The article generation apparatus includes:
memory 1001, processor 1002, and computer programs stored on memory 1001 and executable on processor 1002.
The article generation method provided in the above-described embodiment is implemented when the processor 1002 executes the program.
Further, the article generation apparatus further includes:
a communication interface 1003 for communicating between the memory 1001 and the processor 1002.
A memory 1001 for storing computer programs that may be run on the processor 1002.
Memory 1001 may include high-speed RAM memory and may also include non-volatile memory (e.g., at least one disk memory).
The processor 1002 is configured to implement the article generation method according to the foregoing embodiment when executing the program.
If the memory 1001, the processor 1002, and the communication interface 1003 are implemented independently, the communication interface 1003, the memory 1001, and the processor 1002 may be connected to each other through a bus and perform communication with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 4, but this does not indicate only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 1001, the processor 1002, and the communication interface 1003 are integrated on one chip, the memory 1001, the processor 1002, and the communication interface 1003 may complete communication with each other through an internal interface.
The processor 1002 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.
The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the article generation method as described above.
The present invention also provides a computer program product; when instructions in the computer program product are executed by a processor, the article generation method as described above is implemented.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method description in a flow chart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing steps of a custom logic function or process. Alternate implementations are included within the scope of the preferred embodiments of the present invention, in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order depending on the functionality involved, as would be understood by those reasonably skilled in the art.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (16)

1. An article generation method, comprising:
acquiring a video and corresponding voice, and identifying the voice to obtain each sentence;
acquiring feature information of each sentence, and performing paragraph division on each sentence according to the feature information to obtain a paragraph sequence;
for each paragraph in the paragraph sequence, obtaining a key sentence in the paragraph;
acquiring a time period corresponding to the key sentence, and selecting a key video frame from a video segment corresponding to the time period in the video as a picture corresponding to the paragraph, wherein the time period is positioned in a time period determined according to a starting time point and an ending time point of the key sentence;
and generating an article according to each paragraph in the paragraph sequence and the corresponding picture, wherein the article comprises each paragraph in the paragraph sequence of the video and the picture corresponding to each paragraph.
2. The method of claim 1, wherein the recognizing the speech to obtain sentences comprises:
recognizing the voice to obtain each word and a timestamp corresponding to each word;
aiming at any two adjacent words, calculating the time difference value of the two adjacent words according to the timestamps corresponding to the two adjacent words;
judging whether the time difference is greater than or equal to a first difference threshold value or not;
if the time difference is smaller than a first difference threshold value, dividing the two adjacent words into the same sentence;
and if the time difference is greater than or equal to a first difference threshold value, dividing the two adjacent words into different sentences to obtain each sentence.
3. The method of claim 1, wherein the feature information comprises: a middle timestamp corresponding to the sentence, and whether the sentence has a connecting word;
the paragraph segmentation is performed on each sentence according to the feature information to obtain a paragraph sequence, and the paragraph sequence comprises:
aiming at any two adjacent sentences, calculating the time difference value of the two adjacent sentences according to the middle time stamps corresponding to the two adjacent sentences;
judging whether the time difference is greater than or equal to a second difference threshold value or not, and whether a later sentence in the two adjacent sentences has a connecting word or not;
if the time difference is smaller than a second difference threshold value or a later sentence in the two adjacent sentences has a connecting word, dividing the two adjacent sentences into the same paragraph;
and if the time difference is greater than or equal to a second difference threshold value and the later sentence in the two adjacent sentences has no connecting word, dividing the two adjacent sentences into different paragraphs to obtain a paragraph sequence.
4. The method of claim 3, wherein the second difference threshold is determined by,
generating a time difference set according to the time difference of any two adjacent sentences;
calculating and determining the standard deviation of the time difference value set according to the time difference value set;
and determining the product of the standard deviation and a preset coefficient as the second difference threshold value.
5. The method according to claim 1 or 3, wherein after obtaining the feature information of each sentence and performing paragraph segmentation on each sentence according to the feature information to obtain a paragraph sequence, the method further comprises:
for each paragraph in the paragraph sequence, obtaining a word count for the paragraph;
judging whether the word number of the paragraph is smaller than a preset word number threshold value or not;
and if the number of words of the paragraph is less than the preset number of words threshold, combining the paragraph with the next adjacent paragraph until the number of words of the combined paragraph is greater than or equal to the preset number of words threshold.
6. The method of claim 1, wherein for each paragraph in the sequence of paragraphs, obtaining key sentences in the paragraph comprises:
acquiring a title of the video;
inputting all sentences and the titles in the paragraph sequence into a preset keyword model to obtain each keyword and corresponding weight, and generating a keyword set;
for each paragraph in the paragraph sequence, querying a keyword set according to each sentence in the paragraph to obtain keywords included in each sentence;
determining the weight of each sentence according to the keywords contained in each sentence and the weight corresponding to the keywords;
and determining the sentence with the maximum weight in the paragraph as a key sentence in the paragraph.
7. The method according to claim 1, wherein the obtaining of the time period corresponding to the key sentence comprises:
acquiring a middle timestamp corresponding to the key sentence;
determining a time period corresponding to the key sentence according to the middle timestamp corresponding to the key sentence and a preset threshold; the starting time point of the time period is the difference value between the middle timestamp and the preset threshold value, and the ending time point of the time period is the sum value between the middle timestamp and the preset threshold value.
8. An article generation apparatus, comprising:
the acquisition module is used for acquiring videos and corresponding voices and identifying the voices to obtain sentences;
the segmentation module is used for acquiring the characteristic information of each sentence and segmenting each sentence according to the characteristic information to obtain a paragraph sequence;
the obtaining module is further configured to obtain, for each paragraph in the paragraph sequence, a key sentence in the paragraph;
a selection module, configured to acquire a time period corresponding to the key sentence, and select a key video frame from a video segment corresponding to the time period in the video as a picture corresponding to the paragraph, where the time period is located in a time period determined according to a start time point and an end time point of the key sentence;
and the generation module is used for generating an article according to each paragraph in the paragraph sequence and the corresponding picture, wherein the article comprises each paragraph in the paragraph sequence of the video and the picture corresponding to each paragraph.
9. The apparatus of claim 8, wherein the obtaining module is specifically configured to,
recognizing the voice to obtain each word and a timestamp corresponding to each word;
aiming at any two adjacent words, calculating the time difference value of the two adjacent words according to the timestamps corresponding to the two adjacent words;
judging whether the time difference is greater than or equal to a first difference threshold value or not;
if the time difference is smaller than a first difference threshold value, dividing the two adjacent words into the same sentence;
and if the time difference is greater than or equal to a first difference threshold value, dividing the two adjacent words into different sentences to obtain each sentence.
10. The apparatus of claim 8, wherein the feature information comprises: a middle timestamp corresponding to the sentence, and whether the sentence has a connecting word;
the dividing module is specifically configured to,
aiming at any two adjacent sentences, calculating the time difference value of the two adjacent sentences according to the middle time stamps corresponding to the two adjacent sentences;
judging whether the time difference is greater than or equal to a second difference threshold value or not, and whether a later sentence in the two adjacent sentences has a connecting word or not;
if the time difference is smaller than a second difference threshold value or a later sentence in the two adjacent sentences has a connecting word, dividing the two adjacent sentences into the same paragraph;
and if the time difference is greater than or equal to a second difference threshold value and the later sentence in the two adjacent sentences has no connecting word, dividing the two adjacent sentences into different paragraphs to obtain a paragraph sequence.
11. The apparatus of claim 10, wherein the second difference threshold is determined by,
generating a time difference set according to the time difference of any two adjacent sentences;
calculating and determining the standard deviation of the time difference value set according to the time difference value set;
and determining the product of the standard deviation and a preset coefficient as the second difference threshold value.
12. The apparatus of claim 8 or 10, further comprising: a judging module and a merging module;
the obtaining module is further configured to obtain, for each paragraph in the paragraph sequence, a word count of the paragraph;
the judging module is used for judging whether the word number of the paragraph is smaller than a preset word number threshold value;
and the merging module is used for merging the paragraph with a next adjacent paragraph when the number of words of the paragraph is less than a preset word number threshold value until the number of words of the merged paragraph is more than or equal to the preset word number threshold value.
13. The apparatus of claim 8, wherein the obtaining module is specifically configured to,
acquiring a title of the video;
inputting all sentences and the titles in the paragraph sequence into a preset keyword model to obtain each keyword and corresponding weight, and generating a keyword set;
for each paragraph in the paragraph sequence, querying a keyword set according to each sentence in the paragraph to obtain keywords included in each sentence;
determining the weight of each sentence according to the keywords contained in each sentence and the weight corresponding to the keywords;
and determining the sentence with the maximum weight in the paragraph as a key sentence in the paragraph.
14. The apparatus of claim 8, wherein the selection module is specifically configured to,
acquiring a middle timestamp corresponding to the key sentence;
determining a time period corresponding to the key sentence according to the middle timestamp corresponding to the key sentence and a preset threshold; the starting time point of the time period is the difference value between the middle timestamp and the preset threshold value, and the ending time point of the time period is the sum value between the middle timestamp and the preset threshold value.
15. An article generation apparatus, comprising:
memory, processor and computer program stored on the memory and executable on the processor, characterized in that the processor implements the article generation method as claimed in any one of claims 1 to 7 when executing the program.
16. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the program, when executed by a processor, implements the article generation method of any one of claims 1-7.
CN201811600339.XA 2018-12-26 2018-12-26 Article generation method and device Active CN109743589B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811600339.XA CN109743589B (en) 2018-12-26 2018-12-26 Article generation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811600339.XA CN109743589B (en) 2018-12-26 2018-12-26 Article generation method and device

Publications (2)

Publication Number Publication Date
CN109743589A CN109743589A (en) 2019-05-10
CN109743589B true CN109743589B (en) 2021-12-14

Family

ID=66359996

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811600339.XA Active CN109743589B (en) 2018-12-26 2018-12-26 Article generation method and device

Country Status (1)

Country Link
CN (1) CN109743589B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110245339B (en) * 2019-06-20 2023-04-18 北京百度网讯科技有限公司 Article generation method, article generation device, article generation equipment and storage medium
CN111883136A (en) * 2020-07-30 2020-11-03 潘忠鸿 Rapid writing method and device based on artificial intelligence
CN111966839B (en) * 2020-08-17 2023-07-25 北京奇艺世纪科技有限公司 Data processing method, device, electronic equipment and computer storage medium
CN112733654B (en) * 2020-12-31 2022-05-24 蚂蚁胜信(上海)信息技术有限公司 Method and device for splitting video
CN113286173B (en) * 2021-05-19 2023-08-04 北京沃东天骏信息技术有限公司 Video editing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794104A (en) * 2015-04-30 2015-07-22 努比亚技术有限公司 Multimedia document generating method and device
CN106134216A (en) * 2014-04-11 2016-11-16 三星电子株式会社 Broadcast receiving device and method for summary content service
CN107305541A (en) * 2016-04-20 2017-10-31 科大讯飞股份有限公司 Speech recognition text segmentation method and device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8302010B2 (en) * 2010-03-29 2012-10-30 Avid Technology, Inc. Transcript editor
CN106982344B (en) * 2016-01-15 2020-02-21 阿里巴巴集团控股有限公司 Video information processing method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106134216A (en) * 2014-04-11 2016-11-16 三星电子株式会社 Broadcast receiving device and method for summary content service
CN104794104A (en) * 2015-04-30 2015-07-22 努比亚技术有限公司 Multimedia document generating method and device
CN107305541A (en) * 2016-04-20 2017-10-31 科大讯飞股份有限公司 Speech recognition text segmentation method and device

Also Published As

Publication number Publication date
CN109743589A (en) 2019-05-10

Similar Documents

Publication Publication Date Title
CN109743589B (en) Article generation method and device
CN110149540B (en) Recommendation processing method and device for multimedia resources, terminal and readable medium
CN108491529B (en) Information recommendation method and device
CN107679033B (en) Text sentence break position identification method and device
CN107644085B (en) Method and device for generating sports event news
CN110941738B (en) Recommendation method and device, electronic equipment and computer-readable storage medium
CN109862397B (en) Video analysis method, device, equipment and storage medium
CN110337011A (en) Method for processing video frequency, device and equipment
CN109511015B (en) Multimedia resource recommendation method, device, storage medium and equipment
KR20200008059A (en) Estimating and displaying social interest in time-based media
CN110059307B (en) Writing method, device and server
CN112199582B (en) A content recommendation method, device, equipment and medium
CN110019954A (en) A kind of recognition methods and system of the user that practises fraud
US12316891B2 (en) Video generating method and apparatus, electronic device, and readable storage medium
WO2017173801A1 (en) Personalized multimedia recommendation method and apparatus
US20240371370A1 (en) Subtitle generation method, apparatus, electronic device, storage medium and program
CN107547922B (en) Information processing method, device, system and computer readable storage medium
CN113378000B (en) Video title generation method and device
CN113129902B (en) Voice processing method and device, electronic equipment and storage medium
CN108235126B (en) Method and device for inserting recommendation information in video
CN111475409B (en) System test method, device, electronic equipment and storage medium
CN113259728A (en) Method and device for recommending video, electronic equipment and storage medium
CN110569447B (en) Network resource recommendation method and device and storage medium
CN110933504B (en) Video recommendation method, device, server and storage medium
CN114187545B (en) Progressive lens identification method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant