
CN107480136B - A Method Applied to Sentiment Curve Analysis in Movie Scripts - Google Patents

Info

Publication number
CN107480136B
Authority
CN
China
Prior art keywords
scene
character
word
emotion
dictionary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710652374.5A
Other languages
Chinese (zh)
Other versions
CN107480136A (en)
Inventor
逄泽沐风
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pang Zewenyue
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN201710652374.5A
Publication of CN107480136A
Application granted
Publication of CN107480136B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/30 - Semantic analysis
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 - Administration; Management
    • G06Q10/06 - Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 - Operations research, analysis or management
    • G06Q10/0639 - Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 - Quality analysis or management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Educational Administration (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a method for analyzing emotion curves in a movie script, comprising the following steps: 1) constructing an emotion dictionary that includes a positive dictionary, a negative dictionary and a neutral dictionary; 2) dividing the movie script into scenes and extracting sentences to obtain scene sentences and character sentences; 3) segmenting the scene sentences and the character sentences into words to obtain scene vocabulary and character vocabulary; 4) classifying the words of the scene vocabulary and the character vocabulary, counting word frequencies and determining weights; 5) substituting the word frequencies and weight values into a calculation formula to obtain a scene emotion index and a character emotion index; 6) drawing a scene emotion curve and a character emotion curve of the movie script. The method places no requirement on the operator's work experience, and the calculation is convenient and fast.

Description

A Method Applied to Sentiment Curve Analysis in Movie Scripts

Technical Field

The invention relates to the field of movie script analysis, and more particularly to a method applied to emotion curve analysis in movie scripts.

Background Art

Watching movies is one of people's main forms of leisure and entertainment. As material living standards improve, audiences expect more from movies, and the quality of a movie is closely tied to the quality of its script. When deciding whether a movie is worth investing in, an investor will therefore evaluate the script. The overall emotional change of the script and the emotional change of each character in it are important indicators of script quality. These changes, however, are emotional ups and downs hidden between the lines, and they cannot be seen directly. At present it is still difficult to obtain the emotional changes in a movie script: the main approach is for experienced film practitioners to read the script, analyze its emotions, and summarize and evaluate the emotional changes in it. This approach demands considerable film industry experience and takes a long time. How to analyze the emotion curves in a movie script quickly and effectively, lower the experience required of the staff, and shorten the evaluation is a problem urgently awaiting solution in this field.

It is therefore necessary to provide a method applied to emotion curve analysis in movie scripts that solves the above problems.

Summary of the Invention

In view of this, the present invention provides a method applied to emotion curve analysis in movie scripts, to solve the problem that the existing emotion analysis of movie scripts demands considerable film industry experience from the staff and takes a long time.

To solve the above technical problem, the application provides the following technical solution:

A method applied to emotion curve analysis in movie scripts, comprising:

constructing an emotion dictionary for the movie script, comprising a positive dictionary, a negative dictionary and a neutral dictionary, and assigning a weight value to each word in the emotion dictionary;

dividing the movie script by scene, and extracting sentences from each resulting scene to obtain the scene sentences of each scene;

extracting, from the scene sentences of each scene, the sentences whose subject is a character, and classifying them by character to obtain the character sentences of each character in each scene;

performing word segmentation on the scene sentences to obtain the scene vocabulary of each scene,

performing word segmentation on the character sentences to obtain the character vocabulary of each character in each scene;

dividing the scene vocabulary and the character vocabulary, according to the emotion dictionary, into positive words, negative words and neutral words, and counting the word frequency and the weight value of each word;

substituting the word frequencies and weight values of the scene vocabulary into a calculation formula to obtain the scene emotion index of each scene,

substituting the word frequencies and weight values of the character vocabulary into the calculation formula to obtain the character emotion index of each character, wherein

the calculation formula is:

[Formula shown in Figure GDA0002405699210000021]

where n is the total number of positive words, m is the total number of negative words, P_i is the word frequency of the i-th positive word, W_i is the weight of the i-th positive word, N_j is the word frequency of the j-th negative word, W_j is the weight of the j-th negative word, L is the total number of positive, negative and neutral words, and x is a positive number not less than 1;

drawing the scene emotion curve of the movie script with the scene number as the independent variable and the scene emotion index as the dependent variable,

drawing the character emotion curve of each character with the scene number as the independent variable and the character emotion index as the dependent variable.

Optionally, the emotion dictionary comprises a Chinese sentiment polarity dictionary, the HowNet word set for sentiment analysis, and a self-developed emotion dictionary.

Optionally, the scene sentences further include atmosphere sentences, whose subject is not a character.

Optionally, the word segmentation uses a Chinese language processing package or a Chinese lexical analysis system.

Optionally, the weight values of the positive words and the negative words are greater than 0 and not greater than 1, and the weight value of the neutral words is equal to 0.

Optionally, the scene emotion curve and the character emotion curves are drawn as line charts, bar charts, scatter plots and pie charts.

Compared with the prior art, the method applied to emotion curve analysis in movie scripts provided by the present invention achieves the following beneficial effects:

1) No step of the method requires the operator to have film industry experience, and the calculation is convenient and fast, which solves the earlier problems of high experience requirements and long analysis time.

2) The method can analyze the character emotion curve of each character, which helps the script writer judge whether each character's plot arrangement is appropriate and helps select the best-matching actor according to the character's emotional changes.

3) The method turns invisible emotional changes into visible emotion curves, which gives investors a basis for judging the overall level of a movie script and reduces investment risk.

Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention.

Fig. 1 is a flowchart of the method applied to emotion curve analysis in movie scripts according to the present invention.

Detailed Description

Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the invention.

The following description of at least one exemplary embodiment is merely illustrative and is in no way intended to limit the invention, its application, or its uses.

Techniques, methods and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail, but where appropriate they should be considered part of the specification.

In all examples shown and discussed herein, any specific value should be construed as illustrative only and not as limiting. Other examples of the exemplary embodiments may therefore have different values. It should be noted that once an item is defined in one figure, it need not be discussed further in subsequent figures.

Embodiment 1:

Fig. 1 is a flowchart of the method applied to emotion curve analysis in movie scripts according to the present invention.

As shown in Fig. 1, the method applied to emotion curve analysis in movie scripts comprises the following steps:

Step S101: Construct an emotion dictionary for the movie script, comprising a positive dictionary, a negative dictionary and a neutral dictionary, and assign a weight value to each word in the emotion dictionary.

Specifically, the positive, negative and neutral dictionaries do not intersect: a word can belong to only one of them. The weight of a word reflects the intensity of the emotion the word expresses; the stronger the emotion, the larger the weight value.
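The constraints in step S101 can be illustrated with a small data structure. This is a minimal sketch, not the patent's implementation: the class name `EmotionDictionary`, its methods, and the sample words and weights are all hypothetical, and treating neutral words as weight 0 follows the optional feature stated elsewhere in the description.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EmotionDictionary:
    """Hypothetical container: three disjoint word lists, weights on polar words."""
    positive: dict[str, float] = field(default_factory=dict)
    negative: dict[str, float] = field(default_factory=dict)
    neutral: set[str] = field(default_factory=set)

    def validate(self) -> None:
        """Raise ValueError if any word appears in more than one dictionary."""
        pos, neg = set(self.positive), set(self.negative)
        overlap = (pos & neg) | (pos & self.neutral) | (neg & self.neutral)
        if overlap:
            raise ValueError(f"words in more than one dictionary: {sorted(overlap)}")

    def polarity(self, word: str) -> Optional[str]:
        """Return 'positive', 'negative', 'neutral', or None for unlisted words."""
        if word in self.positive:
            return "positive"
        if word in self.negative:
            return "negative"
        if word in self.neutral:
            return "neutral"
        return None

    def weight(self, word: str) -> float:
        """Weight of a listed word; neutral and unlisted words are treated as 0."""
        return self.positive.get(word, self.negative.get(word, 0.0))


# Sample entries are invented for illustration; a stronger emotion gets a larger weight.
sample = EmotionDictionary(
    positive={"高兴": 0.5, "狂喜": 0.9},
    negative={"悲伤": 0.6},
    neutral={"桌子"},
)
sample.validate()
```

Keeping the three lists in separate containers makes the "no intersection" rule a cheap set operation that can be checked once when the dictionary is built.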

Step S102: Divide the movie script by scene and extract the sentences of each resulting scene to obtain the scene sentences of each scene. From the scene sentences of each scene, extract the sentences whose subject is a character and classify them by character to obtain the character sentences of each character in each scene.

Specifically, the movie script is divided into scenes according to the scene numbers marked in the script, and all content of the movie script should be included in the scene sentences.
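The scene division described above might be sketched as follows. The heading format (`场景 N` on its own line) is an assumption, since the description does not say how scene numbers are marked; `split_scenes` and both regular expressions are hypothetical names. So that no content is dropped, text before the first heading is kept under scene number 0, which is also an assumption.

```python
import re

# Assumed heading format: a line such as "场景1" or "场景 12" (hypothetical).
SCENE_HEADING = re.compile(r"^\s*场景\s*(\d+)\s*$")
# Split after Chinese or ASCII sentence-final punctuation, keeping the mark.
SENTENCE_END = re.compile(r"(?<=[。!?!?])")


def split_scenes(script: str) -> dict[int, list[str]]:
    """Map each scene number to the list of sentences found in that scene."""
    scenes: dict[int, list[str]] = {}
    current = 0  # text before the first heading is kept under 0
    for line in script.splitlines():
        heading = SCENE_HEADING.match(line)
        if heading:
            current = int(heading.group(1))
            scenes.setdefault(current, [])
            continue
        for sentence in SENTENCE_END.split(line):
            sentence = sentence.strip()
            if sentence:
                scenes.setdefault(current, []).append(sentence)
    return scenes


demo_script = "场景1\n夜色深沉。李雷走进房间。\n场景2\n韩梅梅笑了!"
demo_scenes = split_scenes(demo_script)
```

Every non-heading line ends up in exactly one scene, which matches the requirement that all content of the script be included in the scene sentences.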

Step S103: Perform word segmentation on the scene sentences to obtain the scene vocabulary of each scene, and perform word segmentation on the character sentences to obtain the character vocabulary of each character in each scene.

Specifically, a threshold is set for the number of character sentences: if a character has fewer character sentences than the threshold, that character's sentences are not counted. No threshold is set for the number of scene sentences.
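The grouping of character sentences and the sentence-count threshold can be sketched as below. Detecting the grammatical subject properly needs a parser; this sketch swaps in a crude stand-in (a sentence belongs to a character if it starts with that character's name). The function name, the `min_sentences` parameter, and applying the threshold within a single scene are all assumptions.

```python
def character_sentences(
    scene_sentences: list[str],
    characters: list[str],
    min_sentences: int = 2,
) -> dict[str, list[str]]:
    """Group one scene's sentences by character and drop sparse characters.

    Stand-in rule (assumption): the subject is a character when the sentence
    starts with that character's name.
    """
    grouped: dict[str, list[str]] = {name: [] for name in characters}
    for sentence in scene_sentences:
        for name in characters:
            if sentence.startswith(name):
                grouped[name].append(sentence)
                break
    # Characters with fewer sentences than the threshold are not counted.
    return {name: found for name, found in grouped.items()
            if len(found) >= min_sentences}


demo_sentences = ["夜色深沉。", "李雷走进房间。", "李雷坐下。", "韩梅梅笑了!"]
demo_grouped = character_sentences(demo_sentences, ["李雷", "韩梅梅"], min_sentences=2)
```

Scene sentences are returned untouched by this step, consistent with the statement that no threshold applies to them.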

Step S104: Perform word segmentation on the scene sentences to obtain the scene vocabulary of each scene, and perform word segmentation on the character sentences to obtain the character vocabulary of each character in each scene.

Specifically, a threshold is set for the size of the character vocabulary: if a character's vocabulary contains fewer words than the threshold, that character's vocabulary is not counted. No threshold is set for the size of the scene vocabulary.

Step S105: Divide the scene vocabulary and the character vocabulary, according to the emotion dictionary, into positive words, negative words and neutral words, and count the word frequency and weight value of each word.

Specifically, word frequency is the number of times a word occurs. Within any one movie script the weight value of a given word is fixed; across two different movie scripts the weight value of the same word may be the same or different.
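Classification and word-frequency counting can be sketched with `collections.Counter`. The function name and the decision to skip words that appear in none of the three dictionaries are assumptions; the description does not say how unlisted words are handled.

```python
from collections import Counter


def classify_and_count(
    words: list[str],
    positive: set[str],
    negative: set[str],
    neutral: set[str],
) -> dict[str, Counter]:
    """Count occurrences of each word, bucketed by dictionary membership.

    Words found in none of the dictionaries are skipped (assumption).
    """
    result = {"positive": Counter(), "negative": Counter(), "neutral": Counter()}
    for word in words:
        if word in positive:
            result["positive"][word] += 1
        elif word in negative:
            result["negative"][word] += 1
        elif word in neutral:
            result["neutral"][word] += 1
    return result


demo_counts = classify_and_count(
    ["高兴", "高兴", "悲伤", "桌子", "的"],
    positive={"高兴"},
    negative={"悲伤"},
    neutral={"桌子"},
)
```

The weights themselves are looked up from the emotion dictionary rather than counted, since they are fixed for a given script.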

Step S106: Substitute the word frequencies and weight values of the scene vocabulary into the calculation formula to obtain the scene emotion index of each scene, and substitute the word frequencies and weight values of the character vocabulary into the calculation formula to obtain the character emotion index of each character.

The calculation formula is:

[Formula shown in Figure GDA0002405699210000051]

where n is the total number of positive words, m is the total number of negative words, P_i is the word frequency of the i-th positive word, W_i is the weight of the i-th positive word, N_j is the word frequency of the j-th negative word, W_j is the weight of the j-th negative word, L is the total number of positive, negative and neutral words, and x is a positive number greater than or equal to 1.

Specifically, a calculated scene emotion index greater than 0 indicates that the emotion of the scene is positive, an index less than 0 indicates that it is negative, and an index equal to 0 indicates that it is neutral.

Likewise, a calculated character emotion index greater than 0 indicates that the character's emotion in the scene is positive, an index less than 0 indicates that it is negative, and an index equal to 0 indicates that it is neutral.
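The sign convention above can be illustrated in code, with one important caveat: the patent's formula is an image that is not reproduced in this text, so the exact form is unknown here. The sketch below assumes, purely for illustration, the form x * (sum of P_i * W_i minus sum of N_j * W_j) / L, which is consistent with the variable definitions and the sign interpretation but may differ from the actual formula in how L and x enter. The function names are hypothetical.

```python
def emotion_index(
    positive: list[tuple[int, float]],
    negative: list[tuple[int, float]],
    total_words: int,
    x: float = 1.0,
) -> float:
    """Illustrative index under an ASSUMED formula, not the patent's own.

    positive / negative: (word frequency, weight) pairs, one per distinct word.
    total_words: L, the count of positive, negative and neutral words.
    x: a positive number not less than 1.
    """
    if x < 1:
        raise ValueError("x must be at least 1")
    if total_words <= 0:
        return 0.0  # nothing to score; treated as neutral (assumption)
    positive_score = sum(freq * weight for freq, weight in positive)
    negative_score = sum(freq * weight for freq, weight in negative)
    return x * (positive_score - negative_score) / total_words


def label(index: float) -> str:
    """Interpret the sign of an index as the description does."""
    if index > 0:
        return "positive"
    if index < 0:
        return "negative"
    return "neutral"


demo_index = emotion_index([(2, 0.5), (1, 1.0)], [(1, 0.5)], total_words=5)
```

Whatever the true normalization is, only the sign of the weighted difference decides whether a scene or character reads as positive, negative or neutral.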

Step S107: Draw the scene emotion curve of the movie script with the scene number as the independent variable and the scene emotion index as the dependent variable, and draw the character emotion curve of each character with the scene number as the independent variable and the character emotion index as the dependent variable.

Specifically, the scene emotion curve shows the trend of emotional change across the whole movie script, and a character emotion curve shows the trend of emotional change of one particular character. The emotional trend of any character is independent of the emotional trend of the movie script.
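Preparing the data for the curves in step S107 can be sketched as below: each curve is a list of (scene number, index) points sorted by scene number. The function names and the input layouts are assumptions, and the index values in the example are invented; the resulting point lists could be handed to any plotting tool.

```python
def scene_curve(scene_index: dict[int, float]) -> list[tuple[int, float]]:
    """Points of the scene emotion curve, ordered by scene number."""
    return sorted(scene_index.items())


def character_curves(
    character_index: dict[int, dict[str, float]],
) -> dict[str, list[tuple[int, float]]]:
    """One curve per character; a character has points only where it has an index."""
    curves: dict[str, list[tuple[int, float]]] = {}
    for scene in sorted(character_index):
        for name, value in character_index[scene].items():
            curves.setdefault(name, []).append((scene, value))
    return curves


# Invented indices for illustration only.
demo_scene_points = scene_curve({2: -0.1, 1: 0.3})
demo_character_points = character_curves(
    {1: {"李雷": 0.2}, 2: {"李雷": -0.4, "韩梅梅": 0.1}}
)
```

Keeping one point list per character reflects the statement that each character's trend is read independently of the script's overall trend.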

Embodiment 2

Fig. 1 is a flowchart of the method applied to emotion curve analysis in movie scripts according to the present invention.

As shown in Fig. 1, the method of this embodiment comprises steps S101 to S107, which are identical to steps S101 to S107 described in Embodiment 1, including the calculation formula (shown in Figure GDA0002405699210000071), the definitions of its variables, and the interpretation of the resulting scene emotion indices and character emotion indices. This embodiment further includes the following refinements.

Further, the emotion dictionary comprises a Chinese sentiment polarity dictionary, the HowNet word set for sentiment analysis, and a self-developed emotion dictionary.

Specifically, the self-developed emotion dictionary comprises a positive dictionary, a negative dictionary and a neutral dictionary; the positive and negative dictionaries are further subdivided according to the intensity of the emotion the words express.

Further, the scene sentences also include atmosphere sentences, whose subject is not a character.

Specifically, the atmosphere sentences can be used in the calculation to draw an atmosphere emotion curve. If the number of atmosphere sentences among the scene sentences is less than a threshold, the atmosphere emotion curve cannot be drawn.
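Selecting atmosphere sentences and checking the threshold can be sketched as follows. As a stand-in for real subject detection, a sentence is treated as an atmosphere sentence when it does not start with any character name; this rule, the function names, and applying the threshold to the total count across scenes are all assumptions.

```python
def atmosphere_sentences(scene_sentences: list[str], characters: list[str]) -> list[str]:
    """Sentences whose subject is not a character (crude prefix-based stand-in)."""
    names = tuple(characters)
    return [s for s in scene_sentences if not s.startswith(names)]


def can_draw_atmosphere_curve(
    scenes: dict[int, list[str]],
    characters: list[str],
    threshold: int,
) -> bool:
    """True when enough atmosphere sentences exist to draw the curve."""
    total = sum(len(atmosphere_sentences(sentences, characters))
                for sentences in scenes.values())
    return total >= threshold


demo_scene_map = {1: ["夜色深沉。", "李雷走进房间。"], 2: ["韩梅梅笑了!"]}
demo_cast = ["李雷", "韩梅梅"]
demo_atmosphere = atmosphere_sentences(demo_scene_map[1], demo_cast)
```

The atmosphere sentences that pass the check would then go through the same segmentation, classification and index calculation as the other sentence groups.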

Further, the word segmentation uses a Chinese language processing package or a Chinese lexical analysis system.

Specifically, the word segmentation is performed by software to ensure that the segmentation is accurate and complete.
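To show what word segmentation produces, here is a minimal sketch that swaps in forward maximum matching against a small vocabulary. This is a deliberately simple stand-in, not the Chinese language processing package or lexical analysis system the description refers to, and it does not offer their accuracy. The function name, the `max_len` parameter and the sample vocabulary are hypothetical.

```python
def forward_max_match(sentence: str, vocabulary: set[str], max_len: int = 4) -> list[str]:
    """Greedy left-to-right segmentation: take the longest vocabulary word at
    each position, falling back to a single character when nothing matches."""
    words: list[str] = []
    i = 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


demo_words = forward_max_match("李雷非常高兴", {"李雷", "非常", "高兴"})
```

A production segmenter also handles ambiguity and unknown words, which is why the description relies on dedicated software for this step.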

Further, the weight values of the positive words and the negative words are greater than 0 and not greater than 1, and the weight value of the neutral words is equal to 0.

Specifically, the weight value is a value preset in the emotion dictionary. Within the calculation for any one movie script the weight values are fixed, but across different movie scripts they may be the same or different: the weight values of positive and negative words are adjusted for each script. The basis for this adjustment includes changes over time in the meaning of a word and in the intensity of the emotion it expresses.
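The weight ranges and the per-script adjustment can be sketched as a validation step plus an override table. The function names, the override mechanism and the sample values are assumptions; the description only states the allowed ranges and that weights may differ between scripts.

```python
def check_weights(
    positive: dict[str, float],
    negative: dict[str, float],
    neutral: dict[str, float],
) -> None:
    """Raise ValueError unless polar weights are in (0, 1] and neutral weights are 0."""
    for kind, table in (("positive", positive), ("negative", negative)):
        for word, weight in table.items():
            if not 0 < weight <= 1:
                raise ValueError(f"{kind} word {word!r} has weight {weight}, expected (0, 1]")
    for word, weight in neutral.items():
        if weight != 0:
            raise ValueError(f"neutral word {word!r} has weight {weight}, expected 0")


def weights_for_script(base: dict[str, float], overrides: dict[str, float]) -> dict[str, float]:
    """Copy the preset weights and apply one script's adjustments (hypothetical)."""
    merged = dict(base)
    merged.update(overrides)
    return merged


base_positive = {"高兴": 0.5, "狂喜": 0.9}
period_drama_positive = weights_for_script(base_positive, {"高兴": 0.7})
check_weights(period_drama_positive, {"悲伤": 0.6}, {"桌子": 0.0})
```

Because the base table is copied rather than modified, the weights stay fixed for the duration of one script's calculation while still allowing a different table for the next script.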

Further, the scene emotion curve and the character emotion curves are drawn as line charts, bar charts, scatter plots and pie charts.

Specifically, all of the scene emotion curves and character emotion curves may be drawn, or only some of them. The drawn scene emotion curves and character emotion curves include two-dimensional curves and three-dimensional curves, and both may be presented in static or dynamic form.

As the above embodiments show, the present invention achieves the following beneficial effects:

1) No step of the method requires the operator to have film industry experience, and the calculation is convenient and fast, which solves the earlier problems of high experience requirements and long analysis time.

2) The method can analyze the character emotion curve of each character, which helps the script writer judge whether each character's plot arrangement is appropriate and helps select the best-matching actor according to the character's emotional changes.

3) The method turns invisible emotional changes into visible emotion curves, which gives investors a basis for judging the overall level of a movie script and reduces investment risk.

Although some specific embodiments of the present invention have been described in detail by way of example, those skilled in the art should understand that the above examples are for illustration only and do not limit the scope of the invention. Those skilled in the art should also understand that the above embodiments may be modified without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (6)

1.一种应用于电影剧本中情感曲线分析的方法,其特征在于,包括:1. a method that is applied to emotional curve analysis in movie script, is characterized in that, comprises: 构建所述电影剧本的情感词典,包括:积极词典、消极词典和中性词典,对所述情感词典中每个词语赋予权重值;constructing an emotion dictionary of the movie script, including: a positive dictionary, a negative dictionary and a neutral dictionary, and assigning a weight value to each word in the emotion dictionary; 对所述电影剧本,按照场景进行划分,对划分后的每个所述场景,进行语句提取,得到每个所述场景的场景语句;The movie script is divided according to the scene, and sentence extraction is performed on each of the divided scenes to obtain the scene sentence of each of the scenes; 从每个所述场景语句中,提取主语为人物的语句,并按照所述人物进行分类,得到每个所述场景内,每个所述人物的人物语句;From each of the scene sentences, extract the sentences whose subjects are characters, and classify them according to the characters, so as to obtain the character sentences of each of the characters in each of the scenes; 对所述场景语句进行分词,得到每个所述场景的场景词汇,Perform word segmentation on the scene sentences to obtain the scene vocabulary of each of the scenes, 对所述人物语句进行分词,得到每个所述场景内每个所述人物的人物词汇;Perform word segmentation on the character statement to obtain the character vocabulary of each of the characters in each of the scenes; 将所述场景词汇和所述人物词汇,依照所述情感词典,分为:积极词、消极词和中性词,并统计每个所述词语的词频和所述权重值;Divide the scene vocabulary and the character vocabulary into positive words, negative words and neutral words according to the sentiment dictionary, and count the word frequency and the weight value of each of the words; 将所述场景词汇的所述词频和所述权重值导入计算公式,得到每个所述场景的场景情感指数,The word frequency and the weight value of the scene vocabulary are imported into the calculation formula to obtain the scene emotion index of each of the scenes, 将所述人物词汇的所述词频和所述权重值导入所述计算公式,得到每个人物的人物情感指数,其中,The word frequency and the weight value of the character vocabulary are imported into the calculation formula to obtain the character emotion index of each character, wherein, 所述计算公式为:The calculation formula is:
Figure FDA0002405699200000011
where n is the total number of positive words, m is the total number of negative words, P_i is the word frequency of the i-th positive word, W_i is the weight of the i-th positive word, N_j is the word frequency of the j-th negative word, W_j is the weight of the j-th negative word, L is the sum of the numbers of positive words, negative words and neutral words, and x is a positive number not less than 1;
taking the scene number as the independent variable and the scene emotion index as the dependent variable, plotting the scene emotion curve of the movie script; and
taking the scene number as the independent variable and the character emotion index as the dependent variable, plotting the character emotion curve of each character.
2. The method applied to sentiment curve analysis in movie scripts according to claim 1, characterized in that the emotion dictionary comprises: a Chinese emotion polarity dictionary, the HowNet word set for sentiment analysis, and a self-developed emotion dictionary.
3. The method applied to sentiment curve analysis in movie scripts according to claim 1, characterized in that the scene sentences further comprise atmosphere sentences whose subject is not a character.
4. The method applied to sentiment curve analysis in movie scripts according to claim 1, characterized in that the word segmentation uses a Chinese language processing package or a Chinese lexical analysis system.
5. The method applied to sentiment curve analysis in movie scripts according to claim 1, characterized in that the weight values of the positive words and the negative words are greater than 0 and not greater than 1, and the weight value of the neutral words is equal to 0.
6. The method applied to sentiment curve analysis in movie scripts according to claim 1, characterized in that the scene emotion curve and the character emotion curve are drawn as a line chart, a bar chart, a scatter plot and a pie chart.
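The counting and indexing steps of claim 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the toy dictionaries, the English example words and all function names are hypothetical; the input is assumed to be already segmented into words; words found in neither dictionary are treated as neutral; and L is assumed to count word occurrences. The claimed formula is available here only as a figure reference, so `assumed_index` uses a stand-in (x times the weighted positive frequencies minus the weighted negative frequencies, divided by L) that should be replaced by the actual formula.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# Hypothetical toy dictionaries mapping word -> weight. Per claim 5, positive
# and negative weights lie in (0, 1]; neutral words carry weight 0.
POSITIVE: Dict[str, float] = {"happy": 0.8, "love": 1.0}
NEGATIVE: Dict[str, float] = {"sad": 0.6, "fear": 0.9}


def tally(words: Sequence[str]) -> Tuple[Counter, Counter, int]:
    """Count positive and negative word frequencies in one segmented text.

    Returns (positive frequencies, negative frequencies, L). L is taken to be
    the number of word occurrences (an assumption); any word outside both
    dictionaries is treated as neutral (also an assumption).
    """
    pos = Counter(w for w in words if w in POSITIVE)
    neg = Counter(w for w in words if w in NEGATIVE)
    return pos, neg, len(words)


def assumed_index(words: Sequence[str], x: float = 1.0) -> float:
    """ASSUMED stand-in for the claimed formula, which is not reproduced here."""
    pos, neg, total = tally(words)
    if total == 0:
        return 0.0  # an empty scene contributes a flat point to the curve
    pos_score = sum(freq * POSITIVE[w] for w, freq in pos.items())
    neg_score = sum(freq * NEGATIVE[w] for w, freq in neg.items())
    return x * (pos_score - neg_score) / total


def emotion_curve(scenes: Sequence[Sequence[str]], x: float = 1.0) -> List[Tuple[int, float]]:
    """Pair each scene number (starting at 1) with its index, ready to plot."""
    return [(n, assumed_index(words, x)) for n, words in enumerate(scenes, start=1)]
```

Passing the scene vocabulary of each scene yields the points of a scene emotion curve; passing one character's vocabulary per scene yields that character's curve, since the claim applies the same formula to both.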
CN201710652374.5A 2017-08-02 2017-08-02 A Method Applied to Sentiment Curve Analysis in Movie Scripts Active CN107480136B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710652374.5A CN107480136B (en) 2017-08-02 2017-08-02 A Method Applied to Sentiment Curve Analysis in Movie Scripts

Publications (2)

Publication Number Publication Date
CN107480136A CN107480136A (en) 2017-12-15
CN107480136B true CN107480136B (en) 2020-07-03

Family

ID=60598271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710652374.5A Active CN107480136B (en) 2017-08-02 2017-08-02 A Method Applied to Sentiment Curve Analysis in Movie Scripts

Country Status (1)

Country Link
CN (1) CN107480136B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108549630B (en) * 2018-03-29 2021-07-30 西安影视数据评估中心有限公司 Method for identifying turning points of film and television script stories
CN110427485B (en) * 2018-04-27 2022-07-05 北京海马轻帆娱乐科技有限公司 Literature and literature classification method and apparatus
JP2019219830A (en) * 2018-06-18 2019-12-26 株式会社コミチ Emotion evaluation method
CN110457691B (en) * 2019-07-26 2023-03-24 北京影谱科技股份有限公司 Script role based emotional curve analysis method and device
CN113553423B (en) * 2021-07-05 2023-10-10 北京奇艺世纪科技有限公司 Scenario information processing method and device, electronic equipment and storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
CN105095183A (en) * 2014-05-22 2015-11-25 株式会社日立制作所 Text emotional tendency determination method and system
CN106919661A (en) * 2017-02-13 2017-07-04 腾讯科技(深圳)有限公司 A kind of affective style recognition methods and relevant apparatus

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
CN101261623A (en) * 2007-03-07 2008-09-10 国际商业机器公司 Word splitting method and device for word border-free mark language based on search

Similar Documents

Publication Publication Date Title
CN107480136B (en) A Method Applied to Sentiment Curve Analysis in Movie Scripts
CN106649519B (en) A method for mining and evaluating product features
KR101713558B1 (en) Method of classification and analysis of sentiment in social network service
CN107025284A (en) The recognition methods of network comment text emotion tendency and convolutional neural networks model
Paik et al. The world of an octopus: How reporting bias influences a language model's perception of color
US20120259617A1 (en) System and method for slang sentiment classification for opinion mining
CN109508373A (en) Calculation method, equipment and the computer readable storage medium of enterprise's public opinion index
CN103617158A (en) Method for generating emotion abstract of dialogue text
CN111538828A (en) Text emotion analysis method and device, computer device and readable storage medium
CN106202584A (en) A kind of microblog emotional based on standard dictionary and semantic rule analyzes method
CN107818173B (en) A Chinese fake comment filtering method based on vector space model
CN108090098B (en) Text processing method and device
CN108090099A (en) A kind of text handling method and device
CN102929860A (en) Chinese clause emotion polarity distinguishing method based on context
CN106528533A (en) Dynamic sentiment word and special adjunct word-based text sentiment analysis method
CN108717459A (en) A kind of mobile application defect positioning method of user oriented comment information
Dubey et al. Deep models for converting sarcastic utterances into their non sarcastic interpretation
CN108108462A (en) A kind of text emotion analysis method of feature based classification
CN117033558A (en) BERT-WWM and multi-feature fused film evaluation emotion analysis method
CN108363699A (en) A kind of netizen's school work mood analysis method based on Baidu's mhkc
CN108733652A (en) The test method of film review emotional orientation analysis based on machine learning
Zhao et al. Multi-modal sarcasm generation: Dataset and solution
Dou et al. Improving large-scale paraphrase acquisition and generation
CN113377949A (en) Method and device for generating abstract of target object
CN108197274B (en) Abnormal personality detection method and device based on conversation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180118

Address after: Room 308, Building 3, Courtyard 4, Chegongzhuang Street, Xicheng District, Beijing

Applicant after: Pang Zemufeng

Address before: Room 308, Building 3, Courtyard 4, Chegongzhuang Street, Xicheng District, Beijing 100044

Applicant before: Chen Lei

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220301

Address after: 100102 Room 303, unit 5, building 210, Huigu sunshine, Wangjing garden, Chaoyang District, Beijing

Patentee after: Pang Zewenyue

Address before: Room 308, building 3, courtyard 4, Chegongzhuang street, Xicheng District, Beijing 100044

Patentee before: Pang Zemufeng