
CN106598944B - A civil aviation security public opinion sentiment analysis method - Google Patents

A civil aviation security public opinion sentiment analysis method

Info

Publication number
CN106598944B
Authority
CN
China
Prior art keywords
word
text
microblog
score
sentiment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201611062208.1A
Other languages
Chinese (zh)
Other versions
CN106598944A (en
Inventor
韩萍
李杉
贾云飞
牛勇钢
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Civil Aviation University of China
Original Assignee
Civil Aviation University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Civil Aviation University of China
Priority to CN201611062208.1A
Publication of CN106598944A
Application granted
Publication of CN106598944B
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 Thesaurus
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

A civil aviation security public opinion sentiment analysis method. The method comprises: retrieving, preprocessing and segmenting microblog texts on the Internet that contain civil aviation security public opinion keywords; constructing dictionaries; scoring each microblog to obtain its sentiment score; classifying each microblog as subjective or objective according to the sentiment score and obtaining the microblog's threat score with respect to civil aviation safety; and determining, from the threat score, the threat level that the statements in the microblog text pose to civil aviation safety. The invention determines the sentiment score of a microblog text by combining text semantics with microblog emoticons, which overcomes the limitations of dictionaries and semantic rules and improves the accuracy of sentiment scoring. It makes full use of the characteristics of microblog text, so the threat level is determined more reasonably. Unlike machine learning methods, the invention does not need to be trained on large-scale labeled data and is therefore better suited to real-time data stream processing.

Description

A civil aviation security public opinion sentiment analysis method
Technical field
The invention belongs to the technical field of text sentiment analysis in natural language processing, and in particular relates to a civil aviation security public opinion sentiment analysis method.
Background art
In the Internet era of rapid information growth, more and more users share their views and experiences online, so social networks contain a large amount of short text with subjective emotional coloring. Sina Weibo is an information sharing and exchange platform that provides entertainment and leisure services to the public, and its active users currently number about 200 million. It inherits the advantages of traditional forums and blogs and, combined with mobile terminals such as phones, lets information be published and obtained quickly and in real time. Weibo integrates entertainment, social networking and marketing. Starting from meeting people's "weak-tie" social needs, it has gradually developed into a mass public opinion platform, a most important real-time information source and an increasingly influential centre for the spread of online public opinion. More and more organizations and public figures publish or spread information through microblogs.
Sentiment analysis is the process of processing, analyzing and applying text with emotional coloring, and is a relatively advanced research field within natural language processing. It is a concrete application that draws on many existing research results, and combining it with microblogs, a new form of online social media, has important practical value. The main purpose of microblog sentiment analysis is to identify subjective information in microblog posts and to mine the views and attitudes users hold toward products, news, hot events and the like.
In the civil aviation field, the high degree of freedom of online public opinion also brings negative effects, such as the posting of false threats, rumors and extreme language. By analyzing the sentiment orientation of microblog texts related to civil aviation, microblogs that threaten civil aviation safety can be filtered out, key users with criminal tendencies can be identified, and the results can be pushed in time to the relevant public security departments for handling. Text sentiment analysis also has other applications, such as predicting box office receipts, stock trends and market dynamics. Analyzing the sentiment orientation of microblog text is therefore of great significance.
At present, Chinese text sentiment analysis methods fall mainly into two classes: those based on semantic understanding and those based on machine learning. Both have problems when applied to microblog sentiment analysis: 1. Methods based on semantic understanding build a benchmark commendatory/derogatory dictionary and define explicit rules to pattern-match the corpus, which is very limiting when processing microblog text whose expression is complex and irregular. 2. Methods based on machine learning are constrained by feature selection and corpus size, are prone to overfitting, and are not suitable for real-time processing of large volumes of text.
Summary of the invention
To solve the above problems, the purpose of the invention is to provide a civil aviation security public opinion sentiment analysis method.
To achieve this purpose, the civil aviation security public opinion sentiment analysis method provided by the invention comprises the following steps, carried out in order:
(1) Retrieve, preprocess and segment microblog texts on the Internet that contain civil aviation security public opinion keywords;
(2) Construct the dictionaries needed for semantic analysis of microblog text, either by selecting existing dictionaries or by building them independently;
(3) Using the dictionaries constructed in step (2), score the microblogs segmented in step (1) to obtain the sentiment score of each microblog;
(4) Using the sentiment score obtained in step (3), classify each microblog as subjective or objective, so as to filter out objective microblogs such as news reports and retain subjective ones, and finally obtain the microblog's threat score with respect to civil aviation safety;
(5) From the threat score obtained in step (4), determine the threat level that the statements in the microblog text pose to civil aviation safety, then screen out key persons with a high threat level and report them to the relevant departments as early-warning information.
In step (1), the method of retrieving, preprocessing and segmenting the microblog texts is as follows. Microblog texts on the Internet that contain civil aviation security public opinion keywords are crawled, and keywords related to civil aviation security public opinion are retrieved from them. Keywords are divided into two classes, place words and behavior words, and the retrieval strategy comprises single-word retrieval and combined retrieval. The retrieval results are then preprocessed to remove noise information, including web links, user nicknames in forwarded and replied microblogs, topic tags and special characters, and to extract emoticons. The preprocessed results are then segmented with a word segmentation tool; the tool used is Ansj, an open-source Java word segmenter.
In step (2), the dictionaries comprise a sentiment dictionary, a negation dictionary, a modifier dictionary, a conjunction dictionary, an emoticon dictionary, a network buzzword dictionary and a civil aviation security public opinion dictionary.
In step (3), the method of scoring the microblogs segmented in step (1) with the dictionaries constructed in step (2), to obtain the sentiment score of each microblog, comprises the following steps:
1) Extract or determine sentiment words from the microblog text segmented in step (1):
Extraction: the words obtained by segmenting the microblog text are matched against the sentiment dictionary and the network buzzword dictionary; a word that is present in either of these two dictionaries is selected as a sentiment word;
Determination: words that appear in neither the sentiment dictionary nor the network buzzword dictionary are handled with a semantic similarity method. Specifically, for two words $w_1$ and $w_2$, if $w_1$ has n senses (concepts) $x_1, x_2, \ldots, x_n$ and $w_2$ has m senses (concepts) $y_1, y_2, \ldots, y_m$, the similarity of $w_1$ and $w_2$ is defined as the maximum of the similarities between their senses, that is:
The similarity of two sememes is computed as:
where λ is an adjustable positive parameter and $d(x_1, y_2)$ is the distance between sememes $x_1$ and $y_2$ in the sememe hierarchy tree;
Using formulas (1) and (2), word w is compared with each seed word in the positive sentiment dictionary to obtain its similarity to the positive seed words, and then with each seed word in the negative sentiment dictionary to obtain its similarity to the negative seed words. The sentiment orientation value of w is finally obtained from the difference between the two mean similarities, computed as follows:
where $p_i$ denotes a positive seed word and $n_j$ denotes a negative seed word; the sentiment orientation value $S_w$ lies in the range (-1, 1). A threshold T is set, and the computed $S_w$ is compared with T to decide whether w is a sentiment word: when $|S_w| > T$, w is judged to be a sentiment word, and its intensity is set to $10 S_w$.
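The formulas numbered (1) to (3) are not reproduced in this text. The following is a sketch of forms that would be consistent with the definitions given: (1) follows directly from the "maximum over senses" definition, while (2) assumes the distance-based sememe similarity commonly used with HowNet, and (3) assumes a plain difference of mean similarities. The seed-word counts $k$ and $l$ are hypothetical symbols introduced here, not taken from the source.

```latex
\begin{align}
\mathrm{Sim}(w_1, w_2) &= \max_{1 \le i \le n,\ 1 \le j \le m} \mathrm{Sim}(x_i, y_j) \tag{1} \\
\mathrm{Sim}(x_1, y_2) &= \frac{\lambda}{d(x_1, y_2) + \lambda} \tag{2} \\
S_w &= \frac{1}{k} \sum_{i=1}^{k} \mathrm{Sim}(w, p_i) - \frac{1}{l} \sum_{j=1}^{l} \mathrm{Sim}(w, n_j) \tag{3}
\end{align}
```

Under these assumed forms every similarity lies in $(0, 1]$, so $S_w$ stays within the stated range $(-1, 1)$.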
2) Determine the text sentiment score of each microblog clause that contains such a sentiment word;
2.1) If a microblog clause contains a sentiment word that is preceded by a negation word from the negation dictionary or a modifier from the modifier dictionary, the text sentiment score Sa of the clause is computed according to the following cases:
a) Degree adverb + sentiment word: the intensity of the sentiment word changes with the intensity of the adverb, and the text sentiment score is:
$Sa = M_a \cdot ps \cdot pa$ (4)
b) Negation word + sentiment word: the polarity of the sentiment word changes according to the number n of negation words, and the text sentiment score is:
$Sa = (-1)^n \cdot ps \cdot pa$ (5)
c) Degree adverb + negation word + sentiment word: the polarity of the sentiment word is reversed and its intensity changes with the intensity of the adverb; the text sentiment score is:
$Sa = (-1) \cdot M_a \cdot ps \cdot pa$ (6)
d) Negation word + degree adverb + sentiment word: because the negation precedes the degree adverb, the polarity of the sentiment word is reversed but its intensity is weaker than under direct negation. A first weight factor $z_1 = 0.5$ is introduced, and the text sentiment score is:
$Sa = (-1) \cdot M_a \cdot ps \cdot pa \cdot z_1$ (7)
where ps is the intensity of the sentiment word, pa is the polarity of the sentiment word, and $M_a$ is the intensity of the degree adverb:
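The four cases of formulas (4) to (7) can be sketched as a single function. This is a minimal illustration, not the patented implementation: the function name, the `"neg"`/`"deg"` tags and the handling of a sentiment word with no modifier (treated as formula (5) with n = 0) are assumptions, and the degree adverb intensity `ma` is passed in because its values are not given in this text.

```python
from typing import List

Z1 = 0.5  # first weight factor z1 from the text


def clause_score(ps: float, pa: int, modifiers: List[str], ma: float = 1.0) -> float:
    """Score one clause from the modifiers that precede its sentiment word.

    ps: intensity of the sentiment word; pa: its polarity (+1 or -1).
    modifiers: ordered tags before the sentiment word, "neg" or "deg".
    ma: intensity of the degree adverb, used only when a "deg" tag is present.
    """
    n_neg = modifiers.count("neg")
    has_deg = "deg" in modifiers
    if not has_deg:
        # Formula (5); with no negation this reduces to ps * pa (assumption).
        return (-1) ** n_neg * ps * pa
    if n_neg == 0:
        return ma * ps * pa  # formula (4)
    if modifiers.index("deg") < modifiers.index("neg"):
        return -1 * ma * ps * pa  # formula (6): degree adverb, then negation
    return -1 * ma * ps * pa * Z1  # formula (7): negation, then degree adverb
```

For instance, `clause_score(5, 1, ["neg", "deg"], ma=2.0)` applies formula (7), so the reversed score is halved relative to the `["deg", "neg"]` ordering.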
2.2) If a microblog clause contains a connective from the conjunction dictionary, the clause belongs to a compound sentence. Taking into account the shift of sentiment polarity between clauses, the text sentiment score is computed according to the following cases:
a) Transition relation: when a semantic-reversal word such as "still" or "however" appears in the clause, the polarity of the preceding clause changes, and the overall polarity of the two clauses is the same as that of the following clause. A second weight factor $z_2 = -1$ is introduced, and the text sentiment score is:
$Sen = z_2 \cdot Sen_1 + Sen_2$ (8)
b) Progressive relation: the two clauses have the same polarity and the intensity is strengthened. A third weight factor $z_3 = 1.5$ is introduced, and the text sentiment score is:
$Sen = z_3 \cdot (Sen_1 + Sen_2)$ (9)
c) Concession relation: the polarity of the following clause is reversed, and the polarity of the whole sentence is the same as that of the preceding clause. A fourth weight factor $z_4 = -1$ is introduced, and the text sentiment score is:
$Sen = Sen_1 + z_4 \cdot Sen_2$ (10)
where $Sen_1$ is the text sentiment score of the preceding clause and $Sen_2$ is the text sentiment score of the following clause;
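Formulas (8) to (10) differ only in where the weight factor is applied, which a short sketch makes easy to compare. The function name and the relation labels are hypothetical; the weight values are the ones stated in the text.

```python
Z2 = -1.0  # transition relation
Z3 = 1.5   # progressive relation
Z4 = -1.0  # concession relation


def combine_clauses(sen1: float, sen2: float, relation: str) -> float:
    """Combine the scores of two adjacent clauses joined by a connective."""
    if relation == "transition":
        return Z2 * sen1 + sen2        # formula (8)
    if relation == "progressive":
        return Z3 * (sen1 + sen2)      # formula (9)
    if relation == "concession":
        return sen1 + Z4 * sen2        # formula (10)
    raise ValueError(f"unknown relation: {relation}")
```

With a positive first clause and a negative second clause, the transition case yields a negative total (the second clause dominates) while the concession case yields a positive one (the first clause dominates).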
3) Determine the emoticon score of the microblog;
Using the emoticon dictionary, the polarity and intensity of every emoticon in the microblog are looked up, and the number of occurrences of each emoticon is recorded. Let $N_i$ be the number of occurrences of the i-th emoticon, $e_i$ its intensity and $p_i$ its polarity; the emoticon score of the microblog is then computed as:
4) Compute the weighted sum of the text sentiment score and the emoticon score to obtain the sentiment score of each microblog:
$S_1 = \alpha \cdot score_{emo} + \beta \cdot score_{text}$ (12)
where α and β are adjustable weights with values in (0, 1) and α + β = 1; the values of α and β that maximize the probability of correct classification can be selected by validation on a cross-validation test set. $score_{text}$ is the text sentiment score of the microblog, defined as the mean of the text sentiment scores of its clauses.
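The weighted combination of formula (12) can be sketched as follows. Formula (11) is not reproduced in this text, so the emoticon score here assumes the simplest form consistent with the symbol definitions, a sum of $N_i \cdot e_i \cdot p_i$ over the emoticons found; that form, the function names and the dictionary layout are all assumptions.

```python
from typing import Dict, List, Tuple


def emoticon_score(counts: Dict[str, int],
                   lexicon: Dict[str, Tuple[float, int]]) -> float:
    """Assumed form of formula (11): sum of N_i * e_i * p_i.

    counts maps emoticon name -> occurrences N_i;
    lexicon maps emoticon name -> (intensity e_i, polarity p_i).
    Emoticons missing from the lexicon are ignored.
    """
    total = 0.0
    for name, n in counts.items():
        if name in lexicon:
            intensity, polarity = lexicon[name]
            total += n * intensity * polarity
    return total


def microblog_score(clause_scores: List[float], score_emo: float,
                    alpha: float) -> float:
    """Formula (12) with beta = 1 - alpha and score_text as the clause mean."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    beta = 1.0 - alpha
    score_text = sum(clause_scores) / len(clause_scores) if clause_scores else 0.0
    return alpha * score_emo + beta * score_text
```

In practice `alpha` would be tuned on held-out data, as the text describes; the sketch only shows how the two scores are combined once it is chosen.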
In step (4), the method of classifying each microblog as subjective or objective according to the sentiment score obtained in step (3), so as to filter out objective microblogs such as news reports and retain subjective ones, and finally obtaining the microblog's threat score with respect to civil aviation safety, is as follows:
The microblog text is first classified as subjective or objective:
1) For a microblog with sentiment score $S_1 = 0$: if it contains a first-person noun or pronoun, it is considered a subjective microblog text; otherwise it is an objective microblog text;
2) For a microblog with sentiment score $S_1 \neq 0$: if it contains predicate words characteristic of news reports, or the forwarding count in the microblog text is at least 2, it is considered an objective microblog text; otherwise it is a subjective microblog text;
The threat score of an objective microblog text is set to 0 and no threat score calculation is performed for it. The threat score is calculated only for subjective microblogs, as shown in formula (13):
where D is the threat score, in the range [-10, 10]; $S_1$ is the sentiment score of the microblog text; $S_2\langle w_1, w_2\rangle$ is the civil aviation security public opinion threat score, in which $w_1$ denotes a place word and $w_2$ denotes a behavior word;
The civil aviation security public opinion threat score $S_2\langle w_1, w_2\rangle$ is calculated as follows. The behavior word $w_2$ in the microblog text is looked up and its type is determined. When the behavior word is of the direct type, $S_2\langle w_1, w_2\rangle$ takes the intensity of the behavior word. When the behavior word is of the indirect type, the microblog text is checked for a co-occurring place word: if one is present, $S_2\langle w_1, w_2\rangle$ takes the intensity of the behavior word; if not, $S_2\langle w_1, w_2\rangle$ is 0.
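The subjective/objective filter and the keyword threat score $S_2$ are both simple rule lookups, sketched below. Formula (13) is not reproduced in this text, so the sketch stops before combining $S_1$ and $S_2$ into D. All names, the dictionary layout, the use of the first behavior word found, and the `forward_count` reading of the forwarding criterion are assumptions.

```python
from typing import Dict, List, Set, Tuple


def is_subjective(s1: float, tokens: List[str], first_person: Set[str],
                  news_words: Set[str], forward_count: int) -> bool:
    """Apply the two subjectivity rules to one segmented microblog."""
    if s1 == 0:
        # Rule 1: neutral score, subjective only with a first-person word.
        return any(t in first_person for t in tokens)
    # Rule 2: non-zero score, objective if it looks like a forwarded news item.
    objective = any(t in news_words for t in tokens) or forward_count >= 2
    return not objective


def threat_keyword_score(tokens: List[str],
                         behavior_lexicon: Dict[str, Tuple[str, float]],
                         place_words: Set[str]) -> float:
    """Return S2 for the first behavior word found (assumption), else 0.

    behavior_lexicon maps word -> (type, intensity), type being
    "direct" or "indirect".
    """
    has_place = any(t in place_words for t in tokens)
    for t in tokens:
        if t in behavior_lexicon:
            kind, intensity = behavior_lexicon[t]
            if kind == "direct" or has_place:
                return intensity
            return 0.0
    return 0.0
```

An indirect behavior word such as a hypothetical `"protest"` entry therefore contributes only when a place word like `"airport"` appears in the same microblog.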
In step (5), the method of determining, from the threat score obtained in step (4), the threat level that the statements in the microblog text pose to civil aviation safety is as follows:
When the threat score D > 0, the microblog text expresses positive sentiment and is a safe statement, so no threat level is determined. When D ≤ 0, the microblog text is judged to contain civil aviation security public opinion keywords and to express negative sentiment, so it needs attention, and its threat level is determined according to the following grading standard. The grading standard was obtained from experiments on existing microblog texts:
1) Low threat when -4.5 ≤ D ≤ 0;
2) Medium threat when -7 ≤ D < -4.5;
3) High threat when -10 ≤ D < -7.
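The grading standard maps directly onto a chain of comparisons. The sketch below reads the last band as -10 ≤ D < -7, which is the only reading that makes the three bands contiguous; the function name and the string labels are hypothetical.

```python
from typing import Optional


def threat_level(d: float) -> Optional[str]:
    """Map a threat score D in [-10, 10] to a level; None means no grading."""
    if not -10.0 <= d <= 10.0:
        raise ValueError("threat score must lie in [-10, 10]")
    if d > 0:
        return None      # positive sentiment, treated as a safe statement
    if d >= -4.5:
        return "low"     # -4.5 <= D <= 0
    if d >= -7.0:
        return "medium"  # -7 <= D < -4.5
    return "high"        # -10 <= D < -7
```

Microblogs graded `"high"` are the ones the method would report to the relevant departments as early-warning information.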
The civil aviation security public opinion sentiment analysis method provided by the invention has the following advantages: (1) The invention determines the sentiment score of a microblog text by combining text semantics with microblog emoticons, which overcomes the limitations of dictionaries and semantic rules and improves the accuracy of sentiment scoring. (2) The threat score of a microblog text is calculated on the basis of its text sentiment score and a threat level is obtained from it, which improves the early-warning capability of civil aviation public security departments and is of great significance. (3) The characteristics of microblog text are fully used, so the threat level is determined more reasonably. (4) Unlike machine learning methods, the invention does not need to be trained on large-scale labeled data and is therefore better suited to real-time data stream processing.
Brief description of the drawings
Fig. 1 is a flow chart of the civil aviation security public opinion sentiment analysis method provided by the invention.
Fig. 2 is a flow chart of the sentiment score calculation method in the invention.
Specific embodiment
The civil aviation security public opinion sentiment analysis method provided by the invention is described in detail below with reference to the drawings and a specific embodiment.
As shown in Fig. 1, the civil aviation security public opinion sentiment analysis method provided by the invention comprises the following steps, carried out in order:
(1) Retrieve, preprocess and segment microblog texts on the Internet that contain civil aviation security public opinion keywords;
Microblog texts on the Internet that contain civil aviation security public opinion keywords are crawled as the analysis objects of the invention, and keywords related to civil aviation security public opinion are retrieved from them. Keywords are divided into two classes, place words and behavior words. To ensure comprehensive data collection, the retrieval strategy comprises single-word retrieval and combined retrieval. Single-word retrieval searches the two classes of words separately: place words include airport, runway, terminal and flight, and behavior words include aircraft bombing, hijacking, in-flight disturbance, fighting and protest. Combined retrieval uses a place + behavior pattern, such as "airport + bomb", "flight + explosion" and "airport + hijacking". The retrieval results are stored in a database;
In addition, to improve system efficiency, a "white list" of microblog users is set up. The users on this list are the official microblogs of airport public security authorities and of news portal websites. These users often post microblogs that contain civil aviation security public opinion keywords but are outside the scope of early-warning monitoring, so they are removed during keyword retrieval.
The retrieval results are then preprocessed to remove noise information that is unrelated to emotional expression, such as: (1) web links of the form "http://t.cn/Rtj0WWN", which contain no useful information and are removed during preprocessing; (2) user nicknames in forwarded and replied microblogs, topic tags, special characters and the like, of the form "@Li Qiongzi: reply @sketch craftsman Lao Wang: ...", where the microblog user names after the @ symbol need to be removed. Emoticons are also extracted. In the crawled microblog text, emoticons are represented as text in square brackets, for example "This is the witness of word of mouth and effect [strong]", in which the emoticon appears as "[strong]". The text inside the square brackets is extracted and used to calculate the sentiment value of the emoticon.
The preprocessed results are then segmented with a word segmentation tool; the tool used is Ansj, an open-source Java word segmenter.
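The noise removal and emoticon extraction described above can be sketched with a few regular expressions. The patterns below are assumptions about how such text looks (`#topic#` tags, `@name:` mentions, ASCII short links), and special-character cleanup and the Ansj segmentation step are deliberately left out.

```python
import re
from typing import List, Tuple

URL_RE = re.compile(r"https?://[A-Za-z0-9./?=&%_-]+")
MENTION_RE = re.compile(r"@[^\s:：]+[:：]?")   # "@name" with optional colon
TOPIC_RE = re.compile(r"#[^#]+#")             # "#topic#" tags
EMOTICON_RE = re.compile(r"\[([^\[\]]+)\]")   # "[strong]" style emoticons


def preprocess(text: str) -> Tuple[str, List[str]]:
    """Return the cleaned text and the list of emoticon names found in it."""
    emoticons = EMOTICON_RE.findall(text)
    text = EMOTICON_RE.sub("", text)
    text = URL_RE.sub("", text)
    text = TOPIC_RE.sub("", text)
    text = MENTION_RE.sub("", text)
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split()), emoticons
```

The extracted emoticon names would then be looked up in the emoticon dictionary, while the cleaned text goes on to segmentation.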
(2) Construct the dictionaries needed for semantic analysis of microblog text, either by selecting existing dictionaries or by building them independently. The dictionaries are composed as follows:
1) Sentiment dictionary: the mainstream sentiment dictionaries at present include the National Taiwan University NTUSD dictionary, the HowNet Chinese-English sentiment dictionary and the Dalian University of Technology sentiment vocabulary ontology library. Among these, the Dalian University of Technology sentiment vocabulary ontology library divides words into three classes (positive, negative and neutral) and labels sentiment with five intensity levels, 1, 3, 5, 7 and 9, which is convenient for calculating the sentiment value of microblog text. This dictionary is therefore selected as the sentiment dictionary of the invention.
2) Negation dictionary: when a negation word appears before a sentiment word, the sentiment polarity is reversed. The invention has collected 19 common negation words and built them into a negation dictionary. The negation words are: not, do not have, nothing, it is non-, not, not, not, not, It is no, other, be not nothing but, not enough, or not never, may not, not have, not be difficult to, not.
3) Modifier dictionary: according to the semantic rules of Chinese grammar, the strength of a sentiment value is directly related to modification by degree adverbs, so it is necessary to use modifiers as one of the judgment rules. The modifier dictionary of the invention uses the Chinese degree-level words in HowNet, which are divided into 6 levels, as shown in Table 1.
Table 1. Examples from the modifier dictionary
4) Conjunction dictionary: in a compound sentence, conjunctions cause changes in sentiment polarity and thus affect the judgment of sentiment polarity. The invention selects conjunctions expressing transition, progressive and concession relations as the conjunction dictionary. The conjunctions indicating a transition relation are: still but however and can be but be exactly, be, can; the conjunctions indicating a progressive relation are: and even, so that and especially; the conjunctions indicating a concession relation are: even if although even if although, even,.
5) Emoticon dictionary: Sina Weibo provides a large number of emoticons. The invention selects the built-in emoticons of Sina Weibo, divides their sentiment polarity into two classes, positive and negative, and labels their intensity manually, as shown in Table 2.
Table 2. Examples from the emoticon dictionary
6) Network buzzword dictionary: as a social medium, microblog text is informal and colloquial, so Internet slang is used very frequently. These popular words are not included in traditional sentiment dictionaries, so it is necessary to add new Internet words to the sentiment dictionary. The website http://wangci.net offers a dictionary of popular Internet terms with definitions. The invention crawled 494 popular Internet words from this website, labeled their sentiment polarity and intensity according to the same criteria as the sentiment dictionary, and built them into a network buzzword dictionary that supplements the sentiment dictionary.
7) Civil aviation security public opinion dictionary: the place words and behavior words selected from the civil aviation security public opinion keywords are built into a civil aviation security public opinion dictionary. In addition to the basic words, the invention collects homophone misspellings of common words to make the dictionary more complete. For example, "诈" ("swindle") is a homophone misspelling of "炸" ("to bomb"), and "截" ("cut") is a homophone misspelling of "劫" ("to hijack"), so the words added to the dictionary include "诈机", "诈弹" and "截机".
(3) Using the dictionaries constructed in step (2), score the microblogs segmented in step (1) to obtain the sentiment score of each microblog;
The specific steps are shown in Fig. 2.
1) Extract or determine sentiment words from the microblog text segmented in step (1):
The words obtained by segmenting the microblog text are matched against the sentiment dictionary and the network buzzword dictionary; a word that is present in either of these two dictionaries is selected as a sentiment word;
If a word appears in neither the sentiment dictionary nor the network buzzword dictionary, a semantic similarity method is used to decide whether it is a sentiment word. To reduce computation, only nouns, verbs and adjectives are retained as candidate sentiment words. The invention uses the HowNet semantic similarity algorithm as its base algorithm, which performs well in measuring the similarity of two words. Specifically, for two words $w_1$ and $w_2$, if $w_1$ has n senses (concepts) $x_1, x_2, \ldots, x_n$ and $w_2$ has m senses (concepts) $y_1, y_2, \ldots, y_m$, the similarity of $w_1$ and $w_2$ is defined as the maximum of the similarities between their senses, that is:
The similarity of two sememes is computed as:
where λ is an adjustable positive parameter and $d(x_1, y_2)$ is the distance between sememes $x_1$ and $y_2$ in the sememe hierarchy tree.
For any word, its sentiment orientation value can be obtained by calculating its similarity to the seed words in the sentiment dictionary. The calculation is as follows: using formulas (1) and (2), word w is compared with each seed word in the positive sentiment dictionary to obtain its similarity to the positive seed words, and then with each seed word in the negative sentiment dictionary to obtain its similarity to the negative seed words. The sentiment orientation value of w is finally obtained from the difference between the two mean similarities, computed as follows:
where $p_i$ denotes a positive seed word and $n_j$ denotes a negative seed word; the sentiment orientation value $S_w$ lies in the range (-1, 1). A threshold T is set, and the computed $S_w$ is compared with T to decide whether w is a sentiment word: when $|S_w| > T$, w is judged to be a sentiment word. The intensity of the sentiment word is set to $10 S_w$ so that it is consistent with the intensity scale of the sentiment dictionary.
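The seed-word comparison can be sketched as follows. Formula (3) is not reproduced in this text, so the sketch assumes the orientation value is the mean similarity to the positive seeds minus the mean similarity to the negative seeds; the word similarity is passed in as a callable `sim` because the HowNet computation itself is outside this sketch, and all names are hypothetical.

```python
from typing import Callable, Optional, Sequence


def orientation(word: str, pos_seeds: Sequence[str], neg_seeds: Sequence[str],
                sim: Callable[[str, str], float]) -> float:
    """Assumed form of S_w: mean positive similarity minus mean negative."""
    pos = sum(sim(word, p) for p in pos_seeds) / len(pos_seeds)
    neg = sum(sim(word, n) for n in neg_seeds) / len(neg_seeds)
    return pos - neg


def sentiment_intensity(s_w: float, threshold: float) -> Optional[float]:
    """Return 10 * S_w when |S_w| exceeds the threshold T, else None."""
    if abs(s_w) > threshold:
        return 10 * s_w
    return None
```

Scaling by 10 places the result on roughly the same numeric range as the 1 to 9 intensity levels of the sentiment dictionary, with the sign carrying the polarity.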
2) Determine the text sentiment score of each microblog clause that contains the above sentiment words.
2.1) If a microblog clause contains a sentiment word that is preceded by a negation word from the negation dictionary or a modifier from the modifier dictionary, the text sentiment score Sa of the clause is calculated according to the following cases:
a) Degree adverb + sentiment word: the intensity of the sentiment word changes with the strength of the adverb, and the text sentiment score is:
Sa = Ma · ps · pa (4)
b) Negation word + sentiment word: the polarity of the sentiment word changes according to the number n of negation words, and the text sentiment score is:
Sa = (-1)^n · ps · pa (5)
c) Degree adverb + negation word + sentiment word: the polarity of the sentiment word is reversed and its intensity changes with the strength of the adverb, and the text sentiment score is:
Sa = (-1) · Ma · ps · pa (6)
d) Negation word + degree adverb + sentiment word: because the negation appears before the degree adverb, the polarity of the sentiment word is reversed but its intensity is weaker than under direct negation. A first weight factor z1 = 0.5 is introduced, and the text sentiment score is:
Sa = (-1) · Ma · ps · pa · z1 (7)
Here ps denotes the intensity of the sentiment word, pa denotes its polarity, and Ma denotes the intensity of the degree adverb.
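The four cases of formulas (4) to (7) can be sketched as one function. The representation of the modifiers as an ordered list of tags ("degree", "neg") and the numeric values in the usage lines are assumptions made for illustration; only z1 = 0.5 and the formulas themselves come from the text.

```python
Z1 = 0.5  # first weight factor from the text


def clause_score(ps: float, pa: int, modifiers, degree_strength: float = 1.0) -> float:
    """Score a clause from its sentiment word and the modifiers that precede it.

    ps: intensity of the sentiment word; pa: its polarity (+1 or -1).
    modifiers: tags in the order they appear before the sentiment word.
    """
    base = ps * pa
    negations = modifiers.count("neg")
    if "degree" not in modifiers:
        # Case b (formula 5); with no negation this is the unmodified score.
        return (-1) ** negations * base
    if negations == 0:
        # Case a (formula 4).
        return degree_strength * base
    if modifiers.index("degree") < modifiers.index("neg"):
        # Case c (formula 6): degree adverb, then negation.
        return -1 * degree_strength * base
    # Case d (formula 7): negation, then degree adverb, weakened by z1.
    return -1 * degree_strength * base * Z1


score_a = clause_score(5, 1, ["degree"], degree_strength=2.0)
score_b = clause_score(5, 1, ["neg", "neg"])
score_c = clause_score(5, 1, ["degree", "neg"], degree_strength=2.0)
score_d = clause_score(5, 1, ["neg", "degree"], degree_strength=2.0)
```

The only difference between cases c and d is the order of the two modifiers, which is why the sketch keeps the modifiers as an ordered list rather than a set.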
2.2) If a microblog clause contains an adversative conjunction from the conjunction dictionary, the clause belongs to a compound sentence. Taking into account the shift of sentiment polarity between clauses, the text sentiment score is calculated according to the following cases:
a) Adversative relation: when semantic reversal words such as "but" or "however" appear in the microblog clause, the polarity of the preceding clause changes, and the overall polarity of the two clauses is the same as that of the following clause. A second weight factor z2 = -1 is introduced, and the text sentiment score is:
Sen = z2 · Sen1 + Sen2 (8)
b) Progressive relation: the two clauses have the same polarity and the intensity is strengthened. A third weight factor z3 = 1.5 is introduced, and the text sentiment score is:
Sen = z3 · (Sen1 + Sen2) (9)
c) Concessive relation: the polarity of the following clause is reversed, and the polarity of the whole sentence is the same as that of the preceding clause. A fourth weight factor z4 = -1 is introduced, and the text sentiment score is:
Sen = Sen1 + z4 · Sen2 (10)
Here Sen1 denotes the text sentiment score of the preceding microblog clause and Sen2 denotes the text sentiment score of the following microblog clause.
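Formulas (8) to (10) translate directly into code. In the sketch below the relation labels ("adversative", "progressive", "concessive") and the example scores are hypothetical; the weight factors are those given in the text.

```python
Z2, Z3, Z4 = -1.0, 1.5, -1.0  # second, third and fourth weight factors


def compound_score(sen1: float, sen2: float, relation: str) -> float:
    """Combine the scores of two adjacent clauses according to their relation."""
    if relation == "adversative":   # formula (8)
        return Z2 * sen1 + sen2
    if relation == "progressive":   # formula (9)
        return Z3 * (sen1 + sen2)
    if relation == "concessive":    # formula (10)
        return sen1 + Z4 * sen2
    raise ValueError(f"unknown relation: {relation}")


# A positive first clause followed by a negative second clause.
adversative = compound_score(3, -4, "adversative")
concessive = compound_score(3, -4, "concessive")
progressive = compound_score(2, 4, "progressive")
```

With the same pair of clause scores, the adversative case ends up with the polarity of the second clause and the concessive case with the polarity of the first.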
3) Determine the emoticon score of the microblog.
Sina Weibo provides a large number of emoticons, and the emoticons used in a microblog can vividly express its sentiment orientation. Treating the emoticons as a weighted term of the sentiment score helps to correct the sentiment orientation judgment for the whole microblog text. According to the emoticon dictionary, the polarity and intensity of every emoticon in the microblog are looked up, and the number of occurrences of each emoticon is recorded. Let Ni be the count of the i-th emoticon, ei its intensity and pi its polarity; the emoticon score of the microblog is then calculated by formula (11).
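As a rough sketch, the emoticon score could be computed as below. Formula (11) is not reproduced in the text, so the sketch assumes a plain sum of count × intensity × polarity over the emoticons found; the actual formula may include a normalisation that is not shown here. The tuples in the usage line are hypothetical.

```python
def emoticon_score(emoticons) -> float:
    """ASSUMED form of formula (11): sum of N_i * e_i * p_i.

    emoticons: iterable of (count, intensity, polarity) tuples,
    one per distinct emoticon found in the microblog.
    """
    return sum(count * intensity * polarity for count, intensity, polarity in emoticons)


# Two copies of a positive emoticon of intensity 3, one negative of intensity 5.
score_emo = emoticon_score([(2, 3, 1), (1, 5, -1)])
```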
4) The text sentiment score and the emoticon score of the microblog are combined by weighted summation to obtain the sentiment score of each microblog:
S1 = α · score_emo + β · score_text (12)
Here α and β are adjustable weights in the range (0, 1) with α + β = 1; the values of α and β that maximize the probability of correct classification can be selected by validation on a cross test set. score_text is the text sentiment score of the microblog, defined as the average of the text sentiment scores of its clauses. When the sentiment score S1 is positive, the microblog is judged to express positive sentiment; when S1 is negative, the microblog is judged to express negative sentiment.
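Formula (12) can be sketched as follows. The function name and the value α = 0.5 are illustrative only; the text leaves α and β to be tuned.

```python
def microblog_sentiment(score_emo: float, clause_scores, alpha: float) -> float:
    """Formula (12): S1 = alpha * score_emo + beta * score_text, with beta = 1 - alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    beta = 1 - alpha
    # score_text is the average of the clause-level text sentiment scores.
    score_text = sum(clause_scores) / len(clause_scores)
    return alpha * score_emo + beta * score_text


s1 = microblog_sentiment(score_emo=2.0, clause_scores=[4.0, -2.0], alpha=0.5)
polarity = "positive" if s1 > 0 else "negative" if s1 < 0 else "neutral"
```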
(4) Subjective/objective discrimination is performed on the microblog according to the sentiment score obtained in step (3), in order to filter out objective microblogs such as news reports and retain subjective microblogs; the threat score of the microblog with respect to civil aviation safety is finally obtained.
The microblog text is first classified as subjective or objective as follows:
1) For a microblog with sentiment score S1 = 0, if it contains a first-person noun or pronoun it is considered a subjective microblog text; otherwise it is an objective microblog text.
2) For a microblog with sentiment score S1 ≠ 0, if it contains predicate words characteristic of news reports, or if the number of reposts in the microblog text is at least 2, it is considered an objective microblog text; otherwise it is a subjective microblog text.
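The two rules can be written as a small predicate. The boolean inputs are assumed to come from earlier dictionary lookups that are not shown; the parameter names are hypothetical.

```python
def is_subjective(s1: float, has_first_person: bool,
                  has_news_predicate: bool, repost_count: int) -> bool:
    """Apply rules 1) and 2) for subjective/objective discrimination."""
    if s1 == 0:
        # Rule 1: a neutral score is subjective only with a first-person word.
        return has_first_person
    # Rule 2: news-style predicates or at least 2 reposts mark the text as objective.
    return not (has_news_predicate or repost_count >= 2)
```

For example, `is_subjective(-3.0, False, False, 1)` is true, while the same microblog with two reposts is treated as objective.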
The threat score of an objective microblog text is set to 0 and no further threat score calculation is performed for it. Only the threat score of a subjective microblog is calculated, as shown in formula (13).
In formula (13), D denotes the threat score, which lies in the range [-10, 10]; S1 denotes the sentiment score of the microblog text; S2<w1, w2> is the civil aviation security public opinion threat score, where w1 denotes a place word and w2 denotes a behavior word.
In the civil aviation security public opinion dictionary, place words include airport, runway, terminal, flight and so on, and behavior words include aircraft bombing, hijacking, in-flight disturbance, fighting, protest and so on. Each behavior word has two attributes. The first attribute is intensity, which measures the degree to which the word threatens civil aviation security; it takes one of five values, 1, 3, 5, 7 or 9, on the same scale as the intensity of sentiment words. The second attribute is word type, of which there are two. The direct type covers words whose mere occurrence is enough to establish a threat to civil aviation, such as aircraft bombing, hijacking or occupying an aircraft. The indirect type covers words that establish a threat to civil aviation security only when they occur together with a place word, such as fighting, protest or smoking. When only an indirect behavior word is present, this is not sufficient to conclude that there is a threat to civil aviation security.
The civil aviation security public opinion threat score S2<w1, w2> is calculated as follows. The behavior word w2 is looked up in the microblog text and its type is determined. When the behavior word is of the direct type, S2<w1, w2> takes the intensity of the behavior word. When the behavior word is of the indirect type, it is checked whether a place word is also present in the microblog text; if so, S2<w1, w2> takes the intensity of the behavior word, and if not, S2<w1, w2> is 0.
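The lookup of S2<w1, w2> can be sketched as below. The lexicon entries and their intensities are hypothetical examples, and taking the maximum when several behavior words occur is an assumption, since the text does not say how multiple behavior words are handled.

```python
# Hypothetical excerpt of the civil aviation security dictionary:
# behavior word -> (intensity, word type)
BEHAVIOR_WORDS = {
    "hijacking": (9, "direct"),
    "fighting": (5, "indirect"),
}
PLACE_WORDS = {"airport", "runway", "terminal", "flight"}


def threat_score_s2(tokens, behavior_words=BEHAVIOR_WORDS, place_words=PLACE_WORDS) -> int:
    """Return S2 for a segmented microblog, or 0 if no behavior word qualifies."""
    has_place = any(token in place_words for token in tokens)
    best = 0
    for token in tokens:
        if token in behavior_words:
            intensity, word_type = behavior_words[token]
            # Direct words always count; indirect words need a place word.
            if word_type == "direct" or has_place:
                best = max(best, intensity)  # ASSUMPTION: keep the strongest match
    return best
```

With this lexicon, "fighting" alone scores 0, whereas "fighting" together with "airport" scores 5.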
(5) The threat level that the statements in the microblog text pose to civil aviation safety is determined from the threat score obtained in step (4); key persons with a high threat level are then screened out and reported to the relevant departments as early-warning information.
From the threat score obtained in step (4): when the threat score D > 0, the microblog text expresses positive sentiment and is a safe statement, so no threat level is determined. When the threat score D ≤ 0, the microblog text is judged to contain civil aviation security public opinion keywords and to express negative sentiment, which requires close attention; its threat level is then determined according to the following threat level standard. The standard was obtained by testing on existing microblog texts and is as follows:
1) Low threat when -4.5 ≤ D ≤ 0.
2) Medium threat when -7 ≤ D < -4.5.
3) High threat when -10 ≤ D < -7.
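The grading rule can be sketched as a simple cascade of comparisons. The sketch reads the third band as -10 ≤ D < -7, which is what the lower bound of the medium band implies; the returned labels are illustrative.

```python
def threat_level(d: float):
    """Map a threat score D in [-10, 10] to a level, or None for safe statements."""
    if not -10 <= d <= 10:
        raise ValueError("D is expected to lie in [-10, 10]")
    if d > 0:
        return None        # positive sentiment: no level is assigned
    if d >= -4.5:
        return "low"
    if d >= -7:
        return "medium"
    return "high"          # -10 <= D < -7
```

The boundary values fall into the milder band: D = -4.5 is graded low and D = -7 is graded medium.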
Table 3 lists the threat scores and threat levels obtained after processing a number of microblog texts with the method of the invention. As the table shows, the method can accurately determine whether a microblog text poses a threat to civil aviation safety.
Table 3. Threat determination results for microblog texts

Claims (2)

1. A civil aviation security public opinion sentiment analysis method, comprising the following steps performed in order:
(1) retrieving, preprocessing and segmenting microblog texts on the Internet that contain civil aviation security public opinion keywords;
(2) constructing the dictionaries required for semantic analysis of microblog text, either by selecting existing dictionaries or by building them independently;
(3) scoring the microblog segmented in step (1) according to the dictionaries constructed in step (2) to obtain the sentiment score of the microblog;
(4) performing subjective/objective discrimination on the microblog according to the sentiment score obtained in step (3), so as to filter out objective microblogs such as news reports and retain subjective microblogs, and finally obtaining the threat score of the microblog with respect to civil aviation safety;
(5) determining, from the threat score obtained in step (4), the threat level that the statements in the microblog text pose to civil aviation safety, wherein when the threat score D > 0 the microblog text expresses positive sentiment and is a safe statement, so no threat level is determined; when the threat score D ≤ 0 the microblog text is judged to contain civil aviation security public opinion keywords and to express negative sentiment, which requires close attention, and its threat level is determined according to the following threat level standard, which was obtained by testing on existing microblog texts:
1) low threat when -4.5 ≤ D ≤ 0;
2) medium threat when -7 ≤ D < -4.5;
3) high threat when -10 ≤ D < -7;
key persons with a high threat level are then screened out and used as early-warning information;
characterized in that, in step (3), the method of scoring the microblog segmented in step (1) according to the dictionaries constructed in step (2) to obtain the sentiment score of the microblog comprises the following steps:
1) extracting or determining sentiment words from the microblog text segmented in step (1):
sentiment words are extracted by matching the words obtained by segmenting the microblog text against the sentiment dictionary and the network hot-word dictionary; if a word is present in these two dictionaries, it is selected as a sentiment word;
sentiment words are determined by applying a semantic similarity method to words that appear in neither the sentiment dictionary nor the network hot-word dictionary; specifically, for two words w1 and w2, if w1 has n senses or concepts x1, x2, …, xn and w2 has m senses or concepts y1, y2, …, ym, the similarity of w1 and w2 is defined as the maximum similarity over all pairs of senses or concepts, namely:
Sim(w1, w2) = max Sim(xi, yj), i = 1, …, n; j = 1, …, m (1)
the similarity of two sememes is calculated by formula (2), in which λ is a positive adjustable parameter and d(x1, y2) is the distance between sememe x1 and sememe y2 in the hierarchy tree;
the similarity between word w and each seed word in the positive sentiment dictionary is calculated by formulas (1) and (2), then the similarity between word w and each seed word in the negative sentiment dictionary is calculated, and the sentiment orientation value of word w is obtained from the mean difference between them, as expressed by formula (3), in which pi denotes a positive seed word and nj denotes a negative seed word; the sentiment orientation value Sw lies in the range (-1, 1); a threshold T is set and the calculated value Sw is compared with T to decide whether word w is a sentiment word; when |Sw| > T, word w is judged to be a sentiment word, and the intensity of that sentiment word is set to 10·Sw;
2) determining the text sentiment score of each microblog clause that contains the above sentiment words;
2.1) if a microblog clause contains a sentiment word that is preceded by a negation word from the negation dictionary or a modifier from the modifier dictionary, calculating the text sentiment score Sa of the clause according to the following cases:
a) degree adverb + sentiment word: the intensity of the sentiment word changes with the strength of the adverb, and the text sentiment score is:
Sa = Ma · ps · pa (4)
b) negation word + sentiment word: the polarity of the sentiment word changes according to the number n of negation words, and the text sentiment score is:
Sa = (-1)^n · ps · pa (5)
c) degree adverb + negation word + sentiment word: the polarity of the sentiment word is reversed and its intensity changes with the strength of the adverb, and the text sentiment score is:
Sa = (-1) · Ma · ps · pa (6)
d) negation word + degree adverb + sentiment word: because the negation appears before the degree adverb, the polarity of the sentiment word is reversed but its intensity is weaker than under direct negation; a first weight factor z1 = 0.5 is introduced, and the text sentiment score is:
Sa = (-1) · Ma · ps · pa · z1 (7)
where ps denotes the intensity of the sentiment word, pa denotes its polarity, and Ma denotes the intensity of the degree adverb;
2.2) if a microblog clause contains an adversative conjunction from the conjunction dictionary, the clause belongs to a compound sentence, and, taking into account the shift of sentiment polarity between clauses, calculating the text sentiment score according to the following cases:
a) adversative relation: when semantic reversal words including "but" and "however" appear in the microblog clause, the polarity of the preceding clause changes and the overall polarity of the two clauses is the same as that of the following clause; a second weight factor z2 = -1 is introduced, and the text sentiment score is:
Sen = z2 · Sen1 + Sen2 (8)
b) progressive relation: the two clauses have the same polarity and the intensity is strengthened; a third weight factor z3 = 1.5 is introduced, and the text sentiment score is:
Sen = z3 · (Sen1 + Sen2) (9)
c) concessive relation: the polarity of the following clause is reversed and the polarity of the whole sentence is the same as that of the preceding clause; a fourth weight factor z4 = -1 is introduced, and the text sentiment score is:
Sen = Sen1 + z4 · Sen2 (10)
where Sen1 denotes the text sentiment score of the preceding microblog clause and Sen2 denotes the text sentiment score of the following microblog clause;
3) determining the emoticon score of the microblog:
according to the emoticon dictionary, the polarity and intensity of every emoticon in the microblog are looked up and the number of occurrences of each emoticon is recorded; with Ni the count of the i-th emoticon, ei its intensity and pi its polarity, the emoticon score of the microblog is calculated by formula (11);
4) combining the text sentiment score and the emoticon score of the microblog by weighted summation to obtain the sentiment score of each microblog:
S1 = α · score_emo + β · score_text (12)
where α and β are adjustable weights in the range (0, 1) with α + β = 1, the values of α and β that maximize the probability of correct classification can be selected by validation on a cross test set, and score_text is the text sentiment score of the microblog, defined as the average of the text sentiment scores of its clauses.
2. The civil aviation security public opinion sentiment analysis method according to claim 1, characterized in that, in step (4), the method of performing subjective/objective discrimination on the microblog according to the sentiment score obtained in step (3), so as to filter out objective microblogs such as news reports and retain subjective microblogs, and finally obtaining the threat score of the microblog with respect to civil aviation safety, is as follows:
the microblog text is first classified as subjective or objective:
1) for a microblog with sentiment score S1 = 0, if it contains a first-person noun or pronoun it is considered a subjective microblog text, otherwise it is an objective microblog text;
2) for a microblog with sentiment score S1 ≠ 0, if it contains predicate words characteristic of news reports, or if the number of reposts in the microblog text is at least 2, it is considered an objective microblog text, otherwise it is a subjective microblog text;
the threat score of an objective microblog text is set to 0 and no threat score calculation is performed for it; only the threat score of a subjective microblog is calculated, as shown in formula (13), in which D denotes the threat score, which lies in the range [-10, 10], S1 denotes the sentiment score of the microblog text, and S2<w1, w2> is the civil aviation security public opinion threat score, where w1 denotes a place word and w2 denotes a behavior word;
the civil aviation security public opinion threat score S2<w1, w2> is calculated as follows: the behavior word w2 is looked up in the microblog text and its type is determined; when the behavior word is of the direct type, S2<w1, w2> takes the intensity of the behavior word; when the behavior word is of the indirect type, it is checked whether a place word is also present in the microblog text; if so, S2<w1, w2> takes the intensity of the behavior word, and if not, S2<w1, w2> is 0.
CN201611062208.1A 2016-11-25 2016-11-25 A kind of civil aviaton's security public sentiment sentiment analysis method Expired - Fee Related CN106598944B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611062208.1A CN106598944B (en) 2016-11-25 2016-11-25 A kind of civil aviaton's security public sentiment sentiment analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611062208.1A CN106598944B (en) 2016-11-25 2016-11-25 A kind of civil aviaton's security public sentiment sentiment analysis method

Publications (2)

Publication Number Publication Date
CN106598944A CN106598944A (en) 2017-04-26
CN106598944B true CN106598944B (en) 2019-03-19

Family

ID=58594761

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611062208.1A Expired - Fee Related CN106598944B (en) 2016-11-25 2016-11-25 A kind of civil aviaton's security public sentiment sentiment analysis method

Country Status (1)

Country Link
CN (1) CN106598944B (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273671B (en) * 2017-05-31 2018-03-30 江苏金琉璃科技有限公司 A kind of method and system realized medical performance and quantified
CN107291899A (en) * 2017-06-22 2017-10-24 努比亚技术有限公司 A kind of recommendation method and terminal and computer-readable recording medium based on label
CN107908715A (en) * 2017-11-10 2018-04-13 中国民航大学 Microblog emotional polarity discriminating method based on Adaboost and grader Weighted Fusion
CN107945033A (en) * 2017-11-14 2018-04-20 李勇 A kind of analysis method of network public-opinion, system and relevant apparatus
CN107943789A (en) * 2017-11-17 2018-04-20 新华网股份有限公司 Mood analysis method, device and the server of topic information
CN108021651B (en) * 2017-11-30 2020-07-28 中科金联(北京)科技有限公司 Network public opinion risk assessment method and device
CN108319587B (en) * 2018-02-05 2021-11-19 中译语通科技股份有限公司 Multi-weight public opinion value calculation method and system and computer
CN108536801A (en) * 2018-04-03 2018-09-14 中国民航大学 A kind of civil aviaton's microblogging security public sentiment sentiment analysis method based on deep learning
CN109145306A (en) * 2018-09-11 2019-01-04 刘瑞军 The three-dimensional expression generation method of text-driven
CN110084427A (en) * 2019-04-26 2019-08-02 飞叶科技股份有限公司 A kind of smart city public sentiment event prediction algorithm
CN110069786A (en) * 2019-05-06 2019-07-30 北京理琪教育科技有限公司 Analysis method, device and the equipment of language composition Sentiment orientation
CN110163688A (en) * 2019-05-30 2019-08-23 复旦大学 Commodity network public sentiment detection system
CN111104515A (en) * 2019-12-24 2020-05-05 山东众志电子有限公司 Emotional word text information classification method
CN111626050B (en) * 2020-05-25 2023-12-12 安徽理工大学 Microblog emotion analysis method based on expression dictionary and emotion general knowledge
CN111611385A (en) * 2020-05-27 2020-09-01 中航信移动科技有限公司 Flight monitoring and early warning system and method based on public opinion analysis
CN111950860B (en) * 2020-07-21 2024-04-16 中证征信(深圳)有限公司 Monitoring method and device for enterprise public opinion risk index
CN111881360A (en) * 2020-08-12 2020-11-03 杭州安恒信息技术股份有限公司 Public opinion data processing method, system, equipment and readable storage medium
CN113220962A (en) * 2020-09-10 2021-08-06 深圳信息职业技术学院 Public opinion analysis method based on internet big data
CN112016331A (en) * 2020-10-30 2020-12-01 成都智元汇信息技术股份有限公司 Passenger transport passenger emotion analysis method
CN112417258A (en) * 2020-12-02 2021-02-26 深圳市罗湖医院集团 Method, platform and terminal for crushing rumor information in health knowledge search engine
CN112364947B (en) * 2021-01-14 2021-06-29 北京育学园健康管理中心有限公司 Text similarity calculation method and device
CN114238624A (en) * 2021-06-30 2022-03-25 武汉众智数字技术有限公司 Intelligent Internet public opinion early warning and handling method and system
CN114443841A (en) * 2021-12-31 2022-05-06 深圳云天励飞技术股份有限公司 Netizen speech analysis method, device, server and storage medium
CN117010409B (en) * 2023-10-07 2023-12-12 成都中轨轨道设备有限公司 Text recognition method and system based on natural language semantic analysis
CN119336966B (en) * 2024-12-23 2025-07-11 山东理工职业学院 Network rumor recognition system based on artificial intelligence

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894102A (en) * 2010-07-16 2010-11-24 浙江工商大学 A method and device for analyzing subjective text sentiment tendency
US8165869B2 (en) * 2007-12-10 2012-04-24 International Business Machines Corporation Learning word segmentation from non-white space languages corpora
CN103207860A (en) * 2012-01-11 2013-07-17 北大方正集团有限公司 Method and device for extracting entity relationships of public sentiment events
CN103530360A (en) * 2013-10-12 2014-01-22 广西师范学院 Network Social Influence Maximization Algorithm Based on Microblog Text Emotional Computation
CN103559233A (en) * 2012-10-29 2014-02-05 中国人民解放军国防科学技术大学 Extraction method for network new words in microblogs and microblog emotion analysis method and system
CN104516962A (en) * 2014-12-18 2015-04-15 北京牡丹电子集团有限责任公司数字电视技术中心 Monitoring method and system for microblogging public opinion
CN104537097A (en) * 2015-01-09 2015-04-22 成都布林特信息技术有限公司 Microblog public opinion monitoring system
CN104809104A (en) * 2015-05-11 2015-07-29 苏州大学 Method and system for identifying micro-blog textual emotion
CN105389389A (en) * 2015-12-10 2016-03-09 安徽博约信息科技有限责任公司 Network public opinion transmission situation media linked analysis method

Also Published As

Publication number Publication date
CN106598944A (en) 2017-04-26

Similar Documents

Publication Publication Date Title
CN106598944B (en) A kind of civil aviaton&#39;s security public sentiment sentiment analysis method
CN113378565B (en) Event analysis method, device, device and storage medium for multi-source data fusion
Biyani et al. " 8 amazing secrets for getting more clicks": detecting clickbaits in news streams using article informality
CN103793503B (en) Opinion mining and classification method based on web texts
Mugdha et al. Evaluating machine learning algorithms for bengali fake news detection
Kareem et al. Pakistani media fake news classification using machine learning classifiers
CN106202372A (en) A kind of method of network text information emotional semantic classification
Lin et al. Opinion mining and sentiment analysis in social networks: A retweeting structure-aware approach
CN103559233A (en) Extraction method for network new words in microblogs and microblog emotion analysis method and system
Sabuna et al. Summarizing Indonesian text automatically by using sentence scoring and decision tree
CN109597995A (en) A kind of document representation method based on BM25 weighted combination term vector
Teh et al. Profanity and hate speech detection
CN104915443A (en) Extraction method of Chinese Microblog evaluation object
Chader et al. Sentiment Analysis for Arabizi: Application to Algerian Dialect.
Buntoro et al. Sentiment analysis candidates of Indonesian Presiden 2014 with five class attribute
Ding et al. Scoring tourist attractions based on sentiment lexicon
Tiwari et al. Comparative analysis of different machine learning methods for hate speech recognition in twitter text data
Azarafza et al. Textrank-based microblogs keyword extraction method for Persian language
Dung Natural language understanding
Campbell et al. Content+ context networks for user classification in twitter
Alzuabidi et al. Hybrid technique for detecting extremism in Arabic social media texts
Buntoro Sentiments analysis for governor of east java 2018 in twitter
Shanthi et al. Suicidal Ideation Prediction Using Machine Learning
Nandan et al. Sentiment Analysis of Twitter Classification by Applying Hybrid-Based Techniques
Wang et al. Towards tracking political sentiment through microblog data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190319

Termination date: 20191125