CN103455705A - Analysis and prediction system for cooperative correlative tracking and global situation of network social events - Google Patents
- Publication number
- CN103455705A (application number CN201310197129A)
- Authority
- CN
- China
- Prior art keywords
- social
- event
- events
- information
- tracking
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention is a system for collaborative association tracking and global situation analysis and prediction of online social events. The system comprises: an information fusion unit, which fuses the multimodal characteristics of online social event data to obtain fused information from the multimodal data of social events and to build a cross-event semantic description model of multimodal data; a tracking unit, connected to the information fusion unit, which, based on that semantic description model and facing network content rich in multimedia information, addresses the cross-modal, cross-platform, and cross-spatiotemporal attributes of online social events to obtain semantic association tracking data for social events on each attribute; and an analysis and prediction unit, connected to the tracking unit, which obtains global situation analysis and prediction data for social events based on the semantic association tracking data.
Description
Technical Field
The present invention belongs to the technical field of association tracking and prediction of online social events, and relates to a framework for collaborative association tracking and global situation analysis and prediction of online social events.
Background
With the rapid spread of the Internet, the network has become an indispensable tool in most people's lives. The number of Internet users is growing rapidly and has become a very large population. According to the Statistical Report on Internet Development in China, by the end of December 2012 China had 122.7 billion web pages, an increase of 41.7% over the same period in 2011. These statistics show that we already live amid massive amounts of network data. This data must not only be stored sensibly; we also need to study how to process it in order to extract the potentially useful information hidden in it. Given the massive volume of text, images, and video on the Internet, effectively organizing, acquiring, mining, and monitoring the large-scale data generated by hot social events has become an urgent application requirement.
A hot news event evolves over time: as it develops, important new stories emerge and old ones fade. When a user searches for such an event, an important task is to discover the news topics, identify the correlations among the individual news stories, and from these build a process map of the event's development that lays out how its theme has unfolded. A user searching for a hot event therefore does not want a list of news items ranked only by textual relevance, which leaves the user to work out the relationships among the stories and the progress of the event from a disorganized pile of items. Instead, the results should present the relationships among the stages of the event, its development, and the evolution of its themes in a clear, graphical interface, so that the user can quickly grasp how the whole event has developed and quickly understand and analyze its theme. How to supply the information users need quickly, accurately, and in appropriate quantity, reveal to some degree the associations among pieces of information, present the complete evolution of a hot event, and provide intelligent information retrieval services has become a topic of common concern in academia and industry.
Because the Internet is virtual, concealed, diffuse, pervasive, and informal, more and more Internet users choose to express their views online when a hot event occurs. Today, information spreads and opinions are exchanged with unprecedented speed, and the demands expressed in online public opinion are increasingly diverse. If it is not effectively guided, negative online public opinion will pose a considerable threat to public security. Considering both the international environment and China's national conditions, network security, social stability, and the supervision of hot events all face major challenges. The rapid spread of the Jasmine Revolution across countries in the Middle East, together with the dissemination and exchange of information online, and a series of recent hot events in China that drew the attention of hundreds of millions of Internet users, such as the "Jiangsu Qidong 728 Incident" and the "Sino-Japanese Diaoyu Islands dispute", which spread rapidly online with broad participation, all show that the situation we face in analyzing social events is extremely severe. China is in a critical period of building a moderately prosperous and harmonious society in all respects and particularly needs a secure and stable political and social environment. Government bodies and the relevant administrators therefore urgently need to mine, analyze, and process online public opinion effectively and to seize the best moment for handling crisis events. Manual methods alone, however, cannot cope with collecting and processing massive amounts of network information. Multiple information technologies must be combined, and suitable computing techniques used, to analyze current hot social events and emergencies, respond promptly to negative public opinion and misleading statements online, uncover the causes behind each stage of an event as it forms online, and stop rumors and crises before they take hold, thereby improving the ability to supervise and handle hot events and emergencies.
Existing algorithms for tracking and predicting social events use only textual information from a single platform. In text understanding, clustering is a basic technique of data mining and pattern recognition, and the topics of a social event are obtained by clustering its text. Given the bursty and regional nature of social events, however, traditional clustering methods mine topics with low precision, and it is difficult to produce a semantic description of a complete event, because events carry rich visual information in addition to text. The same event draws different user comments on different websites, yet two such websites may carry very similar visual information, such as images or videos, and this information is very useful as a bridge for tracing how the items of an event evolve across time and across websites. For the event "2012 U.S. presidential election", for example, the images of Obama attached to the individual items are strongly correlated. Fusing multimodal information therefore describes social events more accurately. Different platforms can also complement and reinforce one another. For example, most event coverage on Google News comes from official media, while the same events attract many user comments on Flickr. Information on different platforms can thus be mutually helpful, and it is especially effective to use the strengths of one platform to compensate for the weaknesses of another. The key challenge is finding the most effective way to bridge the semantic gap between two platforms. If visual information is incorporated, a unified multimodal description is built for social events, and social events are effectively and collaboratively associated across platforms, then a cross-platform event semantic description system based on multimodal information fusion can be realized. A method for collaborative tracking of social events and global situation analysis and prediction based on cross-platform multimedia information can combine multimodal information from multiple platforms effectively and thereby remedy this deficiency.
Research on tracking and prediction methods for online social events is still at an early stage, and no complete process or framework exists so far. The present invention is a framework for collaborative association tracking and prediction of online social events. It makes up for the shortcomings of traditional methods and realizes cross-modal, cross-platform, and cross-spatiotemporal collaborative association tracking and global situation analysis and prediction based on semantic fusion of multimodal information.
Summary of the Invention
(1) Technical problem to be solved
The object of the present invention is to provide a complete framework for collaborative tracking and prediction of online social events. Addressing the multimodal nature of online social event data, the framework performs effective collaborative association tracking of hot social events across modalities, platforms, and space-time. Once multiple events have been tracked, the course of an entire event can be visualized as it unfolds over time. The tracking results also yield a complete description of a social event; association analysis and a topic mining model applied to this information extract the event's themes and the direction of public opinion, and statistical learning methods then reveal the event's global situation, making global situation analysis and prediction for social events possible. To this end, a framework for collaborative association tracking and global situation analysis and prediction of online social events is proposed.
(2) Technical solution
To achieve the above object, the present invention provides a system for collaborative association tracking and global situation analysis and prediction of online social events, the system comprising:
an information fusion unit, which fuses the multimodal characteristics of online social event data, using natural language understanding and image and video processing and analysis techniques, to obtain fused information from the multimodal data of social events and to build a cross-event semantic description model of multimodal data;
a tracking unit, connected to the information fusion unit, which, based on the cross-event semantic description model of multimodal data and facing network content rich in multimedia information, addresses the cross-modal, cross-platform, and cross-spatiotemporal attributes of online social events and applies collaborative association tracking to obtain semantic association tracking data for social events on each attribute;
an analysis and prediction unit, connected to the tracking unit, which, based on the semantic association tracking data, obtains information on the entire course of a social event as it develops over time, mines this process information with a social event topic mining model, social event pattern prediction, and a social event statistical prediction model to extract the event's themes and the direction of public opinion, and thereby determines the event's global situation and obtains global situation analysis and prediction data for social events.
Beneficial effects of the present invention: the invention uses multimodal information fusion and modeling of social events and collaborative association tracking of social events based on three attributes (cross-platform, cross-modal, and cross-spatiotemporal), mines the themes and the direction of public opinion of each social event, and uses statistical learning methods to predict the trajectory of its subsequent development. It thus proposes a framework for collaborative association tracking and global situation analysis and prediction of online social events, including a method for collaborative association tracking and global situation analysis and prediction of social events based on multimedia information. The invention addresses the problem of describing the complete evolution of different hot social events on the network across modalities, platforms, and space-time, and of tracking and predicting them. The proposed collaborative probabilistic multi-hypothesis association tracking algorithm can greatly improve the accuracy with which the system tracks social events in each time period, so that the development of the whole event is presented more accurately, which helps users understand the event and supports the analysis and prediction of public opinion.
Brief Description of the Drawings
Fig. 1a and Fig. 1b show the complete framework of the system of the present invention for collaborative association tracking and prediction of online social events;
Fig. 2 is a schematic diagram of the text-based topic mining semantic description in the present invention;
Fig. 3 is a schematic diagram of the image- and video-based topic mining semantic description in the present invention;
Fig. 4 is a schematic diagram of cross-modal semantic association in the present invention;
Fig. 5 is a schematic diagram of cross-platform semantic association in the present invention;
Fig. 6 is a schematic diagram of cross-spatiotemporal semantic association in the present invention;
Fig. 7 is a schematic diagram of multimodal topic mining based on social events in the present invention;
Fig. 8 is a schematic diagram of the distribution of road congestion on a highway over various periods in recent years.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings.
Fig. 1a and Fig. 1b show the complete framework of the system of the present invention for collaborative tracking and prediction of online social events. The invention performs collaborative tracking and global situation analysis and prediction of online social events based on multimedia information, and obtains a complete description of each social event from beginning to end. The course of an entire event can thus be visualized as it unfolds over time and presented to the user in a clear, graphical interface, so that the user can quickly grasp how the whole event has developed. The semantic themes of each event can be mined, the direction of public opinion on a social event can be determined, and statistical learning methods can be used to learn the event's global situation, making global situation analysis and prediction for social events possible. As shown in Fig. 1a and Fig. 1b, the system comprises three components: an information fusion unit 1, a tracking unit 2, and an analysis and prediction unit. The technical solution for implementing the system is described below.
1 Multimodal information fusion and modeling of social events
The information fusion unit 1 fuses the multimodal characteristics of online social event data, using natural language understanding and image and video processing and analysis techniques, to obtain fused information from the multimodal data of social events and to build a cross-event semantic description model of multimodal data.
The cross-event semantic description model of multimodal data extracts features from the textual and visual cross-media information of each social event and mines that information at the semantic level. This yields a semantic description of social events and a semantic knowledge base built on multimodal information fusion, from which the multimodal semantic description of each social event is obtained.
The multimodal data fusion information of social events comprises text analysis based on natural language understanding and analysis based on image and video processing. Together these realize an event semantic description system based on multimodal information fusion, that is, multimodal semantic information fusion.
Based on text analysis and on the topic mining semantic descriptions of images and videos, a boosting algorithm is used to select the effective text topics and image and video topics.
1.1 Text analysis based on natural language understanding
For textual information, a variant of the LDA (Latent Dirichlet Allocation) topic model is considered for obtaining the topic information of each event. The contextual information of the text is then used to refine the description of the event text and to build a semantic description system centered on entities, events, and topics. The following aspects are addressed:
a) Entity recognition and association
Named mentions, pronominal mentions, and nominal mentions are extracted from the text, and mentions that refer to the same entity concept are linked. A method combining statistical translation and text retrieval is also used to establish correspondences between Chinese and English entities, so that multilingual information can be associated and integrated.
b) Event extraction and description
Because event type recognition and event argument extraction are closely related to verb-centered syntactic and semantic analysis, syntactic and semantic features are used together with surface features to recognize event types and extract event arguments. The variant LDA topic model is used to describe the text of social events.
1.2 Image and video processing
Images and videos offer a convenient and effective way to understand social events. Taking the structural information of images and videos into account, sparse learning and dictionary learning are used to build a unified visual semantic description system based on the bag-of-words model.
1.3 Multimodal semantic information fusion
Social event content contains multimodal information such as text, images, video, and audio. For text, the variant LDA topic model is considered for obtaining the description of each event. For images and video, their spatial structure is taken into account, and sparse learning and dictionary learning are used to build a visual topic model that yields the visual topics of each event; the semantic description of an event is then refined on the basis of these visual topics.
As shown in Fig. 2, which illustrates the text-based topic mining semantic description of the present invention, given the set D of all documents, the variant LDA topic model is used to mine the event topics $Z_1, Z_2, \dots, Z_i, \dots, Z_k$, and the words $W_1, W_2, \dots, W_j, \dots, W_n$ that form the semantic description of the documents are then established from these topics.
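The topic mining step of Fig. 2 can be sketched as follows. This is a minimal illustration that assumes plain LDA fitted by collapsed Gibbs sampling, used here as a stand-in for the LDA variant mentioned in the text, whose details are not given. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy documents are assumptions made for illustration only.

```python
import random
from collections import Counter

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for plain LDA (a stand-in, not the patent's variant).

    docs: list of tokenised documents (the set D).
    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts, from which topics Z and words W can be read off.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # n(d, k)
    topic_word = [Counter() for _ in range(num_topics)]   # n(k, w)
    topic_total = [0] * num_topics                        # n(k)
    assign = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word

# Hypothetical toy documents for two events.
docs = [
    ["election", "vote", "obama"],
    ["vote", "election", "poll"],
    ["mars", "orbiter", "launch"],
    ["launch", "mars", "rocket"],
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
```

The highest-count entries of each `topic_word[k]` give the words describing topic k, and each row of `doc_topic` gives the topic mix of one document. The sampled assignment depends on the seed, so no particular grouping is guaranteed on data this small.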
As shown in Fig. 3, which illustrates the image- and video-based topic mining semantic description of the present invention for an image and video collection i, given the image or video sets of all events, the topic categories $C_1, C_2, \dots, C_i, \dots, C_k$ of each event's images or videos are mined, and the feature descriptors $W_1, W_2, \dots, W_j, \dots, W_n$ that capture the semantic associations of the images or videos are then established from these topic categories.
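The bag-of-words visual description underlying Fig. 3 can be illustrated with a short sketch. It assumes that a codebook of visual words has already been learned (the text names sparse learning and dictionary learning for this) and replaces sparse coding with the simpler hard assignment of each local descriptor to its nearest codeword. The names `nearest_codeword` and `bag_of_visual_words` and the two-dimensional toy descriptors are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[j])),
    )

def bag_of_visual_words(descriptors, codebook):
    """L1-normalised histogram of codeword occurrences for one image or key frame."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Hypothetical codebook of three visual words and four local descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.1], [1.0, 0.8], [0.1, 0.9]]
hist = bag_of_visual_words(descriptors, codebook)  # [0.25, 0.5, 0.25]
```

The resulting histogram plays the same role for an image that a word-count vector plays for a document, so the same kind of topic model can in principle be applied to both.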
Based on the topic mining semantic descriptions of text and of images and videos, a boosting algorithm is used to select the effective text topics and image and video topics. This realizes an event semantic description system based on multimodal information fusion, and thus a description of social events that fuses multimodal information.
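One way to read the boosting-based selection is sketched below. The text does not say which boosting algorithm is used or what supervises it, so this sketch assumes discrete AdaBoost with threshold stumps, feature vectors that concatenate text-topic and visual-topic weights, and hypothetical labels of +1 or -1 marking whether an item belongs to the event of interest. Topics picked by the stumps are treated as the selected ones.

```python
import math

def best_stump(X, y, w):
    """Threshold stump (feature, threshold, sign) with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for sign in (1, -1):
                err = sum(
                    wi for row, yi, wi in zip(X, y, w)
                    if (sign if row[f] >= thr else -sign) != yi
                )
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def adaboost_select(X, y, rounds=3):
    """Run AdaBoost and return {feature index: accumulated stump weight}."""
    n = len(X)
    w = [1.0 / n] * n
    importance = {}
    for _ in range(rounds):
        err, f, thr, sign = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        importance[f] = importance.get(f, 0.0) + alpha
        # Increase the weight of misclassified items, then renormalise.
        w = [
            wi * math.exp(-alpha * yi * (sign if row[f] >= thr else -sign))
            for row, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return importance

# Hypothetical features: [text topic 0, text topic 1, visual topic 0].
X = [[0.9, 0.5, 0.2], [0.8, 0.4, 0.7], [0.1, 0.5, 0.3], [0.2, 0.6, 0.6]]
y = [1, 1, -1, -1]
importance = adaboost_select(X, y)
selected = max(importance, key=importance.get)  # index of the most useful topic
```

In this toy data only the first text topic separates the two classes, so it is the only topic the stumps ever choose.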
2 Collaborative association tracking model for social events
The tracking unit 2 is connected to the information fusion unit 1. Based on the cross-event semantic description model of multimodal data and facing network content rich in multimedia information, it addresses the cross-modal, cross-platform, and cross-spatiotemporal attributes of online social events and applies collaborative association tracking to obtain semantic association tracking data for social events on each attribute.
The semantic association tracking data of social events on each attribute is used to gather semantically related network content from the Internet into a set that reflects the common theme of a social event.
For the cross-platform attribute of online social events, a collaborative probabilistic multi-hypothesis tracking method is proposed. It performs semantic collaborative association tracking of the data of multiple social events on two platforms, and is also used for cross-modal and cross-spatiotemporal association tracking of online social events.
In summary, the collaborative association tracking model gathers semantically related network content into a set reflecting the common theme of a social event and, by combining techniques such as collaborative learning and association, achieves collaborative association tracking of social events on each of the cross-platform, cross-modal, and cross-spatiotemporal attributes.
2.1 Cross-modal semantic association
The purpose of cross-modal semantic association is to associate a social event across different modalities. As shown in Fig. 4, which illustrates cross-modal semantic association in the present invention, for two online social events, the U.S. presidential election and the launch of the Mars Reconnaissance Orbiter, multiple texts in the text domain are semantically associated with multiple images in the visual domain, so that related texts and images from the same period are linked across modalities.
We propose to apply multimedia feature fusion analysis at the feature level to mine the latent semantic correlations between different modalities and media spaces. Information from different media is mapped into a common space of higher dimension, and a suitable measure is chosen in that space to quantify the similarity of information samples, which is characterized by a latent semantic association matrix of cross-media features. On this basis, a latent semantic index of modal features is built; it projects cross-media samples into a latent semantic space and establishes the mapping and association of cross-media content across different feature spaces.
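The idea of mapping both modalities into one space and measuring similarity there can be sketched as follows. The projection matrices `P_text` and `P_image` are assumed to have been learned beforehand by some cross-media method that the text does not specify, and cosine similarity is assumed as the measure. All names and toy values are hypothetical.

```python
import math

def project(matrix, vec):
    """Multiply a (rows x cols) projection matrix by a feature vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb) if na and nb else 0.0

def association_matrix(text_feats, image_feats, P_text, P_image):
    """Entry [i][j] is the similarity of text i and image j in the common space."""
    T = [project(P_text, t) for t in text_feats]
    I = [project(P_image, v) for v in image_feats]
    return [[cosine(t, i) for i in I] for t in T]

# Hypothetical projections from 2-dimensional features into a 3-dimensional common space.
P_text = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
P_image = [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[0.0, 1.0], [1.0, 0.0]]
S = association_matrix(text_feats, image_feats, P_text, P_image)
# S == [[1.0, 0.0], [0.0, 1.0]]: text 0 pairs with image 0, text 1 with image 1.
```

Although the raw text and image features are not directly comparable, their projections are, which is what allows a single similarity measure to link content across modalities.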
2.2 Cross-platform semantic association
Here, cross-platform association refers to the relationships between multiple source platforms (such as People's Daily Online and Sina Weibo). The occurrence and development of a social event, and the sub-events it passes through, often appear on several platforms at once. For example, information about one event is described on both People's Daily Online and Sina Weibo, and the two descriptions complement each other. It is therefore necessary to study these complex cross-platform relationships in order to discover and mine the semantic relationships between information samples across platforms. The cross-platform nature of a social event usually originates from one of its sub-events, which then generates a large volume of online discussion on every platform. The key to cross-platform semantic association is thus to match information on different platforms through collaborative, complementary association based on the sub-events.
Figure 5 (cross-platform semantic association in the present invention) shows the social events "Greek protests" and "Sino-Japanese Diaoyu Islands dispute". The evolution of each of their sub-events is described on both People's Daily Online and Sina Weibo. Using our proposed collaborative probabilistic multi-hypothesis association tracking model to associate the data from the two platforms, the information about these two events at each moment on the two platforms can be linked and mutually supplemented. This gives a clearer picture of what happened at each moment, allows each sub-event to be analyzed accurately, and supports better monitoring and prediction of social events.
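The text does not detail the collaborative probabilistic multi-hypothesis association tracking model, so the sketch below does not attempt to reproduce it. As a much simpler stand-in, it links items from two platforms when they were posted close together in time and share enough words. The item layout `(day, tokens)`, the thresholds `max_days` and `min_sim`, and the function names are all assumptions made for illustration.

```python
from datetime import date


def jaccard(a, b):
    """Word-overlap similarity between two token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def associate_platforms(items_a, items_b, max_days=1, min_sim=0.2):
    """Link each item on platform A to its best match on platform B.

    Each item is a (day, tokens) tuple, where `day` is a datetime.date.
    Two items are candidates only if they were posted within `max_days`
    of each other; among candidates the one with the highest word
    overlap wins, provided the overlap reaches `min_sim`.
    """
    links = []
    for i, (day_a, tokens_a) in enumerate(items_a):
        best_j, best_sim = None, 0.0
        for j, (day_b, tokens_b) in enumerate(items_b):
            if abs((day_a - day_b).days) > max_days:
                continue  # too far apart in time to describe the same sub-event
            sim = jaccard(tokens_a, tokens_b)
            if sim >= min_sim and sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            links.append((i, best_j, best_sim))
    return links
```

Each returned triple `(i, j, similarity)` pairs item `i` on the first platform with item `j` on the second. A greedy best match keeps only one hypothesis per item, whereas a multi-hypothesis tracker would keep several candidate links and weigh them probabilistically.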
2.3 Cross-spatiotemporal semantic association
Here, "cross-spatiotemporal" refers to differences in geographic location and in time. After a social event occurs, people in different places view it differently, and their views change over time. We therefore need a cross-spatiotemporal semantic association that describes the different information an event presents in different regions, so that decision makers can grasp the connections and differences between regions and better guide the development of public opinion on the event, in service of the state.
We take the words in the event's text as the sample space. For each information sample the "word-time-place" triple is known. We first establish similarity measures on the time scale and on the geographic scale, and compute the correlation between samples in geographic location and in time. Then, for each word in the sample-space "dictionary", we count four frequencies: how often the word occurs (1) within a given time interval and within a given region, (2) outside the time interval but within the region, (3) within the time interval but outside the region, and (4) outside both. This frequency analysis over space and time yields a heat coefficient for each word in a given space and time. Fusing the geographic and temporal correlations according to this coefficient yields the cross-spatiotemporal semantic association.
Figure 6 (cross-spatiotemporal semantic association in the present invention) shows the spatiotemporal evolution of the spread of the "H7N9" event, whose spread and outbreaks are regional and temporal. Cases were found first in Shanghai and Anhui and then began to spread to neighboring provinces and cities. At first the expansion in space and time was small in scope, but there can be a sudden moment along the way: on April 3, for example, an H7N9 infection was suddenly found in Beijing as well. Because the spread of H7N9 shows these spatiotemporal characteristics, a cross-spatiotemporal semantic association is needed for the event, describing the different information it presents in different regions at different times, so that decision makers can grasp the connections and differences between regions and analyze and judge the event rationally.
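The four frequencies described above form a 2x2 contingency table over "in the time interval or not" and "in the region or not". The text does not say how the heat coefficient is computed from them. The sketch below assumes, purely for illustration, that the chi-square statistic of that table serves as the coefficient. The function names and the layout of the `(word, time, place)` triples are hypothetical.

```python
def four_counts(samples, word, interval, region):
    """Count a word's occurrences in the four time/place cells.

    `samples` is a list of (word, time, place) triples; `interval` is a
    (start, end) pair with inclusive bounds; `region` is a set of places.
    """
    start, end = interval
    counts = {"in_t_in_r": 0, "out_t_in_r": 0, "in_t_out_r": 0, "out_t_out_r": 0}
    for w, t, place in samples:
        if w != word:
            continue
        time_key = "in_t" if start <= t <= end else "out_t"
        place_key = "in_r" if place in region else "out_r"
        counts[time_key + "_" + place_key] += 1
    return counts


def heat_coefficient(counts):
    """Chi-square statistic of the 2x2 table (an assumed choice of coefficient).

    A large value means the word is concentrated in (or absent from) the
    given interval and region far more than independence would predict.
    """
    a = counts["in_t_in_r"]
    b = counts["out_t_in_r"]
    c = counts["in_t_out_r"]
    d = counts["out_t_out_r"]
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a whole row or column is empty, so no contrast to measure
    return n * (a * d - b * c) ** 2 / denom
```

The statistic is symmetric: it rises both when a word is unusually frequent in the chosen cell and when it is unusually rare there. If only bursts should count, the sign of `a * d - b * c` can be checked separately.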
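3 Global situation analysis and prediction of social events
The analysis and prediction unit 3 is connected to the tracking unit 2. Based on the semantic association tracking data, it obtains information on the whole course of a social event as it develops over time. It mines this process information with the social event topic mining model, the social event pattern prediction model, and the social event statistical prediction model, extracting the event's topics and the direction of public opinion. This establishes the global situation of the event and produces the event-based global situation analysis and prediction data.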
The global situation analysis and prediction data comprise summary details and evolution trends of multiple events over different time periods, giving a complete description of each social event from beginning to end. The whole course of an event is visualized over time, so that the relationships in the event's evolution, its development, and the evolution of its topics are presented to the user in a clear graphical interface. Association analysis of this complete event information reveals the event's topics and the direction of public opinion. Statistical learning methods are then used to predict its subsequent trajectory, reveal the characteristics of its propagation behavior, and establish its global situation.
The event-based global situation analysis and prediction data are derived from all the information on each event obtained from the tracking data. Topic mining models and statistical learning methods are applied to predict the topics and trend of each social event, so that the global situation of social events can be analyzed and predicted accurately.
The spread of a social event is a typical dynamic process that evolves continuously over time and with the direction of public opinion. In neither the temporal nor the spatial dimension is any behavior within a social event simple or isolated: behaviors are related in intricate ways, and every social event is preceded by explicit or implicit signs. A social event generates a large amount of multimodal information online, such as text, images, and video. Our proposed cross-modal information fusion and modeling represents the evolution of a social event in detail. Association analysis of this information reveals the event's topics and the direction of public opinion, and statistical learning methods establish its global situation. We propose three different prediction models for social events, described in 3.1 to 3.3.
3.1 Social event topic mining
A social event generates a large amount of multimodal information online, such as text, images, and video. Cross-modal, cross-platform information fusion and modeling represents the evolution of the event in detail, and topic modeling is then applied to the multimodal information so that the whole course of the event can be visualized over time. On the one hand, the relationships in the event's evolution, its development, and the evolution of its topics can be presented in a clear graphical interface, letting users quickly follow how the event unfolded and quickly understand and analyze its topics. On the other hand, the global situation and public opinion around the event can be analyzed more effectively, supporting public opinion guidance and monitoring.
Once a social event has been tracked, its text topics and visual topics can be mined, yielding topic information for each stage of the event's evolution. This significantly helps with tracking and defending against social events and with monitoring public opinion. Figure 7 shows multimodal topic mining based on social events in the present invention. Cross-modal, cross-platform, and cross-spatiotemporal tracking of the event "2012 US presidential election" yields its evolution at each moment. A topic model is applied to the text and images from the whole course of the event, producing the text topics and visual topics at each moment shown in Figure 7. These per-moment text and visual topics are then associated and aggregated to mine the event's topics, and the result is visualized so that users can follow the event and analyze public opinion.
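The per-moment text topics can be approximated with a very small sketch. The text refers to a topic model without naming one, so this example swaps in smoothed TF-IDF scoring across time slices as a stand-in. It covers text only, and the function name and input layout (a dict from time-slice label to token list) are assumptions.

```python
import math
from collections import Counter


def slice_topics(slices, top_k=3):
    """Return the `top_k` most distinctive words for each time slice.

    `slices` maps a time-slice label to the list of tokens tracked in
    that slice. Words are scored by term frequency times a smoothed
    inverse slice frequency, so words common to every slice rank lower.
    """
    slice_freq = Counter()
    for tokens in slices.values():
        slice_freq.update(set(tokens))
    n = len(slices)
    topics = {}
    for label, tokens in slices.items():
        tf = Counter(tokens)
        scores = {
            w: count * (math.log((1 + n) / (1 + slice_freq[w])) + 1.0)
            for w, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically for stable output.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        topics[label] = ranked[:top_k]
    return topics
```

Comparing the word lists of consecutive slices gives a rough view of how an event's topic shifts over time. A probabilistic topic model would additionally group related words and could be extended to visual words.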
3.2 Statistical prediction of social events
Our proposed method describes the evolution of a social event in detail and yields its topic distribution, so the event can be analyzed and predicted accordingly. We propose to apply statistical learning methods to all the information on each event obtained from the tracking results in order to predict the trend of each social event. Social events can then be analyzed and monitored accurately, and government decision makers can take timely countermeasures to contain negative public opinion at an early stage. Over the prediction cycle, a baseline is first established from the basic data. As time passes, the tracking results provide increasingly rich information about the event and its future trend becomes increasingly clear; statistical data mining methods are then used to predict the event.
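The baseline-then-refine cycle can be sketched with the simplest possible statistical model. The text does not name the statistical learning method, so this example assumes a least-squares linear trend fitted to a series of per-period activity counts (for instance, posts per day about an event) and refitted whenever new tracking data arrive. The function names and the choice of a linear model are illustrative assumptions.

```python
def fit_trend(counts):
    """Fit a least-squares line through (0, counts[0]), (1, counts[1]), ...

    Returns (slope, intercept). At least two observations are required.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(counts, steps_ahead=1):
    """Extrapolate the fitted line `steps_ahead` periods past the last one."""
    slope, intercept = fit_trend(counts)
    return intercept + slope * (len(counts) - 1 + steps_ahead)
```

Calling `forecast` again after appending each new observation mirrors the cycle in the text: the first few points act as the baseline, and every refit uses the richer history. Real event activity is rarely linear, so a model with saturation or seasonality would be a natural replacement.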
3.3 Social event pattern prediction
Many social events share a common spatiotemporal pattern and recur in the same period of each year. Based on the occurrences of such events in recent years, we can analyze the same kind of social event in detail, apply our proposed modeling method to obtain its topic distribution, and analyze and predict the event accordingly. This indicates the period in which a similar event is most likely to occur in the future, so that preventive measures can be taken in advance.
Figure 8 shows the distribution of road congestion on a certain highway over various periods in recent years. Congestion varies with the time of year, but the trajectory of this social event is similar from year to year; that is, these are common events. We can model such common events and apply pattern mining methods such as graph matching to analyze and predict the event's trajectory in each future period. This provides decision makers with reliable information, so that prevention and remediation can be carried out in specific periods.
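A recurring yearly pattern of this kind can be illustrated with a minimal sketch. The text mentions graph matching, which is not reproduced here. Instead, the example assumes a simpler stand-in: average the observed severity (for instance, a congestion level) for each period across past years and pick the period with the highest average. The data layout `{year: {period: value}}` and the function names are hypothetical.

```python
from collections import defaultdict


def seasonal_profile(history):
    """Average the observed value of each period across all years.

    `history` maps a year to a dict of {period: value}, where a period
    could be a month or week number. Periods missing in some years are
    averaged over the years in which they appear.
    """
    totals = defaultdict(float)
    seen = defaultdict(int)
    for year_values in history.values():
        for period, value in year_values.items():
            totals[period] += value
            seen[period] += 1
    return {period: totals[period] / seen[period] for period in totals}


def most_likely_period(history):
    """Return the period with the highest average value in past years."""
    profile = seasonal_profile(history)
    if not profile:
        raise ValueError("history contains no observations")
    return max(profile, key=profile.get)
```

The averaged profile doubles as a forecast of the next year's trajectory, and `most_likely_period` marks when preventive measures would matter most. Averaging ignores year-to-year drift, which a trajectory-matching method could capture.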
4 The method of the present invention for collaborative association tracking and global situation analysis and prediction of network social events comprises the following steps:
Step S1: Extract features from the text and visual information of each social event and mine this cross-media information at the semantic level to obtain a semantic description of the event. A semantic knowledge base built on multimodal information fusion for social events is constructed, yielding a multimodal semantic description model of each social event.
Step S2: Gather semantically related web content into a collection that reflects the common theme of a social event. Because web content is rich in multimedia information, and because network social events are cross-platform, cross-modal, and cross-spatiotemporal, techniques such as collaborative learning and association are combined to realize a collaborative association tracking model of social events along each of these attributes.
Step S3: Step S2 yields summary details and evolution trends of multiple events over different time periods, from which a complete description of each social event from beginning to end is obtained. The whole course of an event can then be visualized over time, and the relationships in the event's evolution, its development, and the evolution of its topics can be presented to the user in a clear graphical interface. Association analysis of this complete event information reveals the event's topics and the direction of public opinion. Statistical learning methods are then used to predict its subsequent trajectory, reveal the characteristics of its propagation behavior, and establish its global situation.
5 Implementation results
To evaluate the present invention, we carried out experiments. We selected 18 popular topics from Google News and Flickr as the hot social events under study. The dataset, shown in Table 1, covers politics, economy, technology, entertainment, military affairs, society, and other themes. It contains 3583 documents and 6742 images from Google News, and 4356 documents and 4356 images from Flickr.
Table 1. Google English news and Flickr image information dataset used to evaluate the present invention
Table 2 compares the cross-platform and single-platform tracking results of the present invention on the two platforms, Google English news and Flickr image information. As Table 2 shows, the method is markedly effective at cross-platform association tracking of social events. Compared with using a single kind of information, either text or visual, multimodal information fusion gives a better description of social events. With multiple platforms included and cross-platform collaborative association tracking applied, tracking accuracy is better than that of single-platform tracking methods.
Table 2. Comparison of cross-platform and single-platform tracking results of the present invention on the Google English news dataset
Table 3 shows the topic information mined by the present invention for each social event in the Google English news and Flickr image information datasets. As Table 3 shows, the method mines topic information for social events across platforms with good results. Event 1 is the "Sino-Japanese Diaoyu Islands dispute", event 2 is the "marijuana legalization process in the United States", event 3 is the "2011 Norway attacks", event 4 is the "2008 Chinese milk scandal", and event 5 is the "2012 US presidential election process". The topic words mined for event 1 include China, Japan, Diaoyu Islands, Taiwan, surveillance, and military, which clearly reflect the topic of event 1 and again demonstrate the reliability of the system.
Table 3. Topic mining of social events by the present invention on the Google English news dataset
The above experiments confirm the effectiveness and completeness of the present invention.
The above describes only specific embodiments of the present invention, and the scope of protection of the present invention is not limited to them. Any transformation or replacement that a person familiar with the technology could conceive within the technical scope disclosed by the present invention shall fall within the scope of the present invention. The scope of protection of the present invention shall therefore be determined by the claims.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2013101971291A CN103455705A (en) | 2013-05-24 | 2013-05-24 | Analysis and prediction system for cooperative correlative tracking and global situation of network social events |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103455705A true CN103455705A (en) | 2013-12-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20131218 |