CN103634760A - System and method for personal address book management - Google Patents
- Publication number
- CN103634760A CN201210306925.XA
- Authority
- CN
- China
- Prior art keywords
- address book
- information
- short message
- text
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Transfer Between Computers (AREA)
Abstract
The present invention provides a system and method for personal address book management. The system includes a client and a server. The client includes: a monitoring control module for monitoring whether the short message module has received a new short message; an information filtering module for filtering received new short messages so that only short messages related to the address book are retained; and a first sending module for sending the address-book-related short messages to the server. The server includes: a semantic analysis module for performing semantic analysis on the address-book-related short messages to extract the name information, number information, number type information, and keep-old-number information relevant to an address book update; and a second sending module for sending the extracted name information, number information, number type information, and keep-old-number information to the client as an analysis result. The client further includes an address book update module for updating the local address book based on the received analysis result.
Description
Technical field
The present invention relates to personal address book management and, more specifically, to a personal address book management system and method based on semantic analysis of short messages.
Background
With the rapid development of technology, handheld devices have become so widespread that nearly everyone owns at least one. These devices are now an indispensable part of daily life: besides making it easy to stay in touch, people have become accustomed to storing important information on them. When someone changes their contact details, however, they must send the new details to the contacts in their address book, and each of those contacts must then update the entry on their own phone by hand. This is inconvenient for users and inefficient. Techniques for automatically updating a device's address book have emerged to address this.
Existing automatic address book update techniques generally fall into two types. The first uses a network server as an intermediary: users register their information on the server and keep it up to date, and the server sends updates to the recipients the user has authorized, thereby achieving synchronization. The second announces a change of contact details by short message, but it places requirements on the sender: the update must be written in a fixed format so that the recipient can parse the message, extract the update information, and apply it to the local address book.
The existing techniques therefore have the following problems. The first approach requires a network platform and server to handle user registration and store user information, which raises security concerns and wastes network storage resources. The second approach restricts the message format: a message can be parsed only if it is entered in the fixed format, which is inconvenient for users. For an arbitrary short message with no format restriction and no registration on a network service platform, extracting the address book update information and updating the local address book is difficult.
Summary of the invention
The object of the present invention is to overcome the above deficiencies of the prior art by providing a personal address book management system and method based on semantic analysis of short messages, with which the sender does not need to write the update information in a fixed format.
To achieve this object, a system for personal address book management is provided. The system includes a client and a server. The client includes: a monitoring control module for monitoring whether the short message module has received a new short message; an information filtering module for filtering received new short messages so that only short messages related to the address book are retained; and a first sending module for sending the address-book-related short messages to the server. The server includes: a semantic analysis module for performing semantic analysis on the address-book-related short messages to extract the name information, number information, number type information, and keep-old-number information relevant to an address book update; and a second sending module for sending the extracted name information, number information, number type information, and keep-old-number information to the client as an analysis result. The client further includes an address book update module for updating the local address book based on the received analysis result.
Preferably, the information filtering module may determine that a received short message is related to the address book when the sender information of the message is a number, or when the sender information is not a number but is a contact in the local address book and the content of the message contains a number.
Preferably, the information filtering module may also be used to remove spam short messages.
Preferably, the client may further include a text generation module that combines the content of the address-book-related short messages retained by the information filtering module into a text to be analyzed. The text to be analyzed includes the sender information, the message content information, and an identification number assigned to each short message. The first sending module sends the text to be analyzed to the server.
Preferably, the semantic analysis module may extract the sender information, the message content information, and the identifier of each short message from the address-book-related short messages to form the text to be analyzed.
Preferably, the address book update module may update the local address book by matching the analysis result against the local address book.
Preferably, the address book update module may update the address book according to the user's judgment.
Preferably, the address book update module may match the name and number in the analysis result against the contacts and their numbers in the local address book. If the name does not match, the name and its number are added to the address book. If the name matches but the number does not, the keep-old-number information determines whether a replace or an add operation is performed in the address book.
According to another aspect, a method for personal address book management is provided. The method includes: the client monitoring whether the short message module has received a new short message; filtering received new short messages so that only short messages related to the address book are retained; sending the address-book-related short messages to the server; the server performing semantic analysis on the address-book-related short messages to extract the name information, number information, number type information, and keep-old-number information in each message; sending the extracted name information, number information, number type information, and keep-old-number information to the client as an analysis result; and the client updating the local address book based on the received analysis result.
Preferably, the step of filtering received new short messages may include: determining whether the sender information of a received new short message is a number; if it is a number, determining that the message is related to the address book; if it is not a number, determining whether the sender information is a contact in the local address book; if it is a contact in the local address book, further determining whether the content of the message contains a number; and if it does, determining that the message is related to the address book.
Preferably, the step of filtering received new short messages may further include removing spam short messages.
Preferably, the method may further include: the client combining the content of the retained address-book-related short messages into a text to be analyzed, which includes the sender information, the message content information, and an identification number assigned to each short message; and sending the text to be analyzed to the server.
Preferably, the method may further include: the server extracting the sender information, the message content information, and the identifier of each short message from the address-book-related short messages to form the text to be analyzed.
Preferably, the local address book may be updated by matching the analysis result against the local address book.
Preferably, the address book may be updated according to the user's judgment.
Preferably, the step of updating the local address book may include: matching the name and number in the analysis result against the contacts and their numbers in the local address book; if the name does not match, adding the name and its number to the address book; and if the name matches but the number does not, determining from the keep-old-number information whether to perform a replace or an add operation in the address book.
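The matching rule in the preceding paragraph can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `AnalysisResult` type, the dictionary-based address book, and the returned status strings are all assumptions made for the example, and "replace" is simplified to replacing every stored number for the contact.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AnalysisResult:
    """Hypothetical container for one analysis result sent by the server."""
    name: str
    number: str
    number_type: str  # e.g. "mobile" or "landline"
    keep_old: bool    # keep-old-number information


def update_address_book(book: Dict[str, List[str]], result: AnalysisResult) -> str:
    """Apply one analysis result to `book`, a mapping of name -> numbers."""
    numbers = book.get(result.name)
    if numbers is None:
        # Name not matched: add the name and its number.
        book[result.name] = [result.number]
        return "added_contact"
    if result.number in numbers:
        # Name and number both matched: nothing to do.
        return "unchanged"
    if result.keep_old:
        # Name matched, number new, old number kept: add.
        numbers.append(result.number)
        return "added_number"
    # Name matched, number new, old number dropped: replace.
    book[result.name] = [result.number]
    return "replaced_number"
```

A real address book would likely replace only the stored number of the same type; the sketch ignores `number_type` to keep the rule itself visible.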
Description of drawings
Fig. 1 is a block diagram of a personal address book management system based on semantic analysis of short messages according to an embodiment of the present invention;
Fig. 2 is a flowchart of the filtering operation performed by the information filtering module;
Fig. 3 is a flowchart of the process by which the semantic analysis module semantically analyzes a short message;
Fig. 4 shows a Chinese name recognition framework;
Fig. 5 is a flowchart of the process of obtaining the number type;
Fig. 6 is a flowchart of the process of updating the local address book;
Fig. 7 is a flowchart of the process of matching the analysis result against the local address book;
Fig. 8 is a flowchart of a personal address book management method according to an embodiment of the present invention.
Detailed description
Embodiments of the present invention will now be described in detail, examples of which are illustrated in the accompanying drawings, in which like reference numerals denote like parts throughout. The embodiments are described below with reference to the drawings in order to explain the present invention.
Fig. 1 shows a block diagram of a personal address book management system based on semantic analysis of short messages according to an embodiment of the present invention.
As shown in Fig. 1, the personal address book management system according to the embodiment of the present invention includes a client 10 and a server 20.
The client 10 includes a monitoring control module 11, an information filtering module 12, a text generation module 13, a text sending module 14, a text receiving module 15, and an address book update module 16. The server 20 includes a text receiving module 21, a semantic analysis module 22, and a text sending module 23. It should be understood that the modules shown here for the client 10 and the server 20 are only exemplary; those skilled in the art may omit some modules or add new ones as circumstances require. Each module is described in detail below.
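The monitoring control module 11 in the client 10 is connected to the short message module (not shown) on the user terminal and monitors whether the short message module holds any unread new short messages. The monitoring method differs between device operating systems. For the Android operating system, one feasible scheme is as follows:
When an unread new short message is detected, it is sent to the information filtering module 12 of the client 10.
On receiving an unread new short message from the monitoring control module 11, the information filtering module 12 filters it, that is, it discards information unrelated to the address book and retains only information related to the address book. Specifically, the information filtering module 12 examines the sender information of the received message. If the sender information is a number, the message is determined to be related to the address book and is sent to the text generation module 13. If the sender information is not a number but is a contact that exists in the local address book, the module further checks whether the message content contains something that looks like a number; if it does, the message is determined to be related to the address book and is sent to the text generation module 13. Otherwise, the message is filtered out. The operation of the information filtering module 12 is described in detail below with reference to Fig. 2.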
As shown in Fig. 2, in step S201 the information filtering module 12 of the client obtains the sender information of the received new short message.
In step S202, it is determined whether the sender information is a pure digit string. Before this check, any leading international code, such as "+86", is removed.
If step S202 determines that the sender information is a pure digit string, then in step S203 the information filtering module 12 compares the digit string with a special number library to determine whether it is a special number. If the digit string belongs to the special number library, for example "10086", the message is filtered out.
If the digit string does not match the special number library, the new short message is sent to the text generation module 13 in step S204.
If step S202 determines that the sender information is not a pure digit string, step S205 determines whether the sender information is a name in the address book, that is, the non-numeric sender is matched against the local address book. If step S205 determines that the sender information of the received message does not exist in the local address book, the message is filtered out. For example, the sender information "WAP信息" ("WAP message") does not exist in the local address book, so step S205 filters out such a message. Conversely, if step S205 determines that the sender information exists in the local address book, step S206 determines whether the message content contains a number. This determination can be made according to the number characteristics of different countries, that is, numbers are extracted from the message content according to those characteristics, as described in detail in the description of the server 20 below.
If step S206 determines that the message content contains a number, the information filtering module 12 sends the newly received short message to the text generation module 13 in step S204.
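The decision flow of steps S201 to S206 can be sketched as a single predicate. This is an illustrative sketch under stated assumptions: `SPECIAL_NUMBERS` stands in for the special number library (only "10086" is named in the text), `contacts` is a hypothetical set of address book names, and the 8/11/12-digit pattern follows the description of Chinese numbers given for the server.

```python
import re

# Hypothetical special-number library; the text names only "10086".
SPECIAL_NUMBERS = {"10086"}

# A run of exactly 12, 11, or 8 digits that is not part of a longer run.
NUMBER_IN_TEXT = re.compile(r"(?<!\d)(?:\d{12}|\d{11}|\d{8})(?!\d)")


def strip_country_code(sender: str) -> str:
    """Remove a leading international code such as "+86" (S202)."""
    return sender[3:] if sender.startswith("+86") else sender


def is_address_book_related(sender: str, content: str, contacts: set) -> bool:
    """Return True if the message should be passed on for analysis."""
    s = strip_country_code(sender.strip())
    if s.isascii() and s.isdigit():
        # S202/S203: numeric sender, kept unless it is a special number.
        return s not in SPECIAL_NUMBERS
    if s not in contacts:
        # S205: non-numeric sender that is not a known contact.
        return False
    # S206: known contact, kept only if the content contains a number.
    return NUMBER_IN_TEXT.search(content) is not None
```

For instance, a message from an unknown number is kept, a message from "10086" is dropped, and a message from a known contact is kept only when its body contains a candidate number.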
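In addition, spam short messages may be removed before the information filtering module 12 removes information unrelated to the address book. For example, common spam formats can be collected (such as "开发票,139********,王**", an advertisement for issuing invoices followed by a masked number and name), and the content of each received message compared against them; messages that match one of these formats are removed.
The text generation module 13 receives the short messages sent by the information filtering module 12 and combines their relevant content into a text to be analyzed. The text to be analyzed includes the sender information, the message content information, and an id (identification) number assigned to each short message. The information filtering module 12 assigns an id number to each short message so that different messages can be told apart. For example, the sender information and the message content information included in the text to be analyzed can be stored in separate fields keyed by the id number.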
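The text sending module 14 of the client 10 sends the text to be analyzed, generated by the text generation module 13, to the server 20 over a wired or wireless connection.
The text receiving module 21 in the server 20 receives the text to be analyzed from the client 10 and parses it, in the manner agreed with the client, into a format that the semantic analysis module can analyze. In other words, the text is parsed according to the format in which the text generation module produced it. For example, the sender information and the message content information can be extracted from the text to be analyzed by id number.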
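The source does not specify the wire format agreed between client and server. As a sketch only, assuming JSON with the hypothetical field names `id`, `sender`, and `content`, the text generation and text receiving steps could look like this:

```python
import json
from itertools import count

# Hypothetical id generator; the text only says each message gets an id.
_next_id = count(1)


def build_text(sender: str, content: str) -> str:
    """Client side: combine one message into a text to be analyzed."""
    record = {"id": next(_next_id), "sender": sender, "content": content}
    # ensure_ascii=False keeps Chinese text readable in the payload.
    return json.dumps(record, ensure_ascii=False)


def parse_text(payload: str):
    """Server side: recover (id, sender, content) from the payload."""
    record = json.loads(payload)
    return record["id"], record["sender"], record["content"]
```

Any format works as long as both sides agree on it; JSON is used here only because it keeps the three fields separate and round-trips Unicode.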
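The semantic analysis module 22 in the server 20 performs semantic analysis on the text to be analyzed passed on by the text receiving module 21 and extracts the name information, number information, number type information, and keep-old-number information in the short message.
Alternatively, the client 10 may omit the text generation module 13. After the information filtering module 12 has obtained the address-book-related short messages, the text sending module 14 can act as the sending module and send those messages directly to the server 20. The semantic analysis module 22 in the server 20 then extracts the relevant information from the messages: the sender information, the message content information, and an id number assigned to each short message are extracted to form the text to be analyzed. The semantic analysis module 22 then performs semantic analysis on this text to extract the name information, number information, number type information, and keep-old-number information in the short message.
The semantic analysis operation of the semantic analysis module 22 is described in detail below with reference to Fig. 3.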
In step S301, the semantic analysis module 22 receives the text to be analyzed for one short message id.
In step S302, the semantic analysis module 22 obtains the number information from the message content information in the text to be analyzed.
Specifically, the number information can be obtained according to the number characteristics of different countries. For example, Chinese telephone numbers are usually 11, 12, or 8 digits long, so in step S302 the semantic analysis module 22 obtains number information by extracting runs of 11, 12, or 8 consecutive digits.
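A minimal sketch of step S302 using a regular expression is shown below. The pattern is an assumption about how "runs of 11, 12, or 8 consecutive digits" might be implemented; the lookaround assertions ensure that part of a longer digit run is not mistaken for a number.

```python
import re

# Exactly 12, 11, or 8 consecutive digits, not embedded in a longer run.
NUMBER_PATTERN = re.compile(r"(?<!\d)(?:\d{12}|\d{11}|\d{8})(?!\d)")


def extract_numbers(content: str) -> list:
    """Return all candidate phone numbers found in the message content."""
    return NUMBER_PATTERN.findall(content)
```

Without the lookarounds, a 9-digit order number would yield a spurious 8-digit match; with them, such runs are skipped entirely.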
Step S303 determines whether the number information was obtained successfully, that is, whether number information was found in the message content information.
If no number information was found in the message content information in step S303, step S305 determines whether the sender information is a number. If step S305 determines that the sender is not a number, the operation ends.
If number information was successfully obtained from the message content information in step S303, the number information is saved, and the name information in the message content information is obtained in step S304.
If no number information was found in the message content information in step S303 (that is, the message content contains no number) but step S305 determines that the sender is a number, that number is saved, and the name information in the message content information is obtained in step S304.
Step S306 determines whether the name information was obtained successfully, that is, whether name information was found in the message content information. The process of obtaining name information is described in detail later with reference to Fig. 4.
If step S306 determines that the name information was not obtained, step S308 determines whether the sender information is a person's name. If step S308 determines that the sender information is not a person's name, the operation ends.
If step S306 determines that the name information was obtained successfully, the name information is saved and step S307 is then performed.
If step S306 determines that the name information was not obtained but step S308 determines that the sender information is a person's name, that name is saved and step S307 is then performed.
In step S307, the semantic analysis module obtains the keep-old-number information from the message content information. This step can be implemented by keyword matching. For example, in Chinese the old number is commonly referred to as "原来的号码" ("the original number"), "旧号" ("old number"), or "原号" ("former number"). Once one of these keywords is found, the sentence containing it is checked for phrases such as "继续使用" ("continue to use") or "不再使用" ("no longer used"). If the sentence contains a phrase meaning "continue to use", the old number is to be kept; if it contains a phrase meaning "no longer used", the old number is not to be kept. Alternatively, the invention can default to not keeping the old number, in which case step S307 is skipped.
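The keyword matching of step S307 can be sketched as below. The keyword tuples contain only the phrases quoted in the text; the sentence-splitting punctuation and the `default` parameter are assumptions added for the example.

```python
import re

OLD_NUMBER_TERMS = ("原来的号码", "旧号", "原号")  # ways of saying "old number"
KEEP_TERMS = ("继续使用",)                        # "continue to use"
DROP_TERMS = ("不再使用",)                        # "no longer used"


def keep_old_number(content: str, default: bool = False) -> bool:
    """Decide from the message content whether the old number is kept."""
    # Split on common Chinese and ASCII sentence-ending punctuation.
    for sentence in re.split(r"[。!?!?;;\n]", content):
        if not any(term in sentence for term in OLD_NUMBER_TERMS):
            continue  # this sentence does not mention the old number
        if any(term in sentence for term in DROP_TERMS):
            return False
        if any(term in sentence for term in KEEP_TERMS):
            return True
    # No decisive sentence found: fall back to the default (not kept).
    return default
```

The default of `False` mirrors the option described above of not keeping the old number when nothing is said about it.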
Next, in step S309, the number type information is obtained.
The process of obtaining the number type information is described in detail later with reference to Fig. 5.
Finally, in step S310, the semantic analysis module 22 passes the number information, name information, number type, and keep-old-number information extracted for the short message id to the text sending module as the analysis result.
In addition, if step S305 determines that the sender is not a number (that is, number information can be obtained neither from the message content nor from the sender), the result is set to empty in step S311 and the operation ends. If step S308 determines that the sender is not a person's name (that is, the number information was obtained but name information can be obtained neither from the message content nor from the sender), the analysis result is set to empty in step S312.
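The fallback logic of Fig. 3 (content first, sender second, empty result otherwise) can be sketched as follows. This is a simplified illustration: the name extractor is passed in as a hypothetical `find_name` callable, the check "sender is a person's name" is approximated as "sender is not a number", and the keep-old-number and number type steps (S307, S309) are omitted.

```python
import re
from typing import Callable, Optional

NUMBER = re.compile(r"(?<!\d)(?:\d{12}|\d{11}|\d{8})(?!\d)")


def analyze(sender: str, content: str,
            find_name: Callable[[str], Optional[str]]) -> Optional[dict]:
    """Return {"number", "name"} or None for an empty analysis result."""
    sender_is_number = sender.lstrip("+").isdigit()

    # S302/S303: try the message content first.
    match = NUMBER.search(content)
    if match:
        number = match.group()
    elif sender_is_number:
        number = sender          # S305: fall back to the sender's number
    else:
        return None              # S311: no number anywhere

    # S304/S306: try to find a name in the content.
    name = find_name(content)
    if name is None:
        if sender_is_number:
            return None          # S312: no name anywhere
        name = sender            # S308: sender is a contact name

    return {"number": number, "name": name}
```

Passing the name extractor as a parameter keeps the control flow testable on its own, independent of whichever name recognition technique is used.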
The process of obtaining name information is described below with reference to Fig. 4, which shows a Chinese name recognition framework.
After the number information has been obtained, the name information in the message content information is obtained in step S304. According to Fig. 3, there are two cases. In the first, the sender is a number (as in step S305); one way to locate the name according to an embodiment of the present invention is then to take the first name extracted as the name corresponding to that number. In the second, the number information was obtained from the message content (that is, step S303 succeeded); name extraction is then performed on the sentence containing the number and, if that fails, on the sentences before and after it, until a name is found or all sentences have been checked.
The characteristics of personal names differ between languages. For Chinese in particular, the main problem is that sentences are not delimited into words, which makes extracting Chinese names very difficult. One feasible approach according to an embodiment of the present invention is to recognize Chinese names by statistical methods.
Fig. 4 shows a framework for Chinese name recognition using statistical semantic units. Several basic concepts used by the framework are introduced first:
Concept 1, sentence meaning: the semantics of any natural language utterance at the language level is called its sentence meaning, denoted ss;
Concept 2, semantic unit: a unit within a sentence meaning that expresses one meaning is called a semantic unit, denoted se;
Concept 3, semantic unit representation: the concrete form in which a semantic unit is expressed in natural language, denoted ser;
Concept 4, semantic unit graph: the directed acyclic graph formed by the semantic unit representations contained in a sentence, covering all possible sequences of semantic units.
In this system a semantic unit is represented as r(se) = (id, se, n, Vp(se), ser, ty, p(se)), where id is the number of se, n is the number of parameters of se, Vp(se) is the parameter vector of se, ser is the representation of se in Chinese, ty is the type of se, and p(se) is the probability of se. Rc = {r(se)} is the Chinese semantic unit representation library, the set of all semantic unit representations; it corresponds to Rc in Fig. 4. Table 1 gives some examples of name-related semantic units in Rc:
Table 1. Examples of semantic units for name components and name boundaries
The databases that appear in the framework diagram are introduced below:
Nc and CpN are the training corpora of this system. Nc is a library of name samples, and CpN is a standard corpus annotated by Peking University's language research institute. By annotating and extracting the content of these two libraries, the name knowledge base, the type library, and the name pattern set are obtained, and the Rc required by this system is derived from the name knowledge base.
The type library stores the category information of each semantic unit. In this system the category information covers both the categories of name components and the categories of the semantic units that mark name boundaries. The ty column of Table 1 stores the category of each semantic unit; for example, N_ss denotes a single-character surname and N_ds denotes a two-character surname.
The name pattern set stores the set of patterns that form valid name combinations, together with the semantic units describing the structure of references to a person. A name combination pattern is a combination of the name-component categories described above: for example, (N_ss, N_sn) denotes a single-character surname with a single-character given name, and (N_ss, N_dnh, N_dnt) denotes a single-character surname with a two-character given name. The semantic units of a person's structure are titles and forms of address placed before or after a surname, given name, or full name: for example, (A_prex, N_ss) denotes a descriptive surname prefix followed by a single-character surname, as in "老刘" ("Old Liu").
V_ser(c) denotes the vector of semantic unit representations found in Rc that begin with the Chinese character c;
V'_ser(c) denotes the vector that remains after matching against the input sentence: unmatched entries are removed and only the matching semantic unit representations are kept.
The overall flow of the framework is as follows:
The framework contains three modules. The extraction module reads from Rc all semantic unit representations that begin with the character c and assembles them into the semantic unit representation vector V_ser(c). The matching module matches each semantic unit in V_ser(c) against the sentence, keeps those that match, and deletes the rest, forming the new vector V'_ser(c). The calculation and recognition module computes the probability of each path in the semantic unit graph, determines the maximum-probability path, and recognizes names on that path using the parameter types of the semantic units, the name pattern set, and the various semantic units that denote a person. The key steps in the whole flow are building the semantic unit graph of each sentence and computing the maximum-probability path. The graph is built by the extraction and matching modules working together; the joint probability of each path in the graph is then computed, the path with the highest probability is selected, and the name on that path is the name being sought.
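The maximum-probability path search can be illustrated with a small dynamic program over character positions. Everything concrete here is an assumption made for the example: the toy `RC` entries and their probabilities, the generic word type "W", the "UNK" fallback for characters not in the library, and the treatment of a path's joint probability as the product of its units' probabilities.

```python
import math
from typing import List, Optional, Tuple

Unit = Tuple[str, str, float]  # (representation ser, type ty, probability p)

# Toy stand-in for the semantic unit representation library Rc.
RC: List[Unit] = [
    ("我", "W", 0.9), ("是", "W", 0.9),
    ("王", "Nss", 0.5), ("王", "W", 0.1),
    ("小", "Ndnh", 0.4), ("明", "Ndnt", 0.4), ("小明", "W", 0.01),
]
NAME_PATTERNS = [("Nss", "Ndnh", "Ndnt"), ("Nss", "Nsn")]


def best_path(sentence: str, rc: List[Unit]) -> List[Unit]:
    """Return the unit sequence covering `sentence` with maximum probability."""
    n = len(sentence)
    score = [-math.inf] * (n + 1)   # best log-probability up to position j
    back: List[Optional[Tuple[int, Unit]]] = [None] * (n + 1)
    score[0] = 0.0
    for i in range(n):
        # Extraction + matching: units that occur in the sentence at i.
        matched = [u for u in rc if u[0] and sentence.startswith(u[0], i)]
        if not matched:
            matched = [(sentence[i], "UNK", 1e-6)]  # unknown character
        for unit in matched:
            j = i + len(unit[0])
            s = score[i] + math.log(unit[2])
            if s > score[j]:
                score[j], back[j] = s, (i, unit)
    path, j = [], n
    while j > 0:                     # follow back-pointers from the end
        i, unit = back[j]
        path.append(unit)
        j = i
    return path[::-1]


def find_name(sentence: str, rc: List[Unit] = RC) -> Optional[str]:
    """Find the first name pattern on the maximum-probability path."""
    path = best_path(sentence, rc)
    types = tuple(u[1] for u in path)
    for pattern in NAME_PATTERNS:
        k = len(pattern)
        for i in range(len(types) - k + 1):
            if types[i:i + k] == pattern:
                return "".join(u[0] for u in path[i:i + k])
    return None
```

Working in log-probabilities turns the product along a path into a sum, which avoids underflow on long sentences; the search itself is a standard shortest-path style pass over a directed acyclic graph.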
It should be understood that the process of obtaining name information from the short message content, described here with reference to FIG. 4, is only an example; those skilled in the art may use other techniques for obtaining name information.
The process of obtaining number type information is now described in detail with reference to FIG. 5.
As shown in FIG. 5, the number of digits in the number is determined first. If the number has 8 digits, it is determined to be a landline number. If it has 11 digits, it is checked whether it begins with 13; if so, the 11-digit number is determined to be a mobile phone number. If the number has 11 digits and its first 3 digits are an area code (the area code library may be consulted), it is determined to be a landline number. If the number has 12 digits, it is further checked whether its first 4 digits are an area code; if so, the number is determined to be a landline number.
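The decision rules of FIG. 5 can be written as a short function. This is a minimal sketch, not the patented implementation: `AREA_CODES` is a tiny hypothetical stand-in for the area code library, the `"unknown"` fallback for numbers that match no rule is an assumption (the source does not say what happens in that case), and only the `13` mobile prefix named in the text is checked.

```python
import re

# Hypothetical stand-in for the "area code library"; a real table is far larger.
AREA_CODES = {"010", "020", "021", "0571", "0755"}


def classify_number(number, area_codes=AREA_CODES):
    """Classify a phone number as 'landline', 'mobile', or 'unknown'."""
    digits = re.sub(r"\D", "", number)   # drop spaces, hyphens, and the like
    n = len(digits)
    if n == 8:
        return "landline"                # bare 8-digit local number
    if n == 11:
        if digits.startswith("13"):
            return "mobile"              # 11 digits beginning with 13
        if digits[:3] in area_codes:
            return "landline"            # 3-digit area code + 8-digit number
    if n == 12 and digits[:4] in area_codes:
        return "landline"                # 4-digit area code + 8-digit number
    return "unknown"                     # assumption: not covered by FIG. 5
```

For example, `classify_number("010-6234-5678")` falls into the 11-digit branch, fails the `13` check, and is classified as a landline because `010` is in the assumed area code table.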
Returning now to FIG. 1, the description of FIG. 1 continues.
After the semantic analysis module 22 passes the analysis result to the text sending module 23 of the server 20, the text sending module 23 sends the analysis result to the client 10.
The text receiving module 15 in the client 10 receives the analysis result sent from the server 20.
The address book update module 16 in the client 10 updates the local address book based on the received analysis result. For example, the address book update module 16 may update the local address book by matching the analysis result against the local address book. When the user opens the short message, the user may further decide whether the address book update module 16 should update the local address book.
The process of updating the local address book is described in detail below with reference to FIG. 6.
As shown in FIG. 6, in step S601 the address book update module 16 determines whether the analysis result is empty. If it is not empty, the analysis result is matched against the local address book in step S602; the matching process of step S602 is described in detail later with reference to FIG. 7. If the analysis result is empty, the operation ends.
In step S603, it is determined whether the result of matching the analysis result against the local address book is empty. If the matching result is empty, the operation ends. If it is not empty, step S604 determines whether the user has currently opened the short message. If so, the matching result is shown to the user in a prompt box in step S605, and in step S606 the user judges whether the matching result is correct. If the user judges it correct, the address book update module 16 updates the address book according to the content of the prompt box in step S608. Otherwise, the user may edit the prompt box (S607), and the address book is updated according to the edited prompt box (S608) (the "No 1" case). Alternatively, if the user simply closes the prompt box (the "No 2" case), the operation ends and the address book update module 16 performs no further processing.
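The control flow of FIG. 6 can be sketched as a single function whose collaborators are passed in as callbacks. Everything named here (`run_update_flow`, `UserChoice`, the callback signatures, and the returned status strings) is hypothetical. The source also does not say what happens when the user has not yet opened the short message at step S604, so the `"pending"` outcome is an assumption.

```python
from enum import Enum


class UserChoice(Enum):
    CONFIRM = "confirm"   # S606: user judges the match correct
    EDIT = "edit"         # "No 1": user edits the prompt box (S607)
    CLOSE = "close"       # "No 2": user closes the prompt box


def run_update_flow(analysis, match, message_opened, prompt, apply_update):
    """Walk through steps S601 to S608 and return a status string.

    match(analysis)      -> matching result, or a falsy value if empty (S602)
    prompt(matched)      -> (UserChoice, edited_result_or_None)   (S605 to S607)
    apply_update(result) -> writes the result to the address book (S608)
    """
    if not analysis:                      # S601: empty analysis result
        return "empty_analysis"
    matched = match(analysis)             # S602
    if not matched:                       # S603: empty matching result
        return "empty_match"
    if not message_opened:                # S604
        return "pending"                  # assumption: wait until opened
    choice, edited = prompt(matched)      # S605 / S606
    if choice is UserChoice.CONFIRM:
        apply_update(matched)             # S608
        return "updated"
    if choice is UserChoice.EDIT:
        apply_update(edited)              # S607 then S608
        return "updated_with_edits"
    return "dismissed"                    # prompt closed, nothing written
```

Passing the matcher, prompt, and writer as callbacks keeps the flow testable without a real user interface or address book.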
The matching process of step S602 is described in detail below with reference to FIG. 7.
As shown in FIG. 7, in step S701 the address book update module 16 matches the person name in the analysis result against the contacts in the local address book. Step S702 determines whether the match succeeded. If it did not, the name in the analysis result belongs to a new contact, and the contact's information is added to the local address book in step S708.
If the match succeeded, in step S703 the address book update module 16 matches the number in the analysis result against the phone numbers of the matched contact in the address book.
Step S704 determines whether the number match succeeded. If it did, the local address book already contains this phone number, so step S706 returns an empty result and the matching operation ends. Step S706 corresponds to "Yes" in step S603 of FIG. 6.
If step S704 determines that the number match failed, step S705 determines whether the old number needs to be retained. For example, the address book update module 16 decides whether to retain the old number according to the old-number retention information in the analysis result. If the old number is to be retained, the new number is added under that contact in step S708. If it is not to be retained, the old number is replaced with the new number in step S707.
Subsequently, in step S709, the address book update module 16 generates an update result, which includes the number information, name information, number type, old-number retention information, and update mode associated with the id. This step corresponds to "No" in step S603 of FIG. 6; that is, step S604 is executed after step S709.
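The matching steps of FIG. 7 can be sketched as follows. This is an illustration under stated assumptions: the address book is modeled as a plain dictionary from contact name to a list of numbers, names are matched by exact string equality, and "replace the old number" is taken to mean replacing all numbers stored for that contact. `AnalysisResult`, `match_and_update`, and the `mode` strings are hypothetical names. The source does not say whether changes are written immediately or staged until the user confirms, so a caller who wants staging can pass a copy of the address book.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class AnalysisResult:
    msg_id: int        # id assigned to the short message
    name: str          # extracted person name
    number: str        # extracted phone number
    number_type: str   # e.g. "mobile" or "landline"
    keep_old: bool     # whether the old number should be retained


def match_and_update(result: AnalysisResult,
                     address_book: Dict[str, List[str]]) -> Optional[dict]:
    """Return an update record (S709), or None for an empty result (S706)."""
    numbers = address_book.get(result.name)          # S701 / S702: match name
    if numbers is None:
        address_book[result.name] = [result.number]  # S708: new contact
        mode = "add_contact"
    elif result.number in numbers:                   # S703 / S704: match number
        return None                                  # S706: already up to date
    elif result.keep_old:                            # S705: retain old number
        numbers.append(result.number)                # S708: add new number
        mode = "add_number"
    else:
        numbers[:] = [result.number]                 # S707: replace old number
        mode = "replace_number"
    return {                                         # S709: update result
        "id": result.msg_id,
        "name": result.name,
        "number": result.number,
        "number_type": result.number_type,
        "keep_old": result.keep_old,
        "mode": mode,
    }
```

A `None` return plays the role of the empty matching result tested in step S603, while a returned record is what would be shown to the user in the prompt box.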
FIG. 8 is a flowchart illustrating a personal address book management method according to an embodiment of the present invention.
In step S801, the monitoring control module 11 in the client 10 monitors whether there is an unread new short message in the short message module.
If a new short message is detected in step S801, the received new short message is filtered in step S802 so that only short messages related to the address book are retained.
Subsequently, in step S803, the relevant content of the received short message is combined into a text to be analyzed, which includes the sender information, the short message content, and an id number assigned to each short message.
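Step S803 can be illustrated by a small record type that bundles the three fields named above. The class name `TextToAnalyze`, the sequential id counter, and the JSON serialization are all assumptions made for the sketch; the source specifies only the three fields, not how ids are generated or how the text is encoded for transmission.

```python
import itertools
import json
from dataclasses import asdict, dataclass


@dataclass
class TextToAnalyze:
    id: int        # id assigned to this short message
    sender: str    # sender information, e.g. the sending number
    content: str   # short message content


_next_id = itertools.count(1)   # assumption: ids are simple sequential integers


def compose_text(sender: str, content: str) -> TextToAnalyze:
    """Combine a filtered short message into the text to be analyzed (S803)."""
    return TextToAnalyze(id=next(_next_id), sender=sender, content=content)


def serialize(text: TextToAnalyze) -> str:
    """Encode the text for sending to the server; JSON is an assumed format."""
    return json.dumps(asdict(text), ensure_ascii=False)
```

Carrying the id through to the analysis result lets the client pair each server response with the short message that produced it.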
In step S804, the text to be analyzed is sent to the server.
In step S805, the server performs semantic analysis on the text to be analyzed and extracts from the short message the name information, number information, number type information, and whether to retain the old number.
In step S806, the server sends the extracted name information, number information, number type information, and old-number retention information to the client as the analysis result.
In step S807, the client updates the local address book based on the received analysis result.
By adopting the personal address book management system and method according to the present invention, the efficiency of address book management can be improved, thereby improving the user experience.
Although the present invention has been particularly shown and described with reference to its embodiments, those skilled in the art will understand that various changes in form and detail may be made without departing from the spirit and scope of the invention as defined by the claims.
Claims (17)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201210306925.XA CN103634760A (en) | 2012-08-27 | 2012-08-27 | System and method for personal address book management |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN103634760A true CN103634760A (en) | 2014-03-12 |
Family
ID=50215305
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201210306925.XA Pending CN103634760A (en) | 2012-08-27 | 2012-08-27 | System and method for personal address book management |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN103634760A (en) |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| RJ01 | Rejection of invention patent application after publication |
Application publication date: 20140312 |
|
| RJ01 | Rejection of invention patent application after publication |