CN102905236A - Method, device and system for monitoring spam short messages - Google Patents

Method, device and system for monitoring spam short messages

Info

Publication number
CN102905236A
Authority
CN
China
Prior art keywords
short message
short
numbers
calling
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011102120339A
Other languages
Chinese (zh)
Other versions
CN102905236B (en)
Inventor
疏星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201110212033.9A priority Critical patent/CN102905236B/en
Publication of CN102905236A publication Critical patent/CN102905236A/en
Application granted granted Critical
Publication of CN102905236B publication Critical patent/CN102905236B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Telephonic Communication Services (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method, device, and system for monitoring spam short messages. The method includes: obtaining a short message; determining, according to the content of the short message, a short message set corresponding to that content, and adding the calling number and called number of the short message to the short message set; and, when the number of short messages sent in the short message set is greater than or equal to a set first threshold, determining whether the short messages in the set are spam according to the propagation trajectory of the short messages in the set. The invention also discloses a spam short message monitoring device and monitoring system. The technical solution addresses the problem of short message senders evading system monitoring by reducing the number of short messages sent from any single number.

Figure 201110212033

Description

Method, device and system for monitoring spam short messages
Technical Field
The present invention relates to spam short message identification technologies, and in particular, to a spam short message monitoring method, device, and system.
Background
Short messages are popular with consumers because they are cheap, easy to use, and convenient. Because a short message can be sent at any time, to any recipient, and at low cost, short messaging is also very convenient for senders of spam short messages. Spam short messages of many types, such as advertising and fraud, are increasingly prevalent and seriously disturb consumers' daily lives.
There are two ways to block spam short messages: blocking on the mobile phone side and blocking on the network side. Blocking on the mobile phone side is limited by the phone's computing capability and can only perform simple blacklist filtering and keyword filtering. Interception on the network side can rely on the strong processing capability of the back end to perform complex analysis, identification, and handling of spam short messages, and it has become the mainstream way of blocking them.
The identification and interception of the spam short messages at the network side are mainly realized by the following means:
Blacklist filtering: the system maintains a list of blacklisted users and directly intercepts short messages sent by users on that list.
Keyword filtering: the system maintains a keyword library and intercepts a short message when it contains a sensitive word from the library. Interception based on keyword plus frequency can be regarded as a subset of this approach.
Sending-behavior monitoring: various monitoring models are built around the calling number. Each model checks, per unit time, whether a single calling number is suspected of a violation along dimensions such as sending volume, called-number pattern, short message content, and time period, and then takes further measures, such as blocking subsequent short messages and adding the number to the blacklist.
In the process of implementing the invention, the inventor found at least the following problems in the prior art. The identification technology used by existing spam-sending behavior analysis models analyzes the sending content and sending behavior of a single calling number; if the number of short messages a number sends per unit time is below the threshold set by the behavior analysis model, the system cannot identify them. A spam sender can probe the thresholds of the various behavior analysis models by repeated trial and then keep the sending volume of each number per unit time slice below the threshold, thereby evading system monitoring. In addition, if the spam sender uses a batch of Subscriber Identity Module (SIM) cards at the same time and sends the same spam short message from them in rotation, the sending volume per unit time slice for any single number can be very small, far below the threshold set by the system, so the existing monitoring system cannot effectively identify and intercept the spam.
Disclosure of Invention
The embodiment of the invention provides a method, a device and a system for monitoring junk short messages, which can solve the technical problem that a short message sending party avoids system monitoring by reducing the sending quantity of short messages of a single number.
One aspect of the present invention provides a method for monitoring spam messages, including:
acquiring a short message;
determining a short message set corresponding to the content according to the content of the short message, and adding a calling number and a called number of the short message in the short message set;
and when the sending quantity of the short messages in the short message set is greater than or equal to a set first threshold value, determining whether the short messages in the message set are spam messages according to the propagation track of the short messages in the message set.
Another aspect of the present invention provides a spam monitoring apparatus, including:
the message acquisition module is used for acquiring short messages;
the data preprocessing module is used for determining a short message set corresponding to the content according to the content of the short message and adding a calling number and a called number of the short message in the short message set;
and the message identification module is used for determining whether the short messages in the message set are spam messages according to the propagation track of the short messages in the message set when the sending quantity of the short messages in the short message set is greater than or equal to a set first threshold value.
The invention also provides a spam message monitoring system, which comprises:
a short message data source device, configured to provide short message data to the spam short message monitoring device for spam identification and to receive the spam identification result; and
the spam message monitoring device as described above: used for obtaining the short message; determining a short message set corresponding to the content according to the content of the short message, and adding a calling number and a called number of the short message in the short message set; and when the sending quantity of the short messages in the short message set is greater than or equal to a set first threshold value, determining whether the short messages in the message set are spam messages according to the propagation track of the short messages in the message set.
It can be seen from the technical solutions provided by the embodiments of the present invention that, in the implementation of the technical solutions of the present invention, whether the short message in the message set is a spam message can be determined according to the propagation trajectory of the short message in the message set, which not only can solve the problem that a short message sender avoids system monitoring by reducing the short message sending flow of a single number, but also can realize the effect of identifying a batch of spam short message sending numbers at a time by analyzing the association relationship between the content sending numbers and receiving numbers sending the same or similar short messages.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive labor.
Fig. 1 is a flowchart of a spam short message monitoring method according to an embodiment of the present invention;
Fig. 2 is a diagram of the propagation trajectory of popular short messages in an embodiment of the present invention;
Fig. 3 is a diagram of the propagation trajectory of spam short messages in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a spam short message monitoring device according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a spam short message monitoring system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a method, a device, and a system for monitoring spam short messages. For each obtained short message, a short message set corresponding to its content is determined according to that content, and the calling number and called number of the short message are added to the set. The spam identification process is carried out only when the number of short messages sent in the set is greater than or equal to a set first threshold. In this way, the large number of short message records sent by ordinary users can be excluded, which saves system resources and improves analysis efficiency.
The short message entering the spam short message identification process determines whether the short message in the message set is a spam message according to the propagation track of the short message in the message set, and by the identification mode, the incidence relation among different spam short message sending numbers can be effectively found, so that the purpose of identifying a batch of spam short message sending numbers at one time is realized.
As shown in fig. 1, a method for monitoring spam messages in an embodiment of the present invention includes:
101. acquiring a short message;
102. determining a short message set corresponding to the content according to the content of the short message, and adding a calling number and a called number of the short message in the short message set;
103. and when the sending quantity of the short messages in the short message set is greater than or equal to a set first threshold value, determining whether the short messages in the message set are spam messages according to the propagation track of the short messages in the message set.
The first threshold may be set as required, such as 500, 1000, 2000, 10000, etc.
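Steps 101 to 103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: messages are modeled as hypothetical `(calling, called, content)` tuples, the raw content string stands in for the short message content value, and the threshold is set to 3 only so that the toy data triggers it.

```python
from collections import defaultdict

# Illustrative value only; the text suggests values such as 500 or 1000.
FIRST_THRESHOLD = 3


def group_by_content(messages):
    """Step 102: map each content to the (calling, called) pairs that carried it."""
    sets = defaultdict(list)
    for calling, called, content in messages:
        sets[content].append((calling, called))
    return sets


def sets_to_analyze(sets, first_threshold=FIRST_THRESHOLD):
    """Step 103 gate: keep only sets whose send count reaches the first threshold."""
    return {c: pairs for c, pairs in sets.items() if len(pairs) >= first_threshold}


# Step 101 is simulated with a hard-coded batch of messages.
messages = [
    ("13700000000", "13700000001", "win a prize"),
    ("13700000000", "13700000002", "win a prize"),
    ("13800000000", "13800000001", "win a prize"),
    ("13900000000", "13900000001", "see you at 6"),
]
candidates = sets_to_analyze(group_by_content(messages))
```

Only the set for "win a prize" passes the gate here, which is how ordinary one-off messages are excluded before the more expensive trajectory analysis.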
In an embodiment of the present invention, in step 101, the short message may be obtained by using a real-time obtaining method or a non-real-time obtaining method:
the method for acquiring the short message in real time comprises the following steps:
interfacing with the short message data source device through the Short Message Peer to Peer (SMPP) protocol or a private protocol to obtain messages to be analyzed in real time; the short message data source device can be a short message service center or a short message gateway.
Specific examples are as follows:
1) the spam message monitoring system collects the short messages from the short message service center in real time through a real-time interface, wherein the real-time interface can be an SMPP interface and the like.
2) An SPM (Signal Process Machine, a signaling processing device) captures Signaling System No. 7 (SS7) short message signaling on the network link and then sends the short messages to the spam monitoring system through a custom Transmission Control Protocol (TCP)/Internet Protocol (IP) interface.
In addition, the non-real-time short message acquisition mode comprises the following steps:
the method comprises the steps of obtaining short message ticket data from short message data source equipment through a non-real-time interface for analysis, wherein the non-real-time interface can be an FTP (File Transfer Protocol) interface and the like.
Specifically, the spam monitoring system can collect MO (mobile originated) tickets from the short message service center through the FTP interface, so as to realize non-real-time acquisition of short message data.
Optionally, after the short message is acquired, its content, calling number, called number, and sending time can be extracted. The content can then be processed to remove interference; in particular, special characters in the content can be removed.
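The extraction and special-character removal can be sketched as below. The record layout `ShortMessage` and the definition of "special character" (anything that is not a Unicode letter or digit) are assumptions made for illustration; the text does not specify which characters count as special.

```python
import re
from typing import NamedTuple


class ShortMessage(NamedTuple):
    """Hypothetical record holding the four extracted fields."""
    calling: str
    called: str
    sent_at: str
    content: str


def strip_special_characters(content: str) -> str:
    """Remove every character that is not a Unicode letter or digit.

    Assumption: punctuation, whitespace, and underscores are the
    'special characters' that senders insert to disguise identical content.
    """
    return re.sub(r"[\W_]+", "", content)


msg = ShortMessage("13700000000", "13700000001", "2011-07-27 10:00:00", "W*i*n a pr-i.ze!")
normalized = strip_special_characters(msg.content)
```

After normalization, variants of the same text that differ only in inserted symbols map to the same string and can therefore fall into the same short message set.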
In an optional embodiment of the present invention, for the content of the extracted short message or the content of the short message from which the special characters are removed, the step 102 determines a short message set corresponding to the content of the short message according to the content of the short message, including:
compressing the content of the short message by using a data compression algorithm to obtain a short message content value;
and determining a short message set corresponding to the content of the short message according to the short message content value.
Specifically, the data compression algorithm in the embodiment of the present invention may be a hash algorithm such as TianlHash, StrHash, ELFHash, or Hflp. Compressing the content of the short message then means calculating a hash value of the content according to that algorithm, and the hash value may be used directly as the short message content value.
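As an illustration of turning content into a short message content value, the sketch below implements the classic ELF hash, assumed here to be the ELFHash the text refers to. Encoding the content as UTF-8 and masking to 32 bits are assumptions of this sketch, not details given in the text.

```python
def elf_hash(data: bytes) -> int:
    """Classic ELF hash, emulating 32-bit unsigned arithmetic."""
    h = 0
    for byte in data:
        h = ((h << 4) + byte) & 0xFFFFFFFF
        high = h & 0xF0000000  # top four bits of the running value
        if high:
            h ^= high >> 24    # fold the top bits back into the low bits
        h &= ~high & 0xFFFFFFFF  # clear the top four bits
    return h


def content_value(content: str) -> int:
    """Short message content value; UTF-8 encoding is an assumption."""
    return elf_hash(content.encode("utf-8"))


value_a = content_value("win a prize")
value_b = content_value("win a prize")
```

Identical content always yields the same value, which is the property the set lookup depends on. Different content can collide on the same value, so a production system would probably also need to compare the content itself.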
Further optionally, determining, according to the content of the short message, a short message set corresponding to the content specifically includes:
searching a corresponding short message set according to the message content value; if the corresponding short message set is found according to the message content value, taking the found short message set as the short message set corresponding to the content; and if the corresponding short message set is not found according to the content value, generating the short message set corresponding to the content. In the embodiment of the invention, each short message content value has a corresponding short message set.
Specifically, in an embodiment of the present invention, the data structure in the short message set may be represented as the following table:
Figure BDA0000079016230000061
optionally, in another embodiment of the present invention, the table above may further increase the information of the sending amount of the short message, and each time an element is added in the table above, the sending amount of the short message is increased by 1. It is to be understood that the sending amount of the short message may also be stored in another location, for example, in another table with the short message content value as an index, the table storing the corresponding relationship between the short message content value and the sending amount of the short message.
Optionally, in step 102, adding the calling number and the called number of the short message to the short message set, including:
and adding an element in the short message set, wherein the calling number and the called number are used as information of the element.
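The find-or-create lookup and the element insertion described above can be sketched as follows. The class and field names (`ShortMessageSet`, `SetStore`, `send_count`) are hypothetical; the original data structure table is not reproduced in this text, so the layout is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class ShortMessageSet:
    """One set per short message content value (assumed layout)."""
    content_value: int
    elements: list = field(default_factory=list)  # (calling, called) pairs
    send_count: int = 0

    def add(self, calling: str, called: str) -> None:
        # Each added element increases the sending amount by 1.
        self.elements.append((calling, called))
        self.send_count += 1


class SetStore:
    """Index of short message sets keyed by content value."""

    def __init__(self) -> None:
        self._sets = {}

    def find_or_create(self, content_value: int) -> ShortMessageSet:
        found = self._sets.get(content_value)
        if found is None:  # no set yet for this content value: generate one
            found = ShortMessageSet(content_value)
            self._sets[content_value] = found
        return found

    def __len__(self) -> int:
        return len(self._sets)


store = SetStore()
store.find_or_create(1650).add("13700000000", "13700000001")
store.find_or_create(1650).add("13700000000", "13700000002")
store.find_or_create(97).add("13900000000", "13900000001")
```

Keeping `send_count` on the set itself corresponds to the variant in which the table also stores the sending amount; storing it in a separate table indexed by content value would work equally well.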
In an alternative embodiment of the present invention, step 103 may comprise:
counting the number of calling numbers with out degree greater than 0 in the short message set; calculating the ratio of the number of the calling numbers with the out degree greater than 0 to all the numbers in the short message set, wherein all the numbers comprise the calling numbers and the called numbers; and when the ratio is smaller than a set second threshold value, determining that the short messages in the short message set are suspected spam messages, and the calling numbers with the outgoing degree larger than 0 are suspected spam message sending numbers. Specifically, the following procedure may be adopted to count the number of calling numbers with out-degree greater than 0 in the short message set: setting a calling number set; sequentially extracting elements in the short message set, determining whether the calling number of the current element is stored in the calling number set, and if so, adding 1 to the out-degree of the calling number in the calling number set; if not, the calling number is added to the calling number set and the out degree is set to 1. After traversing all the elements in the short message set, the number of calling numbers with out degree greater than 0 can be determined.
Let t denote the number of calling numbers with out-degree greater than 0 in the short message set and T the number of all numbers; the ratio can then be expressed as r = t/T. The second threshold is an empirical value and may be set as desired, such as 1%, 5%, or 10%. When r is smaller than the second threshold, the short messages in the set are determined to be suspected spam short messages, and the calling numbers with out-degree greater than 0 are determined to be suspected spam sending numbers.
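The out-degree counting and the ratio test can be sketched as below, assuming each element of the short message set is a `(calling, called)` pair. The function names are illustrative only.

```python
def out_degree_ratio(elements):
    """Return (r, senders), where r = t / T.

    t is the number of calling numbers with out-degree > 0 and
    T is the number of distinct numbers (calling and called) in the set.
    """
    if not elements:
        raise ValueError("the short message set is empty")
    out_degree = {}      # plays the role of the calling number set
    all_numbers = set()
    for calling, called in elements:
        # Add 1 if the calling number is already stored, otherwise start at 1.
        out_degree[calling] = out_degree.get(calling, 0) + 1
        all_numbers.update((calling, called))
    return len(out_degree) / len(all_numbers), sorted(out_degree)


def is_suspected_spam(elements, second_threshold):
    """Suspected spam when the ratio is smaller than the second threshold."""
    ratio, _ = out_degree_ratio(elements)
    return ratio < second_threshold


# One number sending the same content to 20 recipients: r = 1/21.
broadcast = [("13700000000", f"137{n:08d}") for n in range(1, 21)]
# A forwarding chain, as with popular messages: r = 3/4.
chain = [("A", "B"), ("B", "C"), ("C", "D")]
```

With a second threshold of 5%, `is_suspected_spam(broadcast, 0.05)` is true and `is_suspected_spam(chain, 0.05)` is false, because in the chain most numbers both receive and send.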
A popular short message, such as a greeting or a joke, is usually forwarded repeatedly by users, so the sending amount of the content can easily reach the set first threshold. Fig. 2 illustrates the propagation trajectory of a popular short message: after 13800000000 sends a short message to 13800000001, 13800000002 and 13800000003, 13800000002 forwards it to 13800000003 and 13800000004, and 13800000004, after receiving it, forwards it to 13800000003.
However, a user of a spam short message usually cannot forward the spam short message after receiving the spam short message, the number of calling numbers of the short message is limited, and the number of called numbers is many, so that the propagation track is relatively single, that is, the number of sent contents is usually smaller than the set second threshold. Fig. 3 illustrates a propagation trajectory of spam short messages, as shown in fig. 3, after 13700000000 sends a spam short message to 13700000001, 13700000002, 13700000003, 13700000004 and 13700000005, 13700000001, 13700000002, 13700000003, 13700000004 and 13700000005 do not forward the spam short message any more; 13800000000 sends a spam short message to 13800000001, 13800000002, 13800000003, 13800000004 and 13800000005, 13800000001, 13800000002, 13800000003, 13800000004 and 13800000005 will not forward the spam short message. In this case, 13700000000 and 13800000000 are the numbers suspected of sending spam messages. Since many short messages sent by Service Providers (SPs), companies, etc. have the propagation path of the spam short message, and short messages sent by SPs, companies, etc. cannot be regarded as spam short messages, a white list may be set, and short messages sent by SPs, companies, etc. are not put in a short message set for processing.
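As a worked illustration, take $t$ as the number of calling numbers with out-degree greater than 0, $T$ as the number of all numbers, and $r = t/T$, and apply them to the two example trajectories. In Fig. 2 the senders are 13800000000, 13800000002 and 13800000004, and five distinct numbers appear. In Fig. 3 the senders are 13700000000 and 13800000000, and twelve distinct numbers appear.

```latex
\[
r_{\text{Fig. 2}} = \frac{t}{T} = \frac{3}{5} = 0.6,
\qquad
r_{\text{Fig. 3}} = \frac{t}{T} = \frac{2}{12} \approx 0.167
\]
```

These toy graphs are far smaller than a set that has reached the first threshold, so neither ratio would fall below example second thresholds such as 1% to 10%. They only show the direction of the effect: forwarding raises $r$, while one-way broadcasting lowers it as the number of recipients grows.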
Therefore, the identification method can effectively discover the association relation among different spam short message sending numbers, and realize the effect of identifying a batch of spam short message sending numbers at one time.
Specifically, in the embodiment of the present invention, the number of the calling numbers with out-degree greater than 0 in the short message set is counted, and the short message propagation trajectory of the message set may be used for counting, and specifically, a directed graph G (V, E) may be used for representing, where V is a set of all elements in a message set, each element is composed of a calling number and a called number, and E is a set of all short message contents;
$n = |V|$ represents the number of all numbers in $V$, and $e = |E|$ represents the amount of short message transmission in $E$;
a directed edge $(i, j)$ between number $i$ and number $j$ represents number $i$ sending a short message to number $j$, where $i$ and $j$ are taken from the set $V$;
$d_i^{in}$ denotes the in-degree of number $i$, that is, the number of short messages with $i$ as the called number; $d_i^{out}$ denotes the out-degree of number $i$, that is, the number of short messages with $i$ as the calling number; and $\sum_{i=1}^{n} d_i^{in} = \sum_{i=1}^{n} d_i^{out} = e$;
For a directed edge (i, j), number i is adjacent to number j, and number j is adjacent from number i. An adjacency list represents the set of other numbers adjacent to or from a given number; in the embodiment of the present invention it represents the set of numbers that a given number has called or been called by.
The foregoing embodiment counts the calling numbers with out-degree greater than 0 in the short message set, taking the out-degree of numbers as an example. If the adjacency list is applied in that scenario, then when the calling number in the information of the current element is already included in the calling number set, the out-degree of that calling number is increased by 1 in the calling number set and the called number is added to its adjacency list; if not, the calling number is added to the calling number set, its out-degree is set to 1, and the called number is added to its adjacency list;
specifically, in the embodiment of the present invention, the following may be referred to describe the data structure of each element in the element set V in a manner of using an adjacency linked list:
Figure BDA0000079016230000091
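Because the original table is not reproduced here, the sketch below shows one possible adjacency-list layout; the field names are assumptions. It builds the directed graph for the Fig. 2 trajectory and lets the in-degree and out-degree sums be checked against the number of short messages sent.

```python
from dataclasses import dataclass, field


@dataclass
class NumberNode:
    """Assumed per-number record in the adjacency-list representation."""
    number: str
    out_degree: int = 0
    in_degree: int = 0
    adjacent_to: list = field(default_factory=list)    # numbers this number sent to
    adjacent_from: list = field(default_factory=list)  # numbers that sent to this number


def build_adjacency(elements):
    """Build one NumberNode per number from (calling, called) pairs."""
    nodes = {}
    for calling, called in elements:
        src = nodes.setdefault(calling, NumberNode(calling))
        dst = nodes.setdefault(called, NumberNode(called))
        src.out_degree += 1
        src.adjacent_to.append(called)
        dst.in_degree += 1
        dst.adjacent_from.append(calling)
    return nodes


# Edges of the popular-message trajectory described for Fig. 2.
fig2_edges = [
    ("13800000000", "13800000001"),
    ("13800000000", "13800000002"),
    ("13800000000", "13800000003"),
    ("13800000002", "13800000003"),
    ("13800000002", "13800000004"),
    ("13800000004", "13800000003"),
]
nodes = build_adjacency(fig2_edges)
total_out = sum(node.out_degree for node in nodes.values())
total_in = sum(node.in_degree for node in nodes.values())
```

Every edge contributes exactly one unit of out-degree and one unit of in-degree, so both totals equal the number of short messages sent.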
since popular short messages are forwarded between different users and recipients of spam short messages are not substantially forwarded, step 103 may be implemented in another alternative embodiment of the present invention, comprising:
counting the number of called numbers with in-degree greater than 0 in the short message set;
calculating the ratio of the number of called numbers with in-degree greater than 0 to all the numbers in the short message set, wherein all the numbers comprise calling numbers and called numbers;
and when the ratio is larger than a set third threshold value, determining that the short messages in the short message set are suspected spam messages, and that the calling numbers sending the short messages in the set are suspected spam sending numbers. The third threshold is also an empirical value and may be set as needed, such as 99%, 95%, or 90%.
In an optional embodiment of the present invention, the step 102 of adding the calling number and the called number of the short message to the short message set includes:
adding an element in the short message set, and taking the calling number and the called number as information of the element;
the counting the number of called numbers with the incoming degree greater than 0 in the short message set comprises:
sequentially acquiring information of elements in the short message set;
judging whether the called number in the information of the current element is included in a called number set; if yes, adding 1 to the in-degree of that called number in the called number set; if not, adding the called number to the called number set and setting its in-degree to 1.
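The in-degree variant can be sketched in the same way, again assuming `(calling, called)` pairs as set elements and using illustrative function names.

```python
def in_degree_ratio(elements):
    """Ratio of called numbers with in-degree > 0 to all distinct numbers."""
    if not elements:
        raise ValueError("the short message set is empty")
    in_degree = {}       # plays the role of the called number set
    all_numbers = set()
    for calling, called in elements:
        # Add 1 if the called number is already stored, otherwise start at 1.
        in_degree[called] = in_degree.get(called, 0) + 1
        all_numbers.update((calling, called))
    return len(in_degree) / len(all_numbers)


def is_suspected_spam_by_in_degree(elements, third_threshold):
    """Suspected spam when the ratio is larger than the third threshold."""
    return in_degree_ratio(elements) > third_threshold


# One number sending the same content to 20 recipients: ratio = 20/21.
broadcast = [("13700000000", f"137{n:08d}") for n in range(1, 21)]
# A forwarding chain: ratio = 3/4.
chain = [("A", "B"), ("B", "C"), ("C", "D")]
```

With a third threshold of 95%, the broadcast set is flagged (20/21 is about 95.2%) and the chain is not.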
It should be noted that this embodiment describes the identification process in terms of the in-degree of numbers. The principle is the same as in the embodiment based on the out-degree of numbers, so the details are not repeated here; refer to that embodiment for the specific scheme.
In an optional embodiment of the present invention, when it is determined that the short message in the message set is a spam message according to the propagation trajectory of the short message in the message set, the method may further include:
processing the spam short message and the number for sending the spam short message;
specifically, the general suspected spam short message processing method includes at least one of the following three methods:
1) adding the number judged to have sent the suspected spam short message to a blacklist, and synchronizing the blacklist to external systems, such as an SMSC (Short Message Service Center) or a BOSS (Business and Operation Support System);
2) sending the number judged to send the suspected spam short message, the content of the short message, the sending quantity of the short message and other information to a manual auditing platform for manual secondary confirmation;
3) and adding keywords into the corresponding short message content for interception.
As shown in fig. 4, a monitoring device for spam messages according to an embodiment of the present invention includes:
a message acquisition module 21, configured to acquire a short message;
a data preprocessing module 22, configured to determine a short message set corresponding to the content according to the content of the short message, and add a calling number and a called number of the short message to the short message set;
a message identification module 23, configured to determine, when the sending number of the short messages in the short message set is greater than or equal to a set first threshold, whether the short messages in the message set are spam messages according to the propagation track of the short messages in the message set.
Optionally, the message collection module 21 may be specifically configured to:
interface with the short message data source device through a real-time interface and obtain messages to be analyzed in real time; or,
and acquiring the call ticket data of the short message in the short message data source equipment in a non-real-time manner through a non-real-time interface for analysis.
Further optionally, the message acquisition module sends the acquired short message data packets to the data preprocessing module. When short message traffic is heavy and the processing capacity of a single server is limited, the data preprocessing module needs to be deployed as a distributed cluster. In that case, the traditional load balancing mode, which distributes by calling number, cannot guarantee that short messages with the same content but different calling numbers are distributed to the same server. This scheme therefore provides two load balancing modes:
The first distributes according to the content length of the short message. For example, short messages with content within 20 bytes are sent to server 1, those of 20-39 bytes to server 2, those of 40-70 bytes to server 3, and those longer than 70 bytes to server 4;
The second converts the content of the short message into a short message content value (such as an integer) by some algorithm and then applies a traditional load balancing mode, for example balancing according to the trailing digits of the short message content value.
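Both load balancing modes can be sketched as below. Several details are assumptions of this sketch: length is measured in UTF-8 bytes, a content of exactly 20 bytes goes to server 2, CRC-32 stands in for the unspecified conversion algorithm, and the remainder modulo the server count plays the role of the trailing digits of the content value.

```python
import zlib


def server_by_length(content: str) -> int:
    """Mode 1: choose a server (1 to 4) from the content length in bytes."""
    length = len(content.encode("utf-8"))  # assumed encoding
    if length < 20:
        return 1
    if length < 40:
        return 2
    if length <= 70:
        return 3
    return 4


def server_by_content_value(content: str, server_count: int) -> int:
    """Mode 2: choose a server index (0-based) from the content value."""
    value = zlib.crc32(content.encode("utf-8"))  # stand-in content value
    return value % server_count


same_a = server_by_content_value("win a prize", 4)
same_b = server_by_content_value("win a prize", 4)
```

In both modes the choice depends only on the content, so messages with the same content but different calling numbers reach the same server, which the per-content sets require. The length-based mode can load servers unevenly if message lengths cluster; the value-based mode spreads load more evenly when the content values are well distributed.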
In an embodiment of the present invention, the message identification module 23 is specifically configured to: counting the number of calling numbers with out degree greater than 0 in the short message set; calculating the ratio of the number of the calling numbers with the out degree greater than 0 to all the numbers in the short message set, wherein all the numbers comprise the calling numbers and the called numbers; and when the ratio is smaller than a set second threshold value, determining that the short messages in the short message set are suspected spam messages, and the calling numbers with the outgoing degree larger than 0 are suspected spam message sending numbers.
In an embodiment of the present invention, the data preprocessing module 22 adds the calling number and the called number of the short message in the short message set, which may specifically include: adding an element in the short message set, and taking the calling number and the called number as information of the element;
the message identification module 23 counts the number of the calling numbers with out degree greater than 0 in the short message set, including: sequentially acquiring information of elements in the short message set; judging whether the calling number in the information of the current element is included in the calling number set; if yes, adding 1 to the out degree of the calling number in the information of the current element in the calling number set; if not, adding the calling number in the information of the current element in the calling number set, and setting the out degree of the calling number in the information of the current element as 1.
In another embodiment of the present invention, the message identification module 23 is specifically configured to: count the number of called numbers with an in-degree greater than 0 in the short message set; calculate the ratio of the number of called numbers with an in-degree greater than 0 to all the numbers in the short message set, wherein all the numbers comprise the calling numbers and the called numbers; and, when the ratio is larger than a set third threshold value, determine that the short messages in the short message set are suspected spam messages and that the calling numbers sending the short messages in the short message set are suspected spam message sending numbers.
In another embodiment of the present invention, the data preprocessing module 22 adds the calling number and the called number of the short message to the short message set, specifically including: adding an element in the short message set, and taking the calling number and the called number as information of the element;
the message identification module 23 counts the number of called numbers with an in-degree greater than 0 in the short message set, including: sequentially acquiring the information of the elements in the short message set; judging whether the called number in the information of the current element is already included in a called number set; if yes, adding 1 to the in-degree of that called number in the called number set; if not, adding that called number to the called number set and setting its in-degree to 1.
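The in-degree variant can be sketched in the same way. As before, the tuple representation of an element, the reading of "all the numbers" as distinct numbers, and the function names are assumptions made only for illustration.

```python
def count_in_degrees(elements):
    """Build the called number set with an in-degree per called number."""
    in_degree = {}
    for _calling, called in elements:
        if called in in_degree:
            in_degree[called] += 1   # already in the called number set
        else:
            in_degree[called] = 1    # first occurrence: add with in-degree 1
    return in_degree


def suspected_by_in_degree(elements, third_threshold):
    """Return (is_suspected, suspected_senders) for one short message set."""
    elements = list(elements)
    in_degree = count_in_degrees(elements)
    # Assumption: "all the numbers" means distinct calling and called numbers.
    all_numbers = {number for pair in elements for number in pair}
    if not all_numbers:
        return False, set()
    ratio = len(in_degree) / len(all_numbers)
    if ratio > third_threshold:
        # Every calling number that sent a message in this set is flagged.
        return True, {calling for calling, _called in elements}
    return False, set()
```

Here a high ratio means that most numbers in the set only receive the message, which is the fan-out pattern the third threshold is meant to catch.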
In an optional embodiment of the present invention, the determining, by the data preprocessing module 22, a short message set corresponding to the content of the short message according to the content of the short message specifically includes: compressing the content of the short message by using a data compression algorithm to obtain a short message content value; and determining a short message set corresponding to the content of the short message according to the short message content value.
In an optional embodiment of the present invention, the determining, by the data preprocessing module 22, of a short message set corresponding to the content according to the content of the short message further includes: searching for a corresponding short message set according to the short message content value; if a corresponding short message set is found, taking the found short message set as the short message set corresponding to the content; and if no corresponding short message set is found, generating a short message set corresponding to the content.
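The lookup-or-create step can be sketched as below. The text only says that a data compression algorithm produces the content value; an MD5 digest is swapped in here as an assumed stand-in, and the dictionary layout and function names are hypothetical.

```python
import hashlib


def content_value(content: str) -> str:
    """Reduce message text to a short content value (MD5 is an assumed stand-in)."""
    return hashlib.md5(content.encode("utf-8")).hexdigest()


def add_message(sets: dict, content: str, calling: str, called: str) -> list:
    """Find the short message set for this content, creating it if absent,
    then add one element holding the calling and called numbers."""
    key = content_value(content)
    if key not in sets:
        sets[key] = []          # no set found: generate a new one
    sets[key].append((calling, called))
    return sets[key]
```

Note that a plain digest groups only identical texts; grouping similar but not identical content would need a different content value function.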
It should be noted that the embodiment of the spam short message monitoring device in the present invention is derived directly from the method embodiment and includes the same or corresponding technical solutions. Each module corresponds to a step of the method embodiment, so reference may be made to the related description of the method embodiment for details.
As shown in fig. 5, a spam monitoring system according to an embodiment of the present invention includes:
a short message data source device 31, configured to provide short message data to the spam short message monitoring device for spam short message identification, and to receive the spam short message identification result;
and the spam short message monitoring device 32 provided by the embodiment of the present invention.
The method, the device and the system for monitoring spam short messages provided by the embodiments of the invention can not only solve the problem that a sender who evades system monitoring by reducing the sending volume of each single number cannot be detected in time, but can also identify a batch of spam short message sending numbers at one time by analyzing the association between the sending numbers and the receiving numbers of short messages with the same or similar content.
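The overall flow, in which a content set is examined only after its message count reaches the first threshold, can be sketched as follows. The class name, the use of raw content as the grouping key, and the choice of the out-degree test for the propagation check are assumptions for brevity, not details confirmed by the text.

```python
from collections import defaultdict


class SpamMonitor:
    """Illustrative end-to-end sketch: group by content, gate on volume,
    then inspect who sends to whom within the set."""

    def __init__(self, first_threshold: int, second_threshold: float):
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        self.sets = defaultdict(list)   # content -> [(calling, called), ...]

    def observe(self, content: str, calling: str, called: str) -> set:
        """Record one message; return the suspected sending numbers, if any."""
        elements = self.sets[content]
        elements.append((calling, called))
        if len(elements) < self.first_threshold:
            return set()                # volume still below the first threshold
        callers = {c for c, _ in elements}
        numbers = callers | {d for _, d in elements}
        if len(callers) / len(numbers) < self.second_threshold:
            return callers              # the whole batch of senders is flagged
        return set()
```

Because the gate counts messages per content rather than per number, spreading the same text across many low-volume senders does not keep the set below the first threshold.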
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (15)

1. A method for monitoring spam short messages is characterized by comprising the following steps:
acquiring a short message;
determining a short message set corresponding to the content according to the content of the short message, and adding a calling number and a called number of the short message in the short message set;
and when the sending quantity of the short messages in the short message set is greater than or equal to a set first threshold value, determining whether the short messages in the short message set are spam messages according to the propagation track of the short messages in the short message set.
2. The method of claim 1, wherein the determining whether the short messages in the short message set are spam messages according to the propagation track of the short messages in the short message set comprises:
counting the number of calling numbers with out degree greater than 0 in the short message set;
calculating the ratio of the number of the calling numbers with the out degree greater than 0 to all the numbers in the short message set, wherein all the numbers comprise the calling numbers and the called numbers;
and when the ratio is smaller than a set second threshold value, determining that the short messages in the short message set are suspected spam messages, and that the calling numbers with an out-degree greater than 0 are suspected spam message sending numbers.
3. The method of claim 2, wherein said adding the calling number and the called number of the short message to the set of short messages comprises:
adding an element in the short message set, and taking the calling number and the called number as information of the element;
the counting the number of the calling numbers with the out-degree greater than 0 in the short message set comprises:
sequentially acquiring information of elements in the short message set;
judging whether the calling number in the information of the current element is included in the calling number set; if yes, adding 1 to the out degree of the calling number in the information of the current element in the calling number set; if not, adding the calling number in the information of the current element in the calling number set, and setting the out degree of the calling number in the information of the current element as 1.
4. The method of claim 1, wherein the determining whether the short messages in the short message set are spam messages according to the propagation track of the short messages in the short message set comprises:
counting the number of called numbers with an in-degree greater than 0 in the short message set;
calculating the ratio of the number of the called numbers with an in-degree greater than 0 to all the numbers in the short message set, wherein all the numbers comprise the calling numbers and the called numbers;
and when the ratio is larger than a set third threshold value, determining that the short messages in the short message set are suspected spam messages, and that the calling numbers sending the short messages in the short message set are suspected spam message sending numbers.
5. The method of claim 4, wherein said adding the calling number and the called number of the short message to the set of short messages comprises:
adding an element in the short message set, and taking the calling number and the called number as information of the element;
the counting the number of called numbers with an in-degree greater than 0 in the short message set comprises:
sequentially acquiring information of elements in the short message set;
judging whether the called number in the information of the current element is included in a called number set; if yes, adding 1 to the in-degree of the called number in the information of the current element in the called number set; if not, adding the called number in the information of the current element to the called number set, and setting the in-degree of the called number in the information of the current element to 1.
6. The method according to any one of claims 1 to 5, wherein said determining a short message set corresponding to the content of the short message according to the content of the short message comprises:
compressing the content of the short message by using a data compression algorithm to obtain a short message content value;
and determining a short message set corresponding to the content of the short message according to the short message content value.
7. The method according to claim 6, wherein determining the short message set corresponding to the content according to the content of the short message is specifically:
searching a corresponding short message set according to the message content value;
if the corresponding short message set is found according to the message content value, taking the found short message set as the short message set corresponding to the content;
and if the corresponding short message set is not found according to the content value, generating the short message set corresponding to the content.
8. A spam monitoring device, comprising:
the message acquisition module is used for acquiring short messages;
the data preprocessing module is used for determining a short message set corresponding to the content according to the content of the short message and adding a calling number and a called number of the short message in the short message set;
and the message identification module is used for determining whether the short messages in the short message set are spam messages according to the propagation track of the short messages in the short message set when the sending quantity of the short messages in the short message set is greater than or equal to a set first threshold value.
9. The device of claim 8, wherein the message identification module is specifically configured to:
counting the number of calling numbers with out degree greater than 0 in the short message set;
calculating the ratio of the number of the calling numbers with the out degree greater than 0 to all the numbers in the short message set, wherein all the numbers comprise the calling numbers and the called numbers;
and when the ratio is smaller than a set second threshold value, determining that the short messages in the short message set are suspected spam messages, and that the calling numbers with an out-degree greater than 0 are suspected spam message sending numbers.
10. The apparatus of claim 9, wherein the data preprocessing module adds the calling number and the called number of the short message to a short message set, and specifically comprises:
adding an element in the short message set, and taking the calling number and the called number as information of the element;
the message identification module counts the number of the calling numbers with the out-degree greater than 0 in the short message set, and specifically includes: sequentially acquiring information of elements in the short message set; judging whether the calling number in the information of the current element is included in the calling number set; if yes, adding 1 to the out degree of the calling number in the information of the current element in the calling number set; if not, adding the calling number in the information of the current element in the calling number set, and setting the out degree of the calling number in the information of the current element as 1.
11. The device of claim 8, wherein the message identification module is specifically configured to:
counting the number of called numbers with an in-degree greater than 0 in the short message set;
calculating the ratio of the number of the called numbers with an in-degree greater than 0 to all the numbers in the short message set, wherein all the numbers comprise the calling numbers and the called numbers;
and when the ratio is larger than a set third threshold value, determining that the short messages in the short message set are suspected spam messages, and that the calling numbers sending the short messages in the short message set are suspected spam message sending numbers.
12. The apparatus of claim 11, wherein the data preprocessing module adds the calling number and the called number of the short message to a short message set, and specifically comprises:
adding an element in the short message set, and taking the calling number and the called number as information of the element;
the message identification module counts the number of called numbers with an in-degree greater than 0 in the short message set, and specifically includes: sequentially acquiring information of elements in the short message set; judging whether the called number in the information of the current element is included in a called number set; if yes, adding 1 to the in-degree of the called number in the information of the current element in the called number set; if not, adding the called number in the information of the current element to the called number set, and setting the in-degree of the called number in the information of the current element to 1.
13. The apparatus according to any one of claims 8 to 12, wherein the data preprocessing module determines a short message set corresponding to the content of the short message according to the content of the short message, and specifically includes:
compressing the content of the short message by using a data compression algorithm to obtain a short message content value;
and determining a short message set corresponding to the content of the short message according to the short message content value.
14. The apparatus according to claim 13, wherein the data preprocessing module determines a short message set corresponding to the content according to the content of the short message, and specifically includes:
searching a corresponding short message set according to the message content value;
if the corresponding short message set is found according to the message content value, taking the found short message set as the short message set corresponding to the content;
and if the corresponding short message set is not found according to the content value, generating the short message set corresponding to the content.
15. A spam monitoring system, comprising:
a short message data source device, configured to provide short message data to a spam short message monitoring device for spam short message identification, and to receive the spam short message identification result; and
a spam monitoring device according to any one of claims 8 to 14.
CN201110212033.9A 2011-07-27 2011-07-27 A kind of junk short message monitoring method, Apparatus and system Expired - Fee Related CN102905236B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110212033.9A CN102905236B (en) 2011-07-27 2011-07-27 A kind of junk short message monitoring method, Apparatus and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110212033.9A CN102905236B (en) 2011-07-27 2011-07-27 A kind of junk short message monitoring method, Apparatus and system

Publications (2)

Publication Number Publication Date
CN102905236A (en) 2013-01-30
CN102905236B (en) 2016-08-17

Family

ID=47577233

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110212033.9A Expired - Fee Related CN102905236B (en) 2011-07-27 2011-07-27 A kind of junk short message monitoring method, Apparatus and system

Country Status (1)

Country Link
CN (1) CN102905236B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101321365A (en) * 2008-07-17 2008-12-10 浙江大学 A method for identifying users of spam text messages using text message reply frequency
CN101335920A (en) * 2008-07-15 2008-12-31 中国联合通信有限公司 Rubbish short message recognition system and method based on calling number location and transmitted content
CN101355728A (en) * 2008-05-06 2009-01-28 中国移动通信集团江苏有限公司 SMS vitality system and its judgment method
CN101572870A (en) * 2008-05-03 2009-11-04 祁勇 Method for monitoring junk information in communication network
WO2010145403A1 (en) * 2009-10-30 2010-12-23 中兴通讯股份有限公司 Method, system, control console and management machine for determining spam messages
CN101977360A (en) * 2010-09-30 2011-02-16 北京新媒传信科技有限公司 Junk short message filter method

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104219672A (en) * 2014-10-14 2014-12-17 北京奇虎科技有限公司 Incoming call or message identification method and device
CN104219672B (en) * 2014-10-14 2017-08-22 北京奇虎科技有限公司 Incoming call or short message recognition methods and device
CN106162584A (en) * 2015-01-27 2016-11-23 北京奇虎科技有限公司 Identify the method for refuse messages, client, cloud server and system
CN106162584B (en) * 2015-01-27 2020-04-24 北京奇虎科技有限公司 Method, client, cloud server and system for identifying spam messages
CN106454818A (en) * 2015-08-06 2017-02-22 中国移动通信集团四川有限公司 Data information service credit control method and data information service credit control device
CN106815200A (en) * 2015-11-30 2017-06-09 任子行网络技术股份有限公司 Objectionable text detection method and device based on keyword
CN114302351A (en) * 2022-03-09 2022-04-08 太平金融科技服务(上海)有限公司深圳分公司 Short message service processing method and device, computer equipment and storage medium
CN114302351B (en) * 2022-03-09 2022-06-17 太平金融科技服务(上海)有限公司深圳分公司 Short message service processing method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN102905236B (en) 2016-08-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160817