CN107707516A - A kind of IP address analysis method and system - Google Patents
- Publication number
- CN107707516A, CN201710216069.1A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H04L63/0236—Filtering by address, protocol, port number or service, e.g. IP-address or URL
Description
Technical field
The invention relates to the field of Internet user profiling, and in particular to an IP address analysis method.
Background
A network-layer firewall can be regarded as an IP packet filter that operates on the underlying TCP/IP protocol stack. Rules can be enumerated so that only packets matching specific rules are allowed through and all other packets are blocked (viruses are an exception: a firewall cannot prevent virus intrusion). These rules can usually be defined or modified by an administrator, although some firewall devices can only apply built-in rules.
Firewall rules can also be formulated more permissively: a packet is allowed through as long as it does not match any "deny rule". Most operating systems and network devices have built-in firewall functions.
Newer firewalls can filter on a variety of packet attributes, such as source IP address, source port number, destination IP address or port number, and service type (such as HTTP or FTP). They can also filter by communication protocol, TTL value, and the domain name or network segment of the source.
Existing interception schemes rely on a single fixed rule. They analyze access requests at insufficient granularity and along too few dimensions, know little about the identity of the visitor, and are therefore prone to false blocking.
Summary of the invention
The present invention aims to solve the problems described above. One object of the invention is to provide a solution to any of the above problems. Specifically, the invention provides the ability to ...
According to a first aspect of the invention, an IP address analysis method is provided, comprising:
collecting historical data of IP addresses;
analyzing the historical data of the IP addresses to generate credit data for the IP addresses.
Wherein the step of collecting historical data of IP addresses comprises:
collecting and parsing raw logs within a first period;
formatting the information in the raw logs to obtain predefined indicators, the predefined indicators including at least one or more of the following:
time, IP, number of requests in the working period, number of requests in the rest period, number of requests in the sleep period, requested file size in the working period, requested file size in the rest period, requested file size in the sleep period, number of UserAgents in the working period, number of UserAgents in the rest period, number of UserAgents in the sleep period, number of mobile UserAgents, number of PC UserAgents, number of access sources, number of domains accessed, number of hours of appearance;
storing the predefined indicators for each IP address, each IP address corresponding to one or more predefined indicators.
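The collection step above can be sketched as a per-IP aggregation over one day of parsed log entries. This is only an illustrative sketch: the hour boundaries in `period_of`, the `is_mobile` heuristic, the entry field names, and the output key names are all assumptions introduced here, since the source does not specify them.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical period boundaries; the source does not say which hours
# count as the working, rest, or sleep period.
def period_of(hour):
    if 9 <= hour < 18:
        return "work"
    if 18 <= hour < 24:
        return "rest"
    return "sleep"

# Hypothetical and deliberately crude mobile/PC classifier.
def is_mobile(user_agent):
    return "Mobile" in user_agent

def daily_indicators(day, entries):
    """Aggregate one first period (one day) of parsed log entries per IP.

    Each entry is a dict with the assumed keys: ip, time (datetime),
    size (bytes), user_agent, referer, host.
    """
    acc = defaultdict(lambda: {
        "requests": defaultdict(int), "size": defaultdict(int),
        "uas": defaultdict(set), "mobile": set(), "pc": set(),
        "referers": set(), "hosts": set(), "hours": set(),
    })
    for e in entries:
        a = acc[e["ip"]]
        p = period_of(e["time"].hour)
        a["requests"][p] += 1
        a["size"][p] += e["size"]
        a["uas"][p].add(e["user_agent"])
        bucket = "mobile" if is_mobile(e["user_agent"]) else "pc"
        a[bucket].add(e["user_agent"])
        a["referers"].add(e["referer"])
        a["hosts"].add(e["host"])
        a["hours"].add(e["time"].hour)  # each distinct hour counts once

    rows = {}
    for ip, a in acc.items():
        row = {"time": day, "ip": ip}
        for p in ("work", "rest", "sleep"):
            row[p + "_requests"] = a["requests"][p]
            row[p + "_size"] = a["size"][p]
            row[p + "_user_agents"] = len(a["uas"][p])
        row["mobile_user_agents"] = len(a["mobile"])
        row["pc_user_agents"] = len(a["pc"])
        row["sources"] = len(a["referers"])
        row["domains"] = len(a["hosts"])
        row["hours_seen"] = len(a["hours"])
        rows[ip] = row
    return rows
```

Keeping sets of UserAgents, sources, and hours during the pass and converting them to counts at the end is one simple way to get the distinct-count indicators in a single scan.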
Wherein the step of collecting historical data of IP addresses further comprises:
obtaining a third-party IP library and/or a third-party IP blacklist from a third-party platform one or more times within a second period.
Wherein analyzing the historical data of the IP addresses to generate credit data for the IP addresses comprises:
preprocessing and normalizing the predefined indicators of each first period within the second period that corresponds to a working day, to obtain an intermediate value for each working day, and preprocessing and normalizing the predefined indicators of each first period within the second period that corresponds to a rest day, to obtain an intermediate value for each rest day, the second period containing multiple first periods corresponding to working days and multiple first periods corresponding to rest days;
applying a weighted average to the working-day intermediate values to obtain working-day weighted means;
applying a weighted average or a maximum to the rest-day intermediate values to obtain rest-day weighted means or maxima;
calculating, from one or more working-day weighted means and one or more rest-day weighted means, the temporary specific indicators of the current second period, which include:
the probability that the IP is an office egress IP in this period, the probability that it is a home egress IP in this period, the probability that it is a real person in this period, the activity score for this period, and the headcount group for this period;
adjusting the temporary specific indicators of the current second period according to the final specific indicators of the previous second period and the third-party IP library and/or third-party IP blacklist, to obtain the final specific indicators of the current second period, which serve as the credit data of the IP address.
Wherein the step of preprocessing and normalizing the predefined indicators of the first periods corresponding to working days and to rest days within the second period, to obtain the working-day and rest-day intermediate values, comprises:
for each first period corresponding to a working day, computing the hours-of-appearance score, mobile UserAgent count score, PC UserAgent count score, rest-period vs. working-period request count score, rest-period vs. sleep-period request count score, and accessed-domain count score, and taking the weighted mean of these scores to obtain the working-day intermediate value of the home egress IP probability;
for each first period corresponding to a working day, computing the working-period request count score, working-period vs. rest-period request count score, rest-period vs. sleep-period request count score, PC UserAgent count score, and working-period vs. rest-period UserAgent count score, and taking the weighted mean of these scores to obtain the working-day intermediate value of the office egress IP probability;
for each first period corresponding to a working day, computing the request count distribution score, UserAgent count distribution score, hours-of-appearance score, domain count vs. source count score, and mobile vs. PC UserAgent count score, and taking the weighted mean of these scores to obtain the working-day intermediate value of the real-person probability;
for each first period corresponding to a working day, computing the accessed-domain count score, working-period request count score, rest-period request count score, sleep-period request count score, hours-of-appearance score, and request source count score, and taking the weighted mean of these scores to obtain the working-day intermediate value of activity;
for each first period corresponding to a rest day, computing the hours-of-appearance score, mobile UserAgent count score, PC UserAgent count score, rest-period vs. working-period request count score, rest-period vs. sleep-period request count score, and accessed-domain count score, and taking the weighted mean of these scores to obtain the rest-day intermediate value of the home egress IP probability;
for each first period corresponding to a rest day, computing the rest-period request count score, UserAgent count distribution score, hours-of-appearance score, domain count vs. source count score, and mobile vs. PC UserAgent count score, and taking the weighted mean of these scores to obtain the rest-day intermediate value of the real-person probability;
for each first period corresponding to a rest day, computing the accessed-domain count score, working-period request count score, rest-period request count score, sleep-period request count score, hours-of-appearance score, and request source count score, and taking the weighted mean of these scores to obtain the rest-day intermediate value of activity.
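Each intermediate value above is a weighted mean of normalized scores. A minimal sketch of that pattern follows, assuming min-max normalization and the weights in `HOME_WEIGHTS`; the source names the scores but gives neither the normalization method nor any weights, so both are hypothetical.

```python
def normalize(value, low, high):
    """Min-max scale a raw indicator into [0, 1], clamping out-of-range input.

    Min-max scaling is an assumption; the source only says "normalize".
    """
    if high <= low:
        raise ValueError("high must exceed low")
    return min(1.0, max(0.0, (value - low) / (high - low)))

def weighted_mean(scores, weights):
    """Arithmetic weighted mean over the score names present in `scores`."""
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * weights[name] for name in scores) / total

# Hypothetical weights for the six scores that feed the working-day
# intermediate value of the home egress IP probability.
HOME_WEIGHTS = {
    "hours_seen": 1.0,
    "mobile_user_agents": 2.0,
    "pc_user_agents": 1.0,
    "rest_vs_work_requests": 3.0,
    "rest_vs_sleep_requests": 2.0,
    "domains": 1.0,
}

def home_intermediate(scores):
    """Combine already-normalized scores into one intermediate value."""
    return weighted_mean(scores, HOME_WEIGHTS)
```

The other intermediate values (office egress probability, real-person probability, activity) would follow the same pattern with their own score sets and weight tables.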
Wherein the step of calculating the temporary specific indicators of the current second period from one or more working-day weighted means and one or more rest-day weighted means comprises:
after preprocessing and normalization, obtaining the working-day vs. rest-day PC UserAgent count score, the working-day vs. rest-day mobile UserAgent count score, and the working-day vs. rest-day request count score; taking the weighted mean of these three scores, and then taking the weighted mean of that result and the working-day intermediate value of the office egress IP probability as the office egress IP probability for this period;
taking the weighted mean of the working-day and rest-day intermediate values of the home egress IP probability as the home egress IP probability for this period;
taking the weighted mean of the working-day and rest-day intermediate values of the real-person probability as the real-person probability for this period;
taking the weighted mean of the working-day and rest-day intermediate values of activity as the activity score for this period;
grouping by the maximum of the working-day mobile UserAgent count, the rest-day mobile UserAgent count, the working-day PC UserAgent count, and the rest-day PC UserAgent count, and using the result as the headcount group for this period.
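The combination of working-day and rest-day values described above might look like the following sketch. The equal weights in `blend`, the equal-weight average of the three cross-day scores, and the group boundaries in `GROUP_BOUNDS` are assumptions for illustration; the source specifies none of them.

```python
import bisect

# Hypothetical group boundaries: group 0 is a single UserAgent, group 1 is
# 2-5, group 2 is 6-20, group 3 is 21-100, group 4 is more than 100.
GROUP_BOUNDS = [2, 6, 21, 101]

def blend(a, b, weight_a=0.5):
    """Two-value weighted mean; the default equal weighting is an assumption."""
    return weight_a * a + (1.0 - weight_a) * b

def headcount_group(work_mobile, rest_mobile, work_pc, rest_pc):
    """Group index chosen from the largest of the four UserAgent counts."""
    largest = max(work_mobile, rest_mobile, work_pc, rest_pc)
    return bisect.bisect_right(GROUP_BOUNDS, largest)

def cycle_indicators(work, rest, cross_scores, ua_counts):
    """Temporary specific indicators for the current second period.

    work / rest: dicts of weighted means for working days and rest days.
    cross_scores: the three working-day vs. rest-day scores (PC UserAgents,
    mobile UserAgents, requests), already normalized.
    ua_counts: (work_mobile, rest_mobile, work_pc, rest_pc).
    """
    cross = sum(cross_scores) / len(cross_scores)  # equal weights assumed
    return {
        "office": blend(cross, work["office"]),
        "home": blend(work["home"], rest["home"]),
        "human": blend(work["human"], rest["human"]),
        "activity": blend(work["activity"], rest["activity"]),
        "group": headcount_group(*ua_counts),
    }
```

Note that only the working-day office value enters the office probability, matching the text, while the other three indicators blend a working-day and a rest-day value.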
Wherein the final specific indicators include at least one or more of the following:
IP, IPInt, update ID, update count of this IP, final headcount group, final sum of office egress IP probabilities, final sum of home egress IP probabilities, final sum of real-person probabilities, final sum of activity scores,
where "IPInt" is the long integer corresponding to the IP address, "update ID" is the number of times the final specific indicators of the second period have been updated, and "update count of this IP" is the number of times the final specific indicators of the second period have been updated for a given IP address.
The step of adjusting the temporary specific indicators of the current second period according to the final specific indicators of the previous second period and the third-party IP library and/or third-party IP blacklist, to obtain the final specific indicators of the current second period, comprises:
for an IP address that appears both in the final specific indicators of the previous second period and in the temporary specific indicators of the current second period, obtaining the final specific indicators of the current second period as follows:
when the headcount group in the temporary specific indicators of the current second period and the final headcount group in the final specific indicators of the previous second period are adjacent groups, selecting the group with the larger headcount as the final headcount group of the current second period, and otherwise selecting the headcount group of the temporary specific indicators of the current second period,
taking the office egress IP probability in the temporary specific indicators of the current second period plus the final sum of office egress IP probabilities in the final specific indicators of the previous second period as the final sum of office egress IP probabilities of the current second period,
taking the home egress IP probability in the temporary specific indicators of the current second period plus the final sum of home egress IP probabilities in the final specific indicators of the previous second period as the final sum of home egress IP probabilities of the current second period,
taking the real-person probability in the temporary specific indicators of the current second period plus the final sum of real-person probabilities in the final specific indicators of the previous second period as the final sum of real-person probabilities of the current second period,
adding the activity score in the temporary specific indicators of the current second period to the final sum of activity scores of the previous second period, dividing by the final update count recorded by the update ID, and multiplying by the update count of this IP, to give the final sum of activity scores of the current second period;
for an IP address that does not appear in the final specific indicators of the previous second period but does appear in the temporary specific indicators of the current second period, obtaining the final specific indicators of the current second period as follows:
taking the headcount group in the temporary specific indicators of the current second period as the final headcount group of the current second period,
taking the office egress IP probability in the temporary specific indicators of the current second period as the final sum of office egress IP probabilities of the current second period,
taking the home egress IP probability in the temporary specific indicators of the current second period as the final sum of home egress IP probabilities of the current second period,
taking the real-person probability in the temporary specific indicators of the current second period as the final sum of real-person probabilities of the current second period,
dividing the activity score in the temporary specific indicators of the current second period by the update ID and multiplying by the update count of this IP, to give the final sum of activity scores of the current second period;
overwriting the final specific indicators of the previous second period with the final specific indicators of the current second period, and recording the number of times the final specific indicators of the second period have been updated and the number of times the corresponding IP address has been updated.
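The update rules above can be sketched as a single merge function. The source does not state whether the update ID and the per-IP update count are read before or after being incremented; this sketch assumes both already include the current run, and the class and field names are hypothetical. `ip_int` illustrates the IPInt field for IPv4 addresses.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

@dataclass
class Temp:
    """Temporary specific indicators for one IP in the current second period."""
    group: int
    office: float
    home: float
    human: float
    activity: float

@dataclass
class Final:
    """Final specific indicators kept per IP across second periods."""
    ip: str
    ip_int: int
    update_id: int
    ip_updates: int
    group: int
    office_sum: float
    home_sum: float
    human_sum: float
    activity_sum: float

def merge(ip: str, temp: Temp, prev: Optional[Final], update_id: int) -> Final:
    """Fold the current period into the previous final indicators.

    update_id is the global update count including this run (assumption).
    """
    ip_int = int(ipaddress.ip_address(ip))
    if prev is None:
        # IP seen for the first time: temporary values become the sums.
        ip_updates = 1
        return Final(ip, ip_int, update_id, ip_updates, temp.group,
                     temp.office, temp.home, temp.human,
                     temp.activity / update_id * ip_updates)
    ip_updates = prev.ip_updates + 1
    # Adjacent groups: keep the larger one; otherwise trust the current one.
    if abs(temp.group - prev.group) == 1:
        group = max(temp.group, prev.group)
    else:
        group = temp.group
    return Final(ip, ip_int, update_id, ip_updates, group,
                 temp.office + prev.office_sum,
                 temp.home + prev.home_sum,
                 temp.human + prev.human_sum,
                 (temp.activity + prev.activity_sum) / update_id * ip_updates)
```

Scaling the activity sum by the ratio of per-IP updates to global updates penalizes IPs that appear in only a few second periods, which seems to be the intent of that rule.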
Wherein the step of adjusting the temporary specific indicators of the current second period according to the final specific indicators of the previous second period and the third-party IP library and/or third-party IP blacklist, to obtain the final specific indicators of the current second period, further comprises:
filtering out data in the temporary specific indicators of the current second period whose IP is syntactically invalid or is a LAN IP;
adjusting the office egress IP probability, home egress IP probability, and real-person probability in the temporary specific indicators of the current second period according to the additional IP address information contained in the third-party IP library;
generating IP credit taint data from the third-party IP blacklist, and adding the IP credit taint data to the final specific indicators of the current second period.
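The filtering and blacklist steps above could be sketched with Python's standard `ipaddress` module. The `tainted` flag and the function names are hypothetical; the source does not describe the format of the taint data. The adjustment from the third-party IP library is left out because the source gives no rule for it.

```python
import ipaddress

def is_public_ip(text):
    """True if `text` parses as an IP address and is not a private address.

    Note: is_private covers more than LAN ranges (for example loopback and
    other reserved blocks), which is slightly broader than the source's
    "LAN IP" wording.
    """
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return False  # syntactically invalid
    return not addr.is_private

def apply_filters(temp_indicators, blacklist):
    """Drop invalid or private IPs and mark blacklisted ones as tainted.

    temp_indicators: dict mapping IP string to its indicator dict.
    blacklist: set of IP strings from a third-party blacklist.
    """
    cleaned = {}
    for ip, row in temp_indicators.items():
        if not is_public_ip(ip):
            continue
        cleaned[ip] = dict(row, tainted=ip in blacklist)
    return cleaned
```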
Wherein the method further comprises:
providing an interface to third parties through which the credit data of the IP addresses can be accessed; or,
receiving an IP verification request for an IP address from a third party, looking up the credit data corresponding to that IP address, evaluating the credit level of the IP address according to the credit data, and returning the evaluation result to the third party.
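The verification request flow above might be handled as in the sketch below. The source does not define how a credit level is derived from the credit data, so the thresholds, the averaging of the real-person probability sum over the per-IP update count, the level names, and the record keys are all assumptions for illustration.

```python
def credit_level(record):
    """Map one IP's credit data to a coarse level (hypothetical rule)."""
    if record is None:
        return "unknown"      # no history for this IP
    if record.get("tainted"):
        return "low"          # listed on a third-party blacklist
    # Assumed interpretation: average the accumulated real-person
    # probability over the number of periods in which the IP was updated.
    mean_human = record["human_sum"] / record["ip_updates"]
    if mean_human >= 0.7:
        return "high"
    if mean_human >= 0.4:
        return "medium"
    return "low"

def handle_verification(ip, store):
    """Look up credit data for `ip` in `store` (a dict) and return a result."""
    return {"ip": ip, "level": credit_level(store.get(ip))}
```

A caller such as a firewall could then block only "low" addresses instead of applying a single fixed rule to every request.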
According to another aspect of the invention, an IP address analysis system is also provided, comprising a big data platform and an offline computing platform;
the big data platform is used to store raw logs, process the raw logs, and collect and store historical data of IP addresses;
the offline computing platform is used to analyze the historical data of IP addresses collected by the big data platform and generate credit data for the IP addresses.
The method and system of the invention collect historical data of IP addresses, analyze that data, and generate credit data for the IP addresses. This yields a fine-grained and accurate analysis of IP addresses, determines IP address attributes from big data, and gives a comprehensive and accurate picture of an IP address's credit standing. It can be applied to scenarios such as IP address legitimacy verification and IP address interception, addresses the problem of mistaken assessments of IP address security, and effectively prevents misjudgment of IP address legitimacy and false blocking of IP addresses.
Other features and advantages of the invention will become clear from the following description of exemplary embodiments, read with reference to the accompanying drawings.
Description of drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain its principles. In the drawings, like reference numerals denote like elements. The drawings described below show some, but not all, embodiments of the invention. Those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 exemplarily shows the flow of an IP address analysis method provided by Embodiment 1 of the invention;
FIG. 2 exemplarily shows the application principle of the technical solution provided by the embodiments of the invention;
FIG. 3 exemplarily shows the architecture of an IP address analysis system provided by Embodiment 2 of the invention.
Detailed description
To make the purpose, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention. Where no conflict arises, the embodiments in this application and the features in those embodiments may be combined with one another arbitrarily.
Existing interception schemes rely on a single fixed rule. They analyze access requests at insufficient granularity and along too few dimensions, know little about the identity of the visitor, and are therefore prone to false blocking.
To solve the above problems, an embodiment of the invention provides an IP address analysis method. Embodiment 1 of the invention is described below with reference to the accompanying drawings.
The embodiment provides an IP address analysis method that analyzes an IP address in detail based on its past access history, related additional information, and other data, obtains credit data for the IP address, and evaluates the IP address according to that credit data. This gives a fine-grained and accurate analysis and judgment of IP addresses, rates their security level using credit data, and effectively prevents misjudgment and false blocking. The specific flow is shown in FIG. 1 and includes:
Step 101: collect historical data of IP addresses;
In this step, raw logs such as CDN server logs are collected and parsed once per first period, and the information in the CDN server logs is formatted to obtain the predefined indicators. The CDN server logs are specifically CDN nginx logs; other network access logs can also be used for the calculation.
The predefined indicators include at least one or more of the following:
time, IP, number of requests in the working period, number of requests in the rest period, number of requests in the sleep period, requested file size in the working period, requested file size in the rest period, requested file size in the sleep period, number of UserAgents in the working period, number of UserAgents in the rest period, number of UserAgents in the sleep period, number of mobile UserAgents, number of PC UserAgents, number of access sources, number of domains accessed, number of hours of appearance,
where the first period contains a working period, a rest period, and a sleep period, and:
"time" is the time at which the corresponding CDN server log was generated (that is, the user access time);
"IP" is the IP address involved;
"number of requests in the working period", "number of requests in the rest period", and "number of requests in the sleep period" are the numbers of requests sent by the IP address within the working period, rest period, and sleep period respectively;
"requested file size in the working period", "requested file size in the rest period", and "requested file size in the sleep period" are the total sizes of the files requested by the IP address within the working period, rest period, and sleep period respectively;
"number of UserAgents in the working period", "number of UserAgents in the rest period", and "number of UserAgents in the sleep period" are the numbers of UserAgents that appear under the IP address within the working period, rest period, and sleep period respectively;
"number of mobile UserAgents" is the number of UserAgents with which the IP address accessed from mobile devices;
"number of PC UserAgents" is the number of UserAgents with which the IP address accessed from PCs;
"number of access sources" is the number of access sources of the IP address within the first period;
"number of domains accessed" is the number of domains accessed by the IP address within the first period;
"number of hours of appearance" is the number of hours within the first period in which access from the IP address occurred (each hour with access from the IP address counts as one hour of appearance; for example, if the IP accessed at both 2 o'clock and 4 o'clock, the value is 2).
The predefined indicators for each IP address are stored, each IP address corresponding to one or more predefined indicators. Specifically, the predefined indicators can be stored in a hive table.
The first period used in this step is preferably one calendar day (24 hours).
Preferably, historical data of IP addresses can also be obtained from third-party platforms, for example by obtaining a third-party IP library and/or a third-party IP blacklist from a third-party platform one or more times within the second period. A third-party IP library often contains additional information about IP addresses (such as the distribution of IP addresses or address segments, that is, the country, province, city, and operator corresponding to the segment; it may also give a company name or mark the segment as a data center). Third-party platform data is updated irregularly, so it can be fetched after the platform's data has been updated, or just before the calculation for the second period is run.
Step 102: analyze the historical data of the IP address to generate credit data for the IP address.
In this step, the predefined indicators of the first periods corresponding to workdays within the second period are preprocessed and normalized to obtain an intermediate value for each workday, and the predefined indicators of the first periods corresponding to rest days within the second period are preprocessed and normalized to obtain an intermediate value for each rest day. The second period contains multiple first periods corresponding to workdays and multiple first periods corresponding to rest days. The workday intermediate values are combined by weighted averaging (arithmetic or geometric) to obtain workday weighted means; the rest-day intermediate values are combined by weighted averaging or by taking the maximum to obtain rest-day weighted means or maxima. From one or more workday weighted means and one or more rest-day weighted means, the temporary specific indicators of the current second period are calculated. These include: the probability of being an office egress IP in this period, the probability of being a home egress IP in this period, the real-person probability in this period, the activity score in this period, and the headcount group in this period. The temporary specific indicators of the current second period are then adjusted according to the final specific indicators of the previous second period and the third-party IP library and/or third-party IP blacklist, yielding the final specific indicators of the current second period, which serve as the credit data of the IP address.
The second period is an integer multiple of the first period. Preferably, when the first period is a day, the second period is a month or a week.
A specific example of the algorithm for this step follows. Here the first period is 1 day and the second period is 1 month. The normalization algorithm used in this embodiment is a shifted sigmoid function, 1.0/(1.0+math.exp(-numerator/denominator+4.0)). The 4.0 is added because the input values are all greater than or equal to 0, so that a ratio of 0 maps to a value near 0 rather than to 0.5.
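The normalization formula above can be written as a small helper. This is a minimal sketch: the function name `normalize` and the handling of a zero denominator are assumptions, since the text gives only the expression itself.

```python
import math


def normalize(numerator: float, denominator: float) -> float:
    """Shifted sigmoid from the text: 1.0 / (1.0 + exp(-numerator/denominator + 4.0))."""
    if denominator == 0:
        # Assumption: the text does not say what happens for a zero denominator.
        # Here a positive numerator is treated as an unbounded ratio, otherwise 0.
        ratio = float("inf") if numerator > 0 else 0.0
    else:
        ratio = numerator / denominator
    return 1.0 / (1.0 + math.exp(-ratio + 4.0))


if __name__ == "__main__":
    # A ratio of 0 lands near 0, a ratio of 4 lands exactly on 0.5.
    print(normalize(0, 1), normalize(4, 1), normalize(12, 1))
```

With non-negative inputs the output stays in the open interval between roughly 0.018 and 1, and the midpoint 0.5 is reached when the numerator is four times the denominator.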
1. Workday data summary:
a) Compute the intermediate values of each workday's data
In this step, with the IP as the dimension, the intermediate values of the number of mobile UserAgents, the number of PC UserAgents, and the number of requests are calculated for each first period corresponding to a workday.
Number of requests: working-period requests plus rest-period requests plus sleep-period requests.
(1) Intermediate value of the workday probability that the IP address is a home egress IP:
Occurrence-hours score: normalization algorithm; the numerator is 6 and the denominator is the number of occurrence hours;
Mobile UserAgent score: normalization algorithm; the numerator is the mean number of mobile UserAgents per IP over a period of time (for example 10, updated irregularly), and the denominator is the number of mobile UserAgents;
PC UserAgent score: normalization algorithm; the numerator is the mean number of PC UserAgents per IP over a period of time (for example 5, updated irregularly), and the denominator is the number of PC UserAgents;
Rest-period vs. working-period request score: normalization algorithm; the numerator is the rest-period request count divided by 4, and the denominator is the working-period request count divided by 12;
Rest-period vs. sleep-period request score: normalization algorithm; the numerator is the rest-period request count divided by 4, and the denominator is the sleep-period request count divided by 8;
Domain-count score: normalization algorithm; the numerator is the average daily number of domain names visited per IP over a period of time (updated irregularly), and the denominator is the number of domain names;
The weighted mean of the occurrence-hours score, mobile UserAgent score, PC UserAgent score, rest-period vs. working-period request score, rest-period vs. sleep-period request score, and domain-count score above is the intermediate value of the workday home egress IP probability.
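The six scores above and their weighted mean can be sketched as follows. The field names, the baseline values in the example, and the equal default weights are all assumptions; the text does not state the weights, and it gives 10 and 5 only as example baselines.

```python
import math


def normalize(numerator, denominator):
    """Shifted sigmoid from the text; a zero denominator is handled by assumption."""
    if denominator == 0:
        ratio = float("inf") if numerator > 0 else 0.0
    else:
        ratio = numerator / denominator
    return 1.0 / (1.0 + math.exp(-ratio + 4.0))


def weekday_home_probability(day, baseline, weights=None):
    """Intermediate home-egress probability for one workday (hypothetical field names)."""
    scores = [
        normalize(6, day["occurrence_hours"]),
        normalize(baseline["mean_mobile_ua"], day["mobile_ua"]),
        normalize(baseline["mean_pc_ua"], day["pc_ua"]),
        # Request counts are divided by 12, 4 and 8 as in the text.
        normalize(day["rest_requests"] / 4, day["work_requests"] / 12),
        normalize(day["rest_requests"] / 4, day["sleep_requests"] / 8),
        normalize(baseline["mean_domains"], day["domains"]),
    ]
    weights = weights or [1.0] * len(scores)  # assumption: equal weights
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


if __name__ == "__main__":
    day = {"occurrence_hours": 5, "mobile_ua": 3, "pc_ua": 1,
           "work_requests": 20, "rest_requests": 60, "sleep_requests": 4, "domains": 6}
    baseline = {"mean_mobile_ua": 10, "mean_pc_ua": 5, "mean_domains": 8}
    print(round(weekday_home_probability(day, baseline), 3))
```

Every score places a small observed count in the denominator, so an IP with few UserAgents, few active hours, and traffic concentrated in the rest period scores closer to 1.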
(2) Intermediate value of the workday probability of being an office egress IP:
Working-period request score: normalization algorithm; the numerator is the working-period request count divided by 12, and the denominator is the hourly average of working-period requests per IP over a period of time (updated irregularly);
Working-period vs. rest-period request score: preprocessing and normalization algorithm; the numerator is the working-period request count divided by 12, and the denominator is the rest-period request count divided by 4;
Rest-period vs. sleep-period request score: preprocessing and normalization algorithm; the numerator is the rest-period request count divided by 4, and the denominator is the sleep-period request count divided by 8;
PC UserAgent score: preprocessing and normalization algorithm; the numerator is the mean number of PC UserAgents per IP over a period of time (for example 10, updated irregularly), and the denominator is the number of PC UserAgents;
Working-period vs. rest-period UserAgent score: preprocessing and normalization algorithm; the numerator is the number of working-period UserAgents, and the denominator is the number of rest-period UserAgents;
The weighted mean of the working-period request score, working-period vs. rest-period request score, rest-period vs. sleep-period request score, PC UserAgent score, and working-period vs. rest-period UserAgent score above is the intermediate value of the workday office egress IP probability;
(3) Intermediate value of the workday real-person probability:
Request distribution score: preprocessing and normalization algorithm; the numerator is the standard deviation of the working-period request count divided by 12, the rest-period request count divided by 4, and the sleep-period request count divided by 8, and the denominator is 1;
UserAgent distribution score: normalization algorithm; the numerator is the standard deviation of the working-period, rest-period, and sleep-period UserAgent counts, and the denominator is 1;
Occurrence-hours score: normalization algorithm; the numerator is 6 and the denominator is the number of occurrence hours;
Domain-count vs. source-count score: normalization algorithm; the numerator is the number of sources and the denominator is the number of domain names;
Mobile vs. PC UserAgent score: normalization algorithm; the numerator is the number of mobile UserAgents and the denominator is the number of PC UserAgents.
The weighted mean of the above scores is the intermediate value of the workday real-person probability.
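The two distribution scores above feed a standard deviation into the normalization function with a denominator of 1. A minimal sketch follows; the text does not say whether the population or sample standard deviation is meant, so the use of `statistics.pstdev` is an assumption, as are the function names.

```python
import math
import statistics


def normalize(numerator, denominator):
    """Shifted sigmoid from the text; a zero denominator is handled by assumption."""
    if denominator == 0:
        ratio = float("inf") if numerator > 0 else 0.0
    else:
        ratio = numerator / denominator
    return 1.0 / (1.0 + math.exp(-ratio + 4.0))


def request_distribution_score(work_requests, rest_requests, sleep_requests):
    """Spread of the three scaled request counts across working, rest and sleep periods."""
    scaled = [work_requests / 12, rest_requests / 4, sleep_requests / 8]
    return normalize(statistics.pstdev(scaled), 1.0)  # assumption: population std dev


def useragent_distribution_score(work_ua, rest_ua, sleep_ua):
    """Spread of the raw UserAgent counts across the three periods."""
    return normalize(statistics.pstdev([work_ua, rest_ua, sleep_ua]), 1.0)


if __name__ == "__main__":
    print(request_distribution_score(120, 40, 80))  # identical scaled counts
    print(request_distribution_score(600, 80, 0))   # traffic concentrated in the daytime
```

Under this reading, traffic whose scaled counts are identical in all three periods gets the minimum score, while traffic that varies strongly between periods scores higher.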
(4) Intermediate value of the workday activity:
Domain-count score: normalization algorithm; the numerator is the number of domain names visited and the denominator is 10;
Working-period request score: normalization algorithm; the numerator is the working-period request count divided by 12, and the denominator is the hourly average of working-period requests per IP over a period of time (updated irregularly);
Rest-period request score: normalization algorithm; the numerator is the rest-period request count divided by 4, and the denominator is the hourly average of rest-period requests per IP over a period of time (updated irregularly);
Sleep-period request score: normalization algorithm; the numerator is the sleep-period request count divided by 8, and the denominator is the hourly average of sleep-period requests per IP over a period of time (updated irregularly);
Occurrence-hours score: normalization algorithm; the numerator is the number of occurrence hours and the denominator is 6;
Request-source score: normalization algorithm; the numerator is the number of request sources, and the denominator is the average daily number of request sources per IP (updated irregularly).
The weighted mean of all the above scores is the intermediate value of the workday activity.
b) Compute the weighted means of the workday daily intermediate values:
With the IP as the dimension, calculate over the first periods corresponding to workdays the weighted mean of the number of mobile UserAgents, the weighted mean of the number of PC UserAgents, the weighted mean of the number of requests, the weighted mean of the workday home egress IP probability, the weighted mean of the workday office egress IP probability, the weighted mean of the workday real-person probability, and the weighted mean of the workday activity.
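The method allows either an arithmetic or a geometric weighted average when combining daily intermediate values. Both are sketched below; the function names and the example weights are assumptions, since the text does not say how the per-day weights are chosen.

```python
import math


def weighted_arithmetic_mean(values, weights):
    """sum(w_i * v_i) / sum(w_i)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_geometric_mean(values, weights):
    """exp(sum(w_i * ln v_i) / sum(w_i)); requires strictly positive values."""
    return math.exp(sum(w * math.log(v) for v, w in zip(values, weights)) / sum(weights))


if __name__ == "__main__":
    daily_home_probability = [0.62, 0.58, 0.71, 0.66, 0.60]  # one value per workday
    recency_weights = [1, 1, 2, 2, 3]                        # hypothetical weights
    print(weighted_arithmetic_mean(daily_home_probability, recency_weights))
    print(weighted_geometric_mean(daily_home_probability, recency_weights))
```

The geometric form is only defined for positive inputs. The normalized scores qualify because the sigmoid output is always greater than 0, but raw counts such as the number of requests can be 0 and would need the arithmetic form.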
2. Rest-day data summary:
a) Compute the intermediate values of each rest day's data
With the IP as the dimension, calculate the intermediate values of the number of mobile UserAgents, the number of PC UserAgents, and the number of requests for each first period corresponding to a rest day.
Number of requests: working-period requests plus rest-period requests plus sleep-period requests.
(1) Intermediate value of the rest-day home egress IP probability: analogous to the workday algorithm;
(2) Intermediate value of the rest-day real-person probability: analogous to the workday algorithm;
(3) Intermediate value of the rest-day activity: analogous to the workday algorithm.
b) Compute the weighted means or maxima of the rest-day daily intermediate values:
With the IP as the dimension, calculate over the first periods corresponding to rest days the maximum number of mobile UserAgents, the maximum number of PC UserAgents, the weighted mean of the number of requests, the weighted mean of the rest-day home egress IP probability, the weighted mean of the rest-day real-person probability, and the weighted mean of the rest-day activity.
3. Combine the workday and rest-day data to obtain the temporary specific indicators of the current second period:
The workday and rest-day data are joined by IP and the following are calculated:
IPInt: the IP converted to the corresponding long integer.
Headcount group for this period: the workday PC UserAgent count is the weighted mean of the PC UserAgent counts of the first periods corresponding to workdays. The workday mobile UserAgent count is the weighted mean of the mobile UserAgent counts of the first periods corresponding to workdays. The rest-day PC UserAgent count is the weighted mean of the PC UserAgent counts of the first periods corresponding to rest days. The rest-day mobile UserAgent count is the weighted mean of the mobile UserAgent counts of the first periods corresponding to rest days. The maximum of the workday PC, workday mobile, rest-day PC, and rest-day mobile UserAgent counts is taken and assigned to a group as follows: 1: 0-1, 2: 2-5, 3: 6-10, 4: 11-30, 5: 31-50, 6: 51-100, 7: 101-500, 8: 501-2000, 9: >2000.
Probability of being an office egress IP in this period:
Workday vs. rest-day PC UserAgent score: normalization algorithm; the numerator is the workday PC UserAgent count and the denominator is the rest-day PC UserAgent count;
Workday vs. rest-day mobile UserAgent score: normalization algorithm; the numerator is the workday mobile UserAgent count and the denominator is the rest-day mobile UserAgent count;
Workday vs. rest-day request score: normalization algorithm; the numerator is the workday request count and the denominator is the rest-day request count;
The weighted mean of the above three scores, combined by weighted mean with the intermediate value of the workday office egress IP probability, gives the office egress IP probability for this period.
Probability of being a home egress IP in this period: the weighted mean of the intermediate value of the workday home egress IP probability and the intermediate value of the rest-day home egress IP probability.
Real-person probability in this period: the weighted mean of the intermediate value of the workday real-person probability and the intermediate value of the rest-day real-person probability.
Activity score for this period: the weighted mean of the intermediate value of the workday activity and the intermediate value of the rest-day activity.
Headcount group for this period: grouped by the maximum of the workday mobile UserAgent count, the rest-day mobile UserAgent count, the workday PC UserAgent count, and the rest-day PC UserAgent count.
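The headcount grouping in this list amounts to a bucket lookup on the largest of the four UserAgent counts. A minimal sketch using `bisect`; the function name is an assumption, and so is the treatment of fractional counts (the inputs are weighted means), which the text does not address.

```python
import bisect

# Upper bounds of groups 1..8 from the text; anything above 2000 is group 9.
_GROUP_UPPER_BOUNDS = [1, 5, 10, 30, 50, 100, 500, 2000]


def headcount_group(weekday_pc, weekday_mobile, restday_pc, restday_mobile):
    """Map the largest of the four UserAgent counts to a group number 1..9."""
    peak = max(weekday_pc, weekday_mobile, restday_pc, restday_mobile)
    # bisect_left keeps a value equal to an upper bound inside that group.
    # Assumption: a fractional value between two bounds falls into the higher group.
    return bisect.bisect_left(_GROUP_UPPER_BOUNDS, peak) + 1


if __name__ == "__main__":
    print(headcount_group(3, 12, 1, 7))      # peak 12 falls in 11-30
    print(headcount_group(0, 1, 0, 0))       # peak 1 falls in 0-1
    print(headcount_group(2500, 40, 10, 5))  # peak above 2000
```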
The temporary specific indicators of the current second period are stored in a MySQL temporary data table, and the adjustment stage begins.
Adjustment stage
The inputs to this stage are the temporary specific indicators of the current second period stored in MySQL, and the final specific indicators of the previous second period that were stored in MySQL during that period.
All IPs in the MySQL temporary data table are traversed; each IP has corresponding temporary specific indicators for the current second period.
1. Filter out records whose IP is syntactically invalid, and filter out LAN IPs.
2. Obtain the third-party IP library information and check whether the additional-information string contained in the third-party IP library includes any of the following sensitive strings, returning the corresponding adjustment indices: "公司" (company), "数据中心" (data center), "GSM/TD-SCDMA/LTE". The adjustment indices cover three probabilities: the real-person probability, the office egress IP probability, and the home egress IP probability. If no sensitive string is present, all three adjustment indices are 1. Each of the three probabilities is multiplied by its adjustment index, and the result is constrained to the range [0.05, 0.95]: a value below 0.05 is returned as 0.05, and a value above 0.95 is returned as 0.95.
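Steps 1 and 2 above can be sketched with the standard `ipaddress` module. The numeric adjustment indices in `ADJUSTMENTS` are purely hypothetical placeholders, because the text names the sensitive strings but not their values. Treating every address that `is_private` reports as a LAN IP, and stopping at the first matching string, are also assumptions.

```python
import ipaddress

NEUTRAL = {"real_person": 1.0, "office": 1.0, "home": 1.0}

# Hypothetical adjustment indices; the source does not give the actual values.
ADJUSTMENTS = {
    "公司": {"real_person": 1.0, "office": 1.2, "home": 0.8},
    "数据中心": {"real_person": 0.5, "office": 0.8, "home": 0.5},
    "GSM/TD-SCDMA/LTE": {"real_person": 1.1, "office": 0.7, "home": 0.9},
}


def is_usable_ip(ip: str) -> bool:
    """Reject syntactically invalid addresses and private (LAN) addresses."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False
    return not addr.is_private


def clamp(p: float, low: float = 0.05, high: float = 0.95) -> float:
    """Constrain a probability to [0.05, 0.95] as the text requires."""
    return min(max(p, low), high)


def adjust(probabilities: dict, extra_info: str) -> dict:
    """Multiply each probability by its adjustment index, then clamp."""
    factors = NEUTRAL
    for keyword, candidate in ADJUSTMENTS.items():
        if keyword in extra_info:
            factors = candidate  # assumption: the first matching string wins
            break
    return {name: clamp(probabilities[name] * factors[name]) for name in NEUTRAL}


if __name__ == "__main__":
    print(is_usable_ip("192.168.1.10"), is_usable_ip("8.8.8.8"))
    print(adjust({"real_person": 0.8, "office": 0.4, "home": 0.3}, "某数据中心"))
```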
3. Obtain the final specific indicators stored in MySQL in the previous second period.
For a given IP address, the way the final specific indicators of the current second period are generated depends on whether the IP address has final specific indicators from the previous second period, as follows:
a) If final specific indicators of the previous second period exist for this IP, the following indicators are handled accordingly:
Update ID: the ordinal number of this update of the second-period data.
IPInt: the IP converted to the corresponding long integer.
Update count for this IP: the update count for this IP in the previous second period plus 1.
Final headcount group: if the headcount group in the temporary specific indicators of the current second period and the final headcount group in the final specific indicators of the previous second period are adjacent groups, the final group is the one with the larger headcount; otherwise the final group is the group from the temporary data.
Sum of final office egress IP probabilities: the office egress IP probability in the temporary specific indicators of the current second period plus the sum of final office egress IP probabilities in the final specific indicators of the previous second period.
Sum of final home egress IP probabilities: the home egress IP probability in the temporary specific indicators of the current second period plus the sum of final home egress IP probabilities in the final specific indicators of the previous second period.
Sum of final real-person probabilities: the real-person probability in the temporary specific indicators of the current second period plus the sum of final real-person probabilities in the final specific indicators of the previous second period.
Sum of final activity scores: the activity score in the temporary specific indicators of the current second period plus the sum of final activity scores of the previous second period, divided by the total number of updates recorded by the update ID, and multiplied by the update count for this IP.
b) If this IP does not exist, the following indicators are handled accordingly:
IPInt: the IP converted to the corresponding long integer.
Update ID: the ordinal number of this update of the second-period data.
Update count for this IP: 1.
Final headcount group: the headcount group in the temporary specific indicators of the current second period.
Sum of final office egress IP probabilities: the office egress IP probability in the temporary specific indicators of the current second period.
Sum of final home egress IP probabilities: the home egress IP probability in the temporary specific indicators of the current second period.
Sum of final real-person probabilities: the real-person probability in the temporary specific indicators of the current second period.
Sum of final activity scores: the activity score in the temporary specific indicators of the current second period divided by the update ID and multiplied by the update count for this IP.
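The two cases above, an IP with and without final indicators from the previous second period, can be sketched as one merge function. All dictionary keys are hypothetical names. The activity formula follows the wording literally (sum, divide by the update ID, multiply by the IP's update count) and is one reading of the text rather than a confirmed implementation.

```python
def merge_group(current_group: int, previous_group: int) -> int:
    """Adjacent groups resolve to the larger one; otherwise the current group is kept."""
    if abs(current_group - previous_group) == 1:
        return max(current_group, previous_group)
    return current_group


def merge_indicators(current: dict, previous, update_id: int) -> dict:
    """Combine this period's temporary indicators with last period's final ones.

    `previous` is None when the IP has no final indicators from the previous period.
    """
    if previous is None:
        ip_updates = 1
        group = current["group"]
        prev = {"office_sum": 0.0, "home_sum": 0.0,
                "real_person_sum": 0.0, "activity_sum": 0.0}
    else:
        ip_updates = previous["ip_updates"] + 1
        group = merge_group(current["group"], previous["group"])
        prev = previous
    return {
        "update_id": update_id,
        "ip_updates": ip_updates,
        "group": group,
        "office_sum": current["office"] + prev["office_sum"],
        "home_sum": current["home"] + prev["home_sum"],
        "real_person_sum": current["real_person"] + prev["real_person_sum"],
        # Literal reading of the text: divide by the update ID, multiply by the IP's count.
        "activity_sum": (current["activity"] + prev["activity_sum"]) / update_id * ip_updates,
    }


if __name__ == "__main__":
    month1 = {"group": 3, "office": 0.6, "home": 0.2, "real_person": 0.7, "activity": 0.5}
    month2 = {"group": 4, "office": 0.3, "home": 0.1, "real_person": 0.9, "activity": 0.5}
    first = merge_indicators(month1, None, update_id=2)
    print(merge_indicators(month2, first, update_id=3))
```

Because the probabilities are stored as running sums, a consumer would presumably divide by the IP's update count to recover an average; the text at this point does not spell that step out.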
In addition, IP credit taint data may be generated from the third-party IP blacklists and added to the final specific indicators of the current second period.
Specifically, the information in a blacklist generally consists of the blacklisted IP addresses or IP address segments. Credit taint data is preferably expressed as a credit taint score: for example, the more third-party blacklists an IP address appears in, the higher its credit taint score, and an IP address that appears in no blacklist has a credit taint score of 0.
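One simple way to realize the credit taint score described above is to count the blacklists that contain the address, either as a single IP or inside a listed segment. This is a sketch of that example only; the function name, the CIDR notation for segments, and the choice of one point per list are assumptions.

```python
import ipaddress


def credit_taint_score(ip: str, blacklists) -> int:
    """Count how many blacklists contain the address (0 when it appears in none).

    Each blacklist is an iterable of single addresses or CIDR segments.
    """
    addr = ipaddress.ip_address(ip)
    score = 0
    for entries in blacklists:
        # A bare address parses as a /32 (or /128) network, so one check covers both.
        if any(addr in ipaddress.ip_network(entry, strict=False) for entry in entries):
            score += 1
    return score


if __name__ == "__main__":
    lists = [
        ["203.0.113.0/24"],
        ["203.0.113.7", "198.51.100.0/24"],
        ["192.0.2.1"],
    ]
    print(credit_taint_score("203.0.113.7", lists))
    print(credit_taint_score("8.8.8.8", lists))
```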
The final specific indicators of the previous second period are overwritten with the final specific indicators of the current second period, and the number of times the second-period final specific indicators have been updated and the number of updates for each IP address are recorded. All of the above data is written to MySQL.
When the temporary and final specific indicators of the current second period are stored in MySQL, the IP may be used as the index. Alternatively, the IP may be converted to its corresponding long integer, which is then used as the index, with the data divided evenly into 256 parts by IPInt and stored in separate tables to simplify queries and improve query speed.
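The IPInt conversion and the 256-way split can be sketched as follows. Splitting on the top 8 bits of the integer is one way to divide the IPv4 range into 256 equal parts and is an assumption here, as is the table name prefix `ip_final_indicators`.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 address to its integer (IPInt) form."""
    return int(ipaddress.IPv4Address(ip))


def shard_index(ip_int: int) -> int:
    """Split the 32-bit range into 256 equal parts of 2**24 addresses each."""
    return ip_int >> 24


def shard_table(ip: str, prefix: str = "ip_final_indicators") -> str:
    """Name of the table that would hold this IP (hypothetical naming scheme)."""
    return f"{prefix}_{shard_index(ip_to_int(ip)):03d}"


if __name__ == "__main__":
    print(ip_to_int("1.2.3.4"))
    print(shard_table("1.2.3.4"))
    print(shard_table("203.0.113.7"))
```

A lookup then touches only one of the 256 tables, and the integer index supports range queries over address segments.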
Once the final specific indicators updated for the current second period are obtained, the final specific indicators corresponding to an IP address serve as that IP address's credit data, and the IP address is evaluated on the basis of that credit data. An interface may be provided to third parties through which the credit data of IP addresses can be accessed. Alternatively, an IP verification request for an IP address may be received from a third party, the credit data corresponding to that IP address looked up, a credit rating of the IP address performed on the basis of the credit data, and the evaluation result returned to the third party.
The method may be applied to IP blocking by a firewall: the firewall judges the legitimacy of an IP address from its credit data. It may also be deployed as an independent IP credit rating platform that provides IP verification results to firewalls. A secondary verification mechanism may also be provided without affecting existing firewall functions: when the firewall judges an IP address to be suspicious, the IP credit rating platform performs a secondary verification based on the credit data, further improving the accuracy of firewall blocking and preventing false blocks.
The IP address analysis method provided by this embodiment of the present invention can be combined with existing Internet architectures. As shown in Figure 2, user access logs are collected as raw logs and combined with third-party IP blacklists and IP address libraries. Applying the method yields IP user attribute data composed mainly of the final specific indicators, IP taint data generated mainly from the third-party IP blacklists, and IP address library data obtained from the third-party IP address libraries. These three data sets are integrated into the IP credit rating platform, which rates the IP addresses and produces their credit data. The credit data describes the characteristics of an IP address comprehensively. It can be used in the information security field to confirm the safety of an IP address, or in IP-based user profiling, achieving an accurate depiction of IP addresses based on big data analysis. Application results can also be fed back to the IP credit rating platform for algorithm iteration and parameter adjustment on existing results, giving the system the ability to learn and adjust itself and further improving the precision of the platform's IP analysis.
Embodiment 2 of the present invention is described below with reference to the accompanying drawings.
This embodiment of the present invention provides an IP address analysis system whose structure is shown in Figure 3 and which includes:
a big data computing platform and an offline computing platform.
The big data platform includes a Hadoop computing platform and a Spark computing platform;
the offline computing platform includes a server or a server cluster.
The big data platform is used to store raw logs, process the raw logs, and collect and store historical data of IP addresses;
the offline computing platform is used to analyze the historical data of IP addresses collected by the big data platform and generate credit data for the IP addresses.
The offline computing platform can also communicate with third parties through the big data platform, providing the credit data of IP addresses to third parties, or receiving third-party query requests and returning auxiliary information for IP address verification based on the credit data.
Preferably, the IP address analysis system also includes a storage platform that supports MySQL and can be used to store the raw logs, the credit data of IP addresses, the third-party IP libraries and third-party IP blacklists obtained from third parties, the final specific indicators of the current period, the final specific indicators of the previous second period, the temporary specific indicators of the current second period, intermediate data produced during computation, and so on.
This embodiment of the present invention provides an IP address analysis system that can be combined with the IP address analysis method provided by the embodiments of the present invention. By collecting the historical data of IP addresses and analyzing it to generate credit data, the system achieves a refined and precise analysis of IP addresses, determines IP address attributes from big data, and gives a comprehensive and accurate picture of an IP address's credit status. It can be applied in scenarios such as IP address legitimacy verification and IP address blocking, solves the problem of misjudging IP address safety, and effectively prevents misjudgment of IP address legitimacy and false blocking of IP addresses.
The content described above may be implemented individually or combined in various ways, and all such variants fall within the scope of protection of the present invention.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710216069.1A CN107707516B (en) | 2017-04-01 | 2017-04-01 | A kind of IP address analysis method and system |
PCT/CN2018/079732 WO2018177167A1 (en) | 2017-04-01 | 2018-03-21 | Method for analyzing ip address, system, computer readable storage medium, and computer device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710216069.1A CN107707516B (en) | 2017-04-01 | 2017-04-01 | A kind of IP address analysis method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107707516A (en) | 2018-02-16 |
CN107707516B (en) | 2018-11-13 |
Family
ID=61169473
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710216069.1A Active CN107707516B (en) | 2017-04-01 | 2017-04-01 | A kind of IP address analysis method and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107707516B (en) |
WO (1) | WO2018177167A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018177167A1 (en) * | 2017-04-01 | 2018-10-04 | 贵州白山云科技有限公司 | Method for analyzing ip address, system, computer readable storage medium, and computer device |
CN108683531A (en) * | 2018-05-02 | 2018-10-19 | 百度在线网络技术(北京)有限公司 | Method and apparatus for handling log information |
CN109873811A (en) * | 2019-01-16 | 2019-06-11 | 光通天下网络科技股份有限公司 | Network safety protection method and its network security protection system based on attack IP portrait |
CN110401727A (en) * | 2018-04-24 | 2019-11-01 | 北京数安鑫云信息技术有限公司 | An IP address analysis method and device |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101014072A (en) * | 2007-02-15 | 2007-08-08 | 北京互联易通信息技术有限公司 | Method and apparatus for obtaining and analyzing data information aimed at data object |
CN101719824A (en) * | 2009-11-24 | 2010-06-02 | 北京信息科技大学 | Network behavior detection-based trust evaluation system and network behavior detection-based trust evaluation method |
US20130067062A1 (en) * | 2011-09-12 | 2013-03-14 | Microsoft Corporation | Correlation of Users to IP Address Lease Events |
CN103475637A (en) * | 2013-04-24 | 2013-12-25 | 携程计算机技术(上海)有限公司 | Network access control method and system based on IP access behaviors |
US8661119B1 (en) * | 2006-06-30 | 2014-02-25 | Google Inc. | Determining a number of users behind a set of one or more internet protocol (IP) addresses |
CN104506356A (en) * | 2014-12-24 | 2015-04-08 | 网易(杭州)网络有限公司 | Method and device for determining credibility of IP (Internet protocol) address |
CN104954188A (en) * | 2015-06-30 | 2015-09-30 | 北京奇虎科技有限公司 | Cloud based web log security analysis method, device and system |
CN105610616A (en) * | 2015-12-29 | 2016-05-25 | 赛尔网络有限公司 | Method and system for performing statistics to obtain average flow of single IP (Internet Protocol) of access network based on ICP (Internet Content Provider) activity |
CN106230890A (en) * | 2016-07-15 | 2016-12-14 | 中电长城网际系统应用有限公司 | A kind of message normalization processing method and system |
CN106254096A (en) * | 2016-07-21 | 2016-12-21 | 柳州龙辉科技有限公司 | A kind of processing means of Linux daily record |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1746916A (en) * | 2005-10-25 | 2006-03-15 | 二六三网络通信股份有限公司 | Evaluation Method of Network IP Address Reputation and Its Application in E-mail System |
US20150215334A1 (en) * | 2012-09-28 | 2015-07-30 | Level 3 Communications, Llc | Systems and methods for generating network threat intelligence |
US9319382B2 (en) * | 2014-07-14 | 2016-04-19 | Cautela Labs, Inc. | System, apparatus, and method for protecting a network using internet protocol reputation information |
CN107707516B (en) * | 2017-04-01 | 2018-11-13 | 贵州白山云科技有限公司 | A kind of IP address analysis method and system |
2017
- 2017-04-01 CN CN201710216069.1A patent/CN107707516B/en active Active
2018
- 2018-03-21 WO PCT/CN2018/079732 patent/WO2018177167A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
张洪, 等: "基于IP地址聚类的反垃圾邮件信誉系统", 《清华大学学报(自然科学版)》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018177167A1 (en) * | 2017-04-01 | 2018-10-04 | 贵州白山云科技有限公司 | Method for analyzing ip address, system, computer readable storage medium, and computer device |
CN110401727A (en) * | 2018-04-24 | 2019-11-01 | 北京数安鑫云信息技术有限公司 | An IP address analysis method and device |
CN110401727B (en) * | 2018-04-24 | 2022-04-19 | 北京数安鑫云信息技术有限公司 | A kind of IP address analysis method and device |
CN108683531A (en) * | 2018-05-02 | 2018-10-19 | 百度在线网络技术(北京)有限公司 | Method and apparatus for handling log information |
CN108683531B (en) * | 2018-05-02 | 2019-06-21 | 百度在线网络技术(北京)有限公司 | Method and apparatus for handling log information |
CN109873811A (en) * | 2019-01-16 | 2019-06-11 | 光通天下网络科技股份有限公司 | Network safety protection method and its network security protection system based on attack IP portrait |
Also Published As
Publication number | Publication date |
---|---|
CN107707516B (en) | 2018-11-13 |
WO2018177167A1 (en) | 2018-10-04 |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CP01 | Change in the name or title of a patent holder | Address after: 100015 5 floor, block E, 201 IT tower, electronic city, 10 Jiuxianqiao Road, Chaoyang District, Beijing. Patentee after: GUIZHOU BAISHANCLOUD TECHNOLOGY Co.,Ltd. Address before: 100015 5 floor, block E, 201 IT tower, electronic city, 10 Jiuxianqiao Road, Chaoyang District, Beijing. Patentee before: GUIZHOU BAISHANCLOUD TECHNOLOGY Co.,Ltd. |
TR01 | Transfer of patent right | Effective date of registration: 20181105. Address after: 100015 Beijing Chaoyang District Jiuxianqiao North Road 10 hospital 201 Building 5 floor 505 inside 02. Patentee after: BEIJING SHUAN XINYUN INFORMATION TECHNOLOGY Co.,Ltd. Address before: 100015 5 floor, block E, 201 IT tower, electronic city, 10 Jiuxianqiao Road, Chaoyang District, Beijing. Patentee before: GUIZHOU BAISHANCLOUD TECHNOLOGY Co.,Ltd. |
TR01 | Transfer of patent right | Effective date of registration: 20250718. Address after: 100015 Beijing City Chaoyang District Dongsanhuan Middle Road 20 Building 17th Floor 1709 Unit. Patentee after: BEIJING SHUJU XINYUN INFORMATION TECHNOLOGY Co.,Ltd. Country or region after: China. Address before: 100015 Beijing Chaoyang District Jiuxianqiao North Road 10 hospital 201 Building 5 floor 505 inside 02. Patentee before: BEIJING SHUAN XINYUN INFORMATION TECHNOLOGY Co.,Ltd. Country or region before: China |