
CN113824810B - A target-driven method for inferring IP address geolocation - Google Patents


Info

Publication number
CN113824810B
Authority
CN
China
Prior art keywords
node
target
address
anchor
anchor node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110964934.7A
Other languages
Chinese (zh)
Other versions
CN113824810A (en)
Inventor
温胜昔
季宇凯
王占丰
陈潇霆
陈嘉欣
马潇霄
张一杭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Lexbell Information Technology Co ltd
Original Assignee
Nanjing Lexbell Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Lexbell Information Technology Co ltd
Priority to CN202110964934.7A
Publication of CN113824810A
Application granted
Publication of CN113824810B
Legal status: Active
Anticipated expiration


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00: Indexing scheme associated with group H04L61/00
    • H04L2101/60: Types of network addresses
    • H04L2101/69: Types of network addresses using geographic information, e.g. room number

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a target-driven method for inferring the geographic location of an IP address. The method first collects anchor nodes based on the target IP, then screens and calibrates the anchor nodes, and finally probes the target IP comprehensively to build a target profile for high-precision positioning. The invention can effectively infer the usage information of an IP address and avoids the low efficiency of blindly collecting anchor nodes over a large range. With more effective anchor nodes, it further improves the accuracy of the comprehensive inference of IP usage information: by comparing the topology paths of the anchor nodes with the topology path of the IP under test, it infers the usage and location of that IP from the path-approximation principle and the geographic location information of the anchor nodes.

Description

A target-driven method for inferring IP address geolocation
Technical Field
The invention relates to the technical field of building-level positioning of target IPs, and in particular to a building-level IP positioning method based on a target-driven approach and comprehensive IP inference.
Background
Currently, a common way to position an IP is to estimate its geographic location from several kinds of information about the IP, such as paths, reference nodes, hosted service information, and Whois data. The basic design principle of a positioning algorithm is to reduce measurement overhead as much as possible while preserving positioning accuracy, and to scale well without requiring client support. The earliest positioning algorithms inferred the geographic location of an IP device by querying a DNS server or by mining information implicit in the hostname. In recent years, probability-based positioning algorithms have again become a research hotspot; they position a device by finding the distribution that relates delay to geographic distance. Because there are many IP positioning algorithms, they can be classified by different criteria, such as whether client support is needed or which positioning principle is used. Among existing algorithms, client-based positioning has the highest precision but usually relies on infrastructure such as GPS, cellular base stations, and Wi-Fi access points. Otherwise, the data is derived from analysis of Whois data, operator data, or network data, and neither positioning precision nor accuracy can be guaranteed, which greatly limits the applicability of IP positioning data. Although researchers have proposed many IP positioning algorithms, the lack of a large number of reference nodes prevents them from being deployed widely enough to obtain highly accurate results.
Disclosure of Invention
To overcome the problems in the prior art, the invention provides a target-driven IP address positioning algorithm that dynamically collects positioning anchor nodes around the positioning target and then infers the geographic location of the target IP by comparing path similarity. The method avoids large-scale network-space probing and anchor node collection, which reduces the overhead of network-space measurement while achieving high-precision positioning of the IP address; positioning precision can reach the neighborhood level, or even the building level. To achieve this technical purpose and effect, the invention is realized by the following technical scheme:
A target-driven method for inferring the geographic location of an IP address includes the following steps:
Step 1, anchor node information collection: acquire anchor nodes from the network segment of the target IP;
Step 2, anchor node information calibration: judge the validity and the node type of each anchor node according to its device type information;
Step 3, geographic location inference: infer the geographic location of the IP to be located by measuring the similarity between its path and the anchor node paths.
To address the fact that IP positioning algorithms lack a large number of reference nodes and cannot be deployed widely enough to obtain high-accuracy positioning, the invention provides a target-driven method for inferring IP address geolocation. Starting from the target IP address, the algorithm builds a set of valid anchor nodes in the network segment of the target IP and classifies the target IP by whether it responds to network measurement. For an IP that responds, it uses Traceroute to obtain the network path to the target node, selects the N IP addresses nearest to that node by path matching, and determines the geographic location of the target IP by the centroid method or the nearest-neighbor method. The positioning method achieves relatively high accuracy, down to the building level.
Drawings
FIG. 1 is a flow chart of a positioning algorithm of the present invention;
FIG. 2 is a schematic measurement diagram of the present invention;
Table 1 shows the active IPs and ports in the network segment where the target IP is located (probe results for 220.180.112.X: government and enterprise units on Du Zhonglu in the urban area of Bozhou City, Anhui Province);
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
As shown in FIG. 1, the target-driven algorithm of the invention for inferring the unit that uses an IP address first judges whether each active IP in the network segment where the target IP is located is a candidate anchor node, then judges from the device type whether each candidate is a valid anchor node and what server node type it is. Finally it measures the network path to the target node with Traceroute, selects the N IP addresses nearest to that node by path matching, and determines the geographic location of the target IP using the centroid method or the nearest-neighbor method. The detailed algorithm flow comprises the following steps:
Step one: probe the well-known ports and registered ports to find the active IP addresses in the class C network segment where the target IP address is located, then probe their location-related network services. If there is no valid anchor node in the class C segment where the target IP is located, search for anchor nodes in the two adjacent segments C-1 and C+1 until enough active anchor nodes are found. The open services found by probing a device are expressed as:
$P_i = \{P_1, P_2, P_3, P_4, \ldots, P_m\}$ (1)
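The segment expansion in step one can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `probe` callback (a hypothetical function that returns the anchor candidates found in one /24 segment), and the `min_anchors` and `max_offset` parameters are all assumptions introduced here. The source only names C-1 and C+1; extending the search further out is an optional generalization.

```python
import ipaddress
from typing import Callable, Iterator, List, Tuple

Anchor = Tuple[str, List[int]]  # (IP address, open ports P_i) -- illustrative shape


def candidate_segments(target_ip: str, max_offset: int = 1) -> Iterator[ipaddress.IPv4Network]:
    """Yield the target's /24 segment C, then C-1, C+1, (C-2, C+2, ...)."""
    home = ipaddress.ip_network(f"{target_ip}/24", strict=False)
    yield home
    base = int(home.network_address)
    for k in range(1, max_offset + 1):
        for sign in (-1, 1):
            start = base + sign * k * 256  # one /24 holds 256 addresses
            if 0 <= start <= 0xFFFFFF00:   # stay inside the IPv4 address space
                yield ipaddress.ip_network(f"{ipaddress.IPv4Address(start)}/24")


def collect_anchors(
    target_ip: str,
    probe: Callable[[ipaddress.IPv4Network], List[Anchor]],
    min_anchors: int = 5,
    max_offset: int = 1,
) -> List[Anchor]:
    """Probe segments in order until enough anchor candidates are found."""
    anchors: List[Anchor] = []
    for segment in candidate_segments(target_ip, max_offset):
        anchors.extend(probe(segment))  # `probe` is supplied by the caller
        if len(anchors) >= min_anchors:
            break
    return anchors


# Segments visited for an address in the example segment 220.180.112.X
print([str(s) for s in candidate_segments("220.180.112.37")])
```

Passing the prober as a callback keeps the search order separate from the port-scanning details, which the text does not specify.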
Step two: judge whether a node is a relay node from its position in the network. Every IP in the path library that is not located at the last hop of a path is regarded as a relay node; otherwise it is an end node. Each end node is then classified as a NAT gateway, a CDN node, or a server according to the device type classification produced by a Bayesian decision network; an end node classified as a server node is judged to be a valid anchor node.
where the probability that device Ei belongs to type ci is:
(2)
Assuming that the open services or ports of each network entity are independent, equation (2) can be expressed according to Bayesian theory as:
$P(M \mid c) = P(\{m_1, m_2, \ldots, m_n\} \mid c) = P(m_1 \mid c)\,P(m_2 \mid c) \cdots P(m_n \mid c)$ (3)
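The relay/end split and the independence assumption of equation (3) can be sketched as a small naive Bayes classifier. Equation (2) is not reproduced in this text, so the posterior used below (prior times likelihood, normalized over the types) is the standard Bayes rule and is an assumption about its form. The function names, the `floor` value for ports never seen with a type, and the way an IP that appears both mid-path and as a last hop is treated (as a relay) are also illustrative choices.

```python
import math
from typing import Dict, Iterable, List, Set, Tuple


def split_relay_and_end(paths: List[List[str]]) -> Tuple[Set[str], Set[str]]:
    """IPs seen before the last hop of any path are relays; the rest are end nodes."""
    relay = {ip for path in paths for ip in path[:-1]}
    end = {path[-1] for path in paths if path} - relay
    return relay, end


def classify_device(
    open_ports: Iterable[int],
    priors: Dict[str, float],
    likelihoods: Dict[str, Dict[int, float]],
    floor: float = 1e-6,
) -> Tuple[str, Dict[str, float]]:
    """Return the most probable device type and the posterior over all types.

    likelihoods[c][m] plays the role of P(m | c) in equation (3).
    """
    ports = list(open_ports)
    log_scores = {}
    for c, prior in priors.items():
        # log P(c) + sum_j log P(m_j | c): the product in equation (3), in log space
        log_scores[c] = math.log(prior) + sum(
            math.log(likelihoods[c].get(m, floor)) for m in ports
        )
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(weights.values())
    posterior = {c: w / total for c, w in weights.items()}
    return max(posterior, key=posterior.get), posterior
```

Under this sketch, an end node whose most probable type is the server class would be kept as a valid anchor node; the priors and per-port likelihoods would have to be estimated from labeled devices, which the text does not describe.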
The valid reference nodes are then clustered according to the network path similarity of the anchor nodes. If the geographic distance of cluster Ci is larger than the radius of a city, the cluster is judged to be a cloud node; otherwise it is an independent host.
$\mathrm{distance}(C_i) = \max\{\mathrm{distance}(\mathrm{loc}(IP_m), \mathrm{loc}(IP_n))\}$ (4)
where distance(Ai, Aj) is the distance between anchor nodes Ai and Aj, distance(Ci) is the distance of cluster Ci, and IPm, IPn ∈ Ci.
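Equation (4) takes the largest pairwise distance inside a cluster. A minimal sketch follows, assuming anchor locations are (latitude, longitude) pairs in degrees and that geographic distance means great-circle (haversine) distance; the text fixes neither the distance function nor the city radius, so `city_radius_km` is a caller-supplied assumption.

```python
import math
from itertools import combinations
from typing import Iterable, Tuple

LatLon = Tuple[float, float]
EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def cluster_distance_km(locations: Iterable[LatLon]) -> float:
    """distance(C_i): maximum pairwise distance over the cluster (0 if < 2 nodes)."""
    pairs = combinations(list(locations), 2)
    return max((haversine_km(a, b) for a, b in pairs), default=0.0)


def cluster_kind(locations: Iterable[LatLon], city_radius_km: float) -> str:
    """Label a cluster as a cloud node or an independent host."""
    if cluster_distance_km(locations) > city_radius_km:
        return "cloud_node"
    return "independent_host"
```

A cluster whose members share a network path but resolve to locations farther apart than one city is unlikely to be a single physical host, which is the intuition behind the threshold.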
In this example, all active IPs in the network segment 220.180.112.X/24 where the target IP is located are probed first; the active IPs found and the units they belong to are shown in Table 1.
TABLE 1
Step three: according to how the target IP address responds to network measurement, divide IP addresses into those that respond to measurement and those that do not. An IP address that does not respond can be replaced by the IP address of the last hop on its measurement path; for a node that responds, its fingerprint information is probed. If the target node is an independent host, its location is judged by network topology proximity: first measure the network path to the target node with Traceroute, then select the N IP addresses nearest to the node by path matching, and determine the geographic location of the target IP using the centroid method or the nearest-neighbor method.
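The path matching in step three can be sketched as below. The text does not define the similarity measure, so this sketch assumes one: the number of trailing router hops two paths share. A path here is the list of router hops before the destination, with `None` for a hop that did not answer; trimming trailing `None` entries is one way to let the last responsive hop stand in for a target that does not respond. All names are illustrative.

```python
from typing import Dict, List, Optional, Sequence


def trim_unresponsive(hops: Sequence[Optional[str]]) -> List[Optional[str]]:
    """Drop trailing silent hops so the last responsive hop ends the path."""
    trimmed = list(hops)
    while trimmed and trimmed[-1] is None:
        trimmed.pop()
    return trimmed


def common_suffix_len(a: Sequence[Optional[str]], b: Sequence[Optional[str]]) -> int:
    """Number of identical hops shared at the end of both paths."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x is None or x != y:
            break
        n += 1
    return n


def nearest_anchors(
    target_path: Sequence[Optional[str]],
    anchor_paths: Dict[str, Sequence[Optional[str]]],
    n: int = 3,
) -> List[str]:
    """Return the N anchor IPs whose paths share the longest tail with the target."""
    target = trim_unresponsive(target_path)
    ranked = sorted(
        anchor_paths,
        key=lambda ip: common_suffix_len(target, trim_unresponsive(anchor_paths[ip])),
        reverse=True,
    )
    return ranked[:n]
```

Anchors that leave the network through the same final routers as the target are treated as topologically closest; ties keep the order in which the anchors were supplied.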
Using the centroid method, the location of the target IP address is expressed as:
(1)
where (latp, longp) denotes the coordinates of the node to be located and (lati, longi) denotes the coordinates of node i.
Using the nearest-neighbor rule, the location of the target IP address is expressed as:
(2)
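The two formulas are not reproduced in this text, so the sketch below shows only one plausible reading: the centroid as the unweighted arithmetic mean of the N selected anchors' coordinates, and the nearest-neighbor rule as adopting the coordinates of the single most similar anchor. Both forms, and the function names, are assumptions. Averaging raw latitude and longitude is reasonable only for anchors that lie close together and do not straddle the 180° meridian.

```python
from typing import List, Sequence, Tuple

LatLon = Tuple[float, float]


def centroid(coords: Sequence[LatLon]) -> LatLon:
    """(lat_p, long_p) as the mean of the anchors' (lat_i, long_i)."""
    if not coords:
        raise ValueError("at least one anchor coordinate is required")
    n = len(coords)
    lat_p = sum(lat for lat, _ in coords) / n
    long_p = sum(lon for _, lon in coords) / n
    return lat_p, long_p


def nearest_neighbor(scored_anchors: List[Tuple[float, LatLon]]) -> LatLon:
    """(lat_p, long_p) copied from the anchor with the highest similarity score."""
    if not scored_anchors:
        raise ValueError("at least one scored anchor is required")
    _, best_coords = max(scored_anchors, key=lambda item: item[0])
    return best_coords
```

The centroid smooths out errors in individual anchor locations, while the nearest-neighbor rule is the natural fallback when one anchor is clearly closer in the topology than the others.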
Finally, Traceroute measurements are taken for the IP addresses. The second-to-last hop on the paths to the 3 IP addresses to be inferred is 61.132.186.166 in every case, which matches the topology paths of 5 anchor node IP addresses: 220.180.112.229, 220.180.112.232, 220.180.112.248, 220.180.112.249, and 220.180.112.149.
The data above show that, of all the websites in the class C segment where the IPs under test are located, 31 belong to government and enterprise units and 3 are of other types, so government and enterprise websites account for 91%. Because most websites in the segment belong to government and enterprise units, it can be inferred that the IPs of this segment are mainly used by such units. The topology path-approximation principle further confirms that the 3 IP addresses are used by government and enterprise units, and their real geographic location is, with high probability, a data center in Bozhou City.
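The 91% figure is 31 out of 34 websites. A short sketch of the share calculation follows; the category labels and function name are illustrative, and only the counts 31 and 3 come from the text.

```python
from typing import Dict


def usage_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Fraction of websites in the segment that fall into each category."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no websites counted")
    return {category: n / total for category, n in counts.items()}


shares = usage_shares({"government_enterprise": 31, "other": 3})
dominant = max(shares, key=shares.get)
print(dominant, f"{shares[dominant]:.0%}")  # government_enterprise 91%
```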
The embodiments of the present invention have been described in detail with reference to the drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the spirit of the present invention.

Claims (3)

1. A target-driven method for inferring the geographic location of an IP address, characterized by comprising the following steps:
Step 1, anchor node information collection: probe for active IP addresses in the network segment where the target IP is located; if an active IP exists, collect its fingerprint features and judge whether it is an anchor node;
Step 2, anchor node calibration: judge whether each candidate anchor node is a valid anchor node, and its corresponding server node type, according to whether the anchor node has an explicit geographic location;
Step 3, geographic location inference: measure the similarity between the path of the IP to be located and the paths of the anchor nodes, wherein the centroid method or substitution of the location of an adjacent anchor node address may be adopted for different types of target nodes;
In step 2, whether a node is a relay node is judged from its position in the network: an IP in the path library that is not located at the last hop is regarded as a relay node; otherwise it is an end node;
wherein the probability that device Ei belongs to type ci is:
(2)
assuming that the open services or ports of each network entity are independent, equation (2) can be expressed according to Bayesian theory as:
$P(M \mid c) = P(\{m_1, m_2, \ldots, m_n\} \mid c) = P(m_1 \mid c)\,P(m_2 \mid c) \cdots P(m_n \mid c)$ (3)
the valid reference nodes are clustered according to the network path similarity of the anchor nodes; if the geographic distance of cluster Ci is larger than the radius of a city, the cluster is judged to be a cloud node; otherwise it is an independent host;
$\mathrm{distance}(C_i) = \max\{\mathrm{distance}(\mathrm{loc}(IP_m), \mathrm{loc}(IP_n))\}$ (4)
where distance(Ai, Aj) is the distance between anchor nodes Ai and Aj, distance(Ci) is the distance of cluster Ci, and IPm, IPn ∈ Ci.
2. The target-driven method for inferring the geographic location of an IP address according to claim 1, wherein in step 1, the well-known ports (Well Known Ports) and the registered ports (Registered Ports) are probed to find the active IP addresses in the class C segment where the IP address is located, and location-related network services are then probed; if there is no valid anchor node in the class C segment where the target IP is located, anchor nodes are searched for in the two adjacent segments C-1 and C+1 until enough active anchor nodes are found,
wherein the open services found by probing a device are expressed as:
$P_i = \{P_1, P_2, P_3, P_4, \ldots, P_m\}$.
3. The target-driven method for inferring the geographic location of an IP address according to claim 1, wherein in step 3, IP addresses are divided into those that respond to measurement and those that do not, according to how the target IP address responds to network measurement; an IP address that does not respond may be replaced by the IP address of the last hop on its measurement path; for a node that responds, its fingerprint information is probed and the nature of the target node is judged by the method of step 2; if the target node is a cloud computing node, the geographic location of the cloud service provider is obtained by calling the Baidu interface; if it is an independent host, its location is judged by network topology proximity: the network path to the target node is first measured with Traceroute, the N IP addresses nearest to the node are then selected by path matching, and the geographic location of the target IP is determined by the centroid method or the nearest-neighbor method;
using the centroid method, the location of the target IP address is expressed as:
(1)
using the nearest-neighbor rule, the location of the target IP address is expressed as:
(2)
where (latp, longp) denotes the coordinates of the node to be located and (lati, longi) denotes the coordinates of node i.
CN202110964934.7A 2021-08-23 2021-08-23 A target-driven method for inferring IP address geolocation Active CN113824810B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110964934.7A CN113824810B (en) 2021-08-23 2021-08-23 A target-driven method for inferring IP address geolocation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110964934.7A CN113824810B (en) 2021-08-23 2021-08-23 A target-driven method for inferring IP address geolocation

Publications (2)

Publication Number Publication Date
CN113824810A CN113824810A (en) 2021-12-21
CN113824810B true CN113824810B (en) 2025-06-24

Family

ID=78913386

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110964934.7A Active CN113824810B (en) 2021-08-23 2021-08-23 A target-driven method for inferring IP address geolocation

Country Status (1)

Country Link
CN (1) CN113824810B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114785719B (en) * 2022-04-13 2024-05-10 北京亚鸿世纪科技发展有限公司 IP region attribution method for forming region fingerprint through ping command
CN116668524B (en) * 2023-06-08 2025-08-22 深圳永安在线科技有限公司 A method and system for identifying and locating home bandwidth IP
CN116866310A (en) * 2023-07-13 2023-10-10 东南大学 An IP location inference method and system based on massive landmarks

Citations (2)

Publication number Priority date Publication date Assignee Title
CN107920115A (en) * 2017-11-17 2018-04-17 南京莱克贝尔信息技术有限公司 A kind of City-level IP localization methods based on time delay and geographical consistency constraint
CN110012128A (en) * 2019-04-12 2019-07-12 中原工学院 Network entity landmark screening method based on routing hop count

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JP4952583B2 (en) * 2005-10-25 2012-06-13 日本電気株式会社 Hierarchical mobility management system, access router, anchor node, mobile communication system, and route setting method
CN110300368B (en) * 2019-05-24 2021-01-01 中国人民解放军63880部队 IP geographical positioning system overall processing method

Also Published As

Publication number Publication date
CN113824810A (en) 2021-12-21

Similar Documents

Publication Publication Date Title
CN113824810B (en) A target-driven method for inferring IP address geolocation
CN102246463B (en) Geolocation mapping of network devices
Katz-Bassett et al. Towards IP geolocation using delay and topology measurements
Scheitle et al. HLOC: Hints-based geolocation leveraging multiple measurement frameworks
US9052378B2 (en) Estimation of position using WLAN access point radio propagation characteristics in a WLAN positioning system
Liu et al. Mining checkins from location-sharing services for client-independent ip geolocation
CN110474843B (en) IP Location Method Based on Routing Hop Count
Zhao et al. IP Geolocation based on identification routers and local delay distribution similarity
Laitinen et al. Access point significance measures in WLAN-based location
Dan et al. IP geolocation using traceroute location propagation and IP range location interpolation
Zu et al. IP-geolocater: a more reliable IP geolocation algorithm based on router error training
CN111711707B (en) IP address positioning method based on neighbor relation
Ding et al. A Street‐Level IP Geolocation Method Based on Delay‐Distance Correlation and Multilayered Common Routers
US11792110B2 (en) Geolocation system and method
Hillmann et al. On the path to high precise ip geolocation: A self-optimizing model
Xiang et al. No-jump-into-latency in china's internet! toward last-mile hop count based ip geo-localization
Gueye et al. Leveraging buffering delay estimation for geolocation of Internet hosts
CA2628121A1 (en) Methods and systems for wireless network survey, location and management
Hillmann et al. Dragoon: advanced modelling of IP geolocation by use of latency measurements
Jain et al. Internet distance prediction using node-pair geography
Xu et al. Netvigator: Scalable network proximity estimation
CN110300368B (en) IP geographical positioning system overall processing method
Wang et al. Target driven ip geolocation algorithm
Hillmann et al. Modelling of IP Geolocation by use of Latency Measurements
CN104079681A (en) Alias analysis method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant