CN116521803B

CN116521803B - Track similarity query method and device based on data federation

Info

Publication number: CN116521803B
Application number: CN202310222854.3A
Authority: CN
Inventors: 彭智勇; 吴晨; 王胜
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2023-03-09
Filing date: 2023-03-09
Publication date: 2025-08-15
Anticipated expiration: 2043-03-09
Also published as: CN116521803A

Abstract

The invention discloses a track similarity query method and device based on a data federation, and relates to the field of track data mining and information retrieval, wherein the query method comprises the steps of numbering mobile equipment in the data federation; based on the federal index and the number, searching the preset number of mobile devices most similar to the track to be queried by using a dynamic branch reduction algorithm. The method and the system construct the federal index by utilizing the space-time characteristics of the track, filter out the mobile equipment which does not meet the query condition by pruning the upper boundary of the similarity, reduce the communication expense, prune the track data to be queried based on the space-time characteristics of the track data, reduce the expense of local calculation, and terminate the query flow in advance by the central server according to the dynamic pruning condition, thereby further improving the query efficiency.

Description

Track similarity query method and device based on data federation

Technical Field

The invention relates to the field of track data mining and information retrieval, in particular to a track similarity query method and device based on data federation.

Background

Track similarity query is an important means for space-time track data analysis, and track data with the most similarity with a target track is quickly searched through predefined track similarity. The similarity query has important significance for subsequent track data analysis, including track clustering, track recommendation and the like. In addition, the track similarity query is also widely applied to the fields of traffic logistics (path planning), sociology and the like. The track similarity query has important application value.

Data federation is a well-established method of data integration. In the unified data model, data owners within the federation hold a portion of data that interact with a central server and provide query services to the outside. In federal query, a central server in the federal receives a query request from a user and forwards the query request to a related data owner, the data owner processes the query request and returns a query result to the central server, and the central server collates the query result and returns a final result to the user. In the process, the user does not directly interact with the data owners in the federation, so that the data privacy of the data owners is protected to a certain extent.

The traditional data federation is mainly based on enterprise-level application, is mainly applied to server-side programs, and has a large data scale. With the development of mobile technology, the computing power and capacity of terminal devices such as mobile phones are gradually improved, the managed data volume is also gradually increased, and on the other hand, the application of positioning technology on mobile phones also enables daily track data of people to be collected and managed. The track data are analyzed, and particularly the track similarity query can be applied to the fields of infectious disease contact tracing, traffic path planning and the like, so that the social treatment level is improved. However, daily trajectory data often involves personal privacy, and it is difficult for researchers to directly acquire and analyze the data, which is also a difficulty in analysis of movement trajectory data.

Disclosure of Invention

Aiming at the defects existing in the prior art, the invention aims to provide a track similarity query method and device based on data federation, mobile equipment which does not meet query conditions is filtered through similarity upper bound pruning, communication expenditure is reduced, meanwhile, track data to be queried is pruned based on space-time characteristics of the track data, the cost of local calculation is reduced, a central server side can terminate a query flow in advance according to dynamic pruning conditions, and query efficiency is further improved.

In order to achieve the above purpose, the invention adopts the following technical scheme:

numbering mobile devices in the data federation;

Constructing a federal index by a central server and based on a spatial grid;

Based on the federal index and the number, searching a preset number of mobile devices most similar to the track to be queried by using a dynamic branch reduction algorithm.

Based on the technical scheme, the mobile equipment in the data federation is numbered, and the specific steps comprise:

The mobile device sends registration information to the central server, and the central server numbers all the mobile devices in sequence according to the sequence of the received information.

On the basis of the technical scheme, the federal index is constructed by the central server based on the space grid, and the specific steps comprise:

The central server sends the space grid parameters to all the mobile devices;

After receiving the grid parameters, the mobile equipment anonymizes the space positions in the track data by using the space grids, and sends the anonymized track information to the central server;

The central server receives anonymized trajectory information, builds a time index over each pane of the spatial grid to record the dwell information of each device within that pane.

On the basis of the technical scheme, the space grid parameters comprise the space range of the grid and the side length of the grid inner square.

Based on the technical scheme, the method for searching the preset number of mobile devices most similar to the track to be queried by using a dynamic branch reduction algorithm based on the federal index and the number comprises the following specific steps:

the central server initializes a priority queue of a similarity upper bound, the size of which is the total number of mobile devices in the data federation, so as to record the similarity upper bound of the data track and the track to be queried in each mobile device and arrange the data tracks from large to small;

The central server calculates the upper limit of the similarity of the track data of the mobile equipment by using the federal index, and adds all the upper limit values of the similarity into the priority queue;

removing the equipment with the upper limit value of similarity of 0 in the priority queue, and locally calculating the similarity value between the rest mobile equipment and the track to be queried by the following calculation modes:

Wherein Sim is similarity, T ₁ is the track to be queried, T ₂ is the local track of the mobile device, pi _i is the ith dwell point in track T ₁, pi _j is the jth dwell point in track T ₂, τ is the time spent at the dwell point, ρ is the point of interest to which the dwell point corresponds, pi _i. τ is the dwell time of the ith dwell point in track T ₁, pi _j. τ is the dwell time of the jth dwell point in track T ₂, pi _i.τ∩π_j. τ is the common dwell time on the same point of interest;

The central server receives the similarity value sent by the mobile terminal, updates the similarity upper bound value in the priority list to be the similarity value of the corresponding equipment, and according to the received similarity value:

If the received similarity value is greater than the maximum similarity upper limit value, ending the query and returning a preset number of most similar mobile devices;

If the received similarity value is not greater than the similarity upper bound, judging whether the similarity values of all the mobile devices are received, if yes, ending the query and returning to the preset number of most similar devices, if not, returning to the execution center server to receive the similarity values sent by the mobile terminal, and updating the similarity upper bound in the priority list to be the similarity value of the corresponding device.

On the basis of the technical scheme, the central server calculates the similarity upper bound of the track data of the mobile equipment by using the federal index, and adds all similarity upper bound values into the priority queue, and the specific steps comprise:

the central server maintains a similarity upper limit value list;

traversing all stay points in the track to be queried, calculating the square of the stay point based on each stay point, and finding a corresponding interval tree in the federal index;

Finding all leaf nodes with intersections according to the interval information of the stay points, and updating a similarity upper limit value on corresponding equipment based on each leaf node, wherein the similarity upper limit value is calculated in the following way:

UB_i＝UB_i+|π.τ∩N_leaf.τ|

Wherein UB _i is the upper limit of the similarity of the ith mobile device, pi is the dwell point, τ is the dwell time, pi.τ is the dwell time of the dwell point, N _leaf is the leaf node, N _leaf.τ is the dwell time of the leaf node;

after the traversal is completed, all the similarity upper bound values in the similarity upper bound value list are added to the priority list.

On the basis of the above technical solution, after removing the device with the upper limit value of similarity of 0 in the priority queue, the method further includes:

According to the track data characteristics of the mobile equipment, irrelevant parts in query data are removed, so that efficiency of the mobile equipment in calculating the similarity is improved.

On the basis of the technical scheme, after the central server transmits the space grid parameters to all the mobile devices, the method further comprises the following steps:

the central server initializes a root node of the federation index, the root node stores m x n pointers, each pointer points to an interval tree, and the calculation method of m and n is as follows:

Wherein, R is the spatial range of the grid, the spatial range R is represented by longitude and latitude, lat _min is the minimum of latitude, lat _max is the maximum of latitude, r.lat _max is the maximum of latitude of the grid space, r.lat _min is the minimum of latitude of the grid space, lon _min is the minimum of longitude, lon _max is the maximum of longitude, r.lon _max is the maximum of longitude of the grid space, r.lon _min is the minimum of longitude of the grid space, and δ _d is the side length of one square.

On the basis of the technical scheme, the mobile equipment receives grid parameters and anonymizes the space positions in the track data by utilizing grids, and the method specifically comprises the following steps:

The mobile equipment receives the grid parameters and determines grid division;

Traversing all stay points in the local track data and converting the stay points into corresponding footprint information in the square, namely The conversion mode is as follows:

where pi is the dwell point, τ is the dwell time, ρ is the point of interest of dwell point pi, R is the space grid range, δ _d is the side length of the grid in the space grid, lon is the longitude, ρ.lon is the longitude of the point of interest ρ, lat is the latitude, ρ.lat is the latitude of the point of interest ρ, x is the offset of the point of interest on the longitude, y is the offset of the point of interest on the latitude, t is the space grid number, c _t is the grid number, v is the total number of mobile devices in the data federation.

The invention also provides a track similarity query device based on the data federation, which comprises the following steps:

The execution module is used for numbering the mobile equipment in the data federation and constructing a federation index based on the space grid;

and the query module is used for querying the preset number of mobile devices most similar to the track to be queried by using a dynamic branch reduction algorithm based on the federal index constructed by the execution module.

Compared with the prior art, the method has the advantages that the federal index is built by utilizing the space-time characteristics of the track, the mobile equipment which does not meet the query conditions is filtered through pruning of the upper boundary of the similarity, communication expenditure is reduced, meanwhile, the track data to be queried is pruned based on the space-time characteristics of the track data, the expenditure of local calculation is reduced, and the central server side can terminate the query flow in advance according to the dynamic pruning conditions, so that the query efficiency is further improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a trace similarity query method based on data federation in an embodiment of the present invention;

FIG. 2 is a schematic diagram of a specific flow of Federal index creation in an embodiment of the invention;

FIG. 3 is a schematic diagram of an index structure according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a root node of a federal index in accordance with an embodiment of the present invention;

FIG. 5 is a sample diagram of anonymization of trace data locations in an embodiment of the present invention;

fig. 6 is a schematic flow chart of a dynamic pruning algorithm in an embodiment of the present invention.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application.

Referring to fig. 1, an embodiment of the present invention provides a track similarity query method based on data federation, including the following steps:

S1, numbering mobile equipment in a data federation;

s2, constructing a federal index through a central server and based on a space grid;

and S3, searching the preset number of mobile devices most similar to the track to be queried by using a dynamic branch reduction algorithm based on the federal index and the number.

The track similarity query method based on the data federation comprises the steps of firstly recording and numbering mobile devices in the data federation, sending registration information to a central server by the mobile devices, numbering all the mobile devices according to the sequence of the received information, then sending space grid parameters to all the mobile devices by the central server, initializing root nodes N _root of federation indexes, traversing all stay points in local track data after the mobile devices receive the space grid parameters, converting the stay points into footprint information corresponding to space grids, uploading all the footprint information to the central server by the mobile devices after conversion is completed, receiving the footprint information sent by the mobile devices by the central server, finding a corresponding interval tree in a root node of the federation index according to the space grid information, inserting the interval tree into leaf nodes of the interval tree according to a time interval in the footprint information, recording corresponding device numbers in the leaf nodes, completing construction of the federation index, and then using a dynamic pruning algorithm to find k mobile devices which are most similar to a track Q to be queried, wherein k is preset integer number.

In the invention, the mobile equipment in the data federation is numbered, and the specific steps include:

That is, the mobile device transmits registration information to the central server, and the central server sequentially numbers all the mobile devices according to the sequence of receiving the information, for example, the mobile devices are numbered using S ₁-S_v.

In the invention, the method for constructing the federal index based on the space grid through the central server comprises the following specific steps:

The central server sends the space grid parameters to all the mobile devices;

That is, the central server transmits space grid parameters to all mobile devices in the data federation, the space grid parameters comprise a space range R of the space grid and a grid side length delta _d in the space grid, the space range R= < lat _min,lat_max,lon_min,lon_max >, the mobile device receives the grid parameters and anonymizes the space position in track data of the mobile device by utilizing grids, wherein the track data T consists of a plurality of stay points pi, namely T= < pi ₁,π₂,...,π_x >, each stay point pi= < ρ, tau > records stay information of a target at an interest point, the stay information comprises the interest point rho and stay time tau, the space-time grid comprises a plurality of grids with the same size, namely G= < c ₁,c₂,...,c_m*n >, and anonymization is to convert the stay points pi= < ρ, tau > into footprint information sigma= < c _i, tau > on the corresponding grids, wherein c _i is the square grid corresponding to the interest point in the stay points, and transmits the square grid to the server, and the central server receives anonymized track information, and constructs each square grid above the space to record stay information in the device.

In the invention, the space grid parameters comprise the space range of the grid and the side length of the grid inner square.

I.e. the spatial grid parameters comprise the spatial extent r= < lat _min,lat_max,lon_min,lon_max > of the grid and the side length delta _d of the grid inner grid.

In the invention, based on federal index and number, a dynamic branch reduction algorithm is used for searching the mobile equipment with the number most similar to the track to be inquired, and the specific steps include:

Wherein Sim is similarity, T ₁ is a track to be queried, T ₂ is a local track of the mobile device, pi _i is an ith dwell point in a T ₁ track, pi _j is a jth dwell point in a track T ₂, τ is a dwell time at the dwell point, ρ is a point of interest to which the dwell point corresponds more, pi _i.τ∩π_j, τ is a common dwell time at the same point of interest;

The method comprises the steps that a central server initializes a priority queue L with an upper similarity bound, the size of the priority queue L is the total number v of mobile devices in a data federation, the priority queue L records the upper similarity bound of track data in each mobile device and tracks to be queried, the track data and the upper similarity bound of tracks to be queried are sorted according to the numerical value from large to small, the maximum upper similarity bound in the mobile devices is reserved by the head L.top of the queue, the central server calculates the upper similarity bound of the track data of the mobile devices by using the federal index, all the upper similarity bound values are added into the priority queue, for mobile devices with the upper similarity bound of 0, the mobile devices can be judged to be out of the priority queue, the mobile devices with the upper similarity bound of greater than 0 are removed, for the mobile devices with the upper similarity bound of greater than 0, the mobile devices to be queried, the mobile devices locally calculate the similarity values and send the similarity values to the central server, the central server receives the similarity values Sim of the similarity values sent by the mobile devices, the upper similarity values in the priority list are updated, the upper limit values in the priority list are the corresponding similarity values, when the upper similarity values received by the mobile devices are greater than the maximum similarity value Sim, and the maximum similarity value in the priority list is greater than the maximum similarity value is received, and the maximum similarity is equal to the maximum query result is not in the priority list, and the query result is directly is aborted, and query is sent by the mobile device.

The method and the system construct the federal index by utilizing the space-time characteristics of the track, filter out the mobile equipment which does not meet the query condition by pruning the upper bound of the similarity, reduce the communication expense, prune the track data to be queried based on the space-time characteristics of the track data, reduce the expense of local calculation, and terminate the query flow in advance by the central server according to the dynamic pruning condition, thereby further improving the query efficiency.

In the invention, the central server calculates the upper limit of the similarity of the track data of the mobile equipment by using the federal index, and adds all the upper limit values of the similarity into the priority queue, and the specific steps comprise:

the central server maintains a similarity upper limit value list;

UB_i＝UB_i+|π.τ∩N_leaf.τ|

The central server calculates the upper limit of the similarity of the track data of the mobile device by using the federal index, and adds all the upper limit values of the similarity to the priority queue, wherein a similarity upper limit value list UB= { UB ₁,UB₂,...,UB_v } and a query data list QL= { Q ₁,Q₂,...,Q_r } are maintained for the central server, all the stay points in the track Q to be queried are traversed, the square of the stay points pi= < ρ, τ > are calculated, corresponding interval trees in the federal index are found, all leaf nodes intersected with the stay points are found according to the interval information τ of the stay points, and the upper limit UB _i＝UB_i+|π.τ∩N_leaf. τ| of the similarity of the corresponding device is updated for each leaf node N _leaf = < i, τ >, and meanwhile the query data Q _i＝Q_i { pi } of the corresponding mobile device is updated. After the traversal is completed, all upper bound values in list UB are added to priority queue L.

In the invention, after removing the device with the upper limit value of similarity of 0 in the priority queue, the method further comprises the following steps:

That is, after removing the device with the upper limit value of similarity in the priority queue being 0, the central server will also remove the irrelevant part in the query data according to the track data feature of the mobile device, so as to improve the efficiency of the mobile device in calculating the similarity locally.

In the present invention, after the central server transmits the spatial grid parameters to all the mobile devices, the method further comprises:

the central server initializes a root node of the federation index, the root node stores m x n pointers, each pointer points to an interval tree, wherein the calculation method of m and n is as follows:

That is, after the central server sends the spatial grid parameters to all the mobile devices, the central server initializes a root node N _root of the federal index, where the root node stores m×n pointers, each pointer pointing to an interval tree, where the calculation method of m and N is as follows:

In the invention, the mobile equipment receives grid parameters and anonymizes the space position in the track data by utilizing grids, and the method specifically comprises the following steps:

The mobile equipment receives the grid parameters and determines grid division;

Namely, the mobile device receives the grid parameters and uses the specific steps of grid anonymization to the spatial position in the track data, namely, the mobile device receives the spatial grid parameters R and delta _d, traverses all stay points in the local track data, converts the stay points into footprint information corresponding to the square, and converts pi= < ρ, tau > into sigma= < c _i, tau > through the grid parameters R and delta _d, namely The specific conversion mode is as follows:

The embodiment of the invention also provides a track similarity query device based on the data federation, which comprises the following steps:

The embodiment of the invention also provides a track similarity query device based on the data federation, which comprises an execution module and a query module, wherein the execution module is used for numbering mobile equipment in the data federation and constructing a federation index based on a space grid. The query module is used for querying a preset number of mobile devices most similar to the track to be queried by using a dynamic branch reduction algorithm based on the federal index constructed by the execution module.

The track similarity query method for the mobile device data federation provided by the embodiment of the invention is characterized in that each mobile device holds a local track data, a track Q to be queried and an integer k are given, in order to improve query efficiency, the invention designs a space-time federal index for efficiently pruning data to be queried. The flow of the query method of the invention is shown in figure 1 and is mainly divided into three steps:

Initializing federation, wherein all mobile devices send registration information to a central server, and the central server records and numbers the registration information;

constructing a federation index, wherein a central server sends space grid parameters to all mobile devices, the mobile devices carry out position anonymization on local track data according to the grid parameters, the interest points in the stay points are converted into corresponding space grids, the corresponding space grids are uploaded to the central server, the federation index is built, and a specific flow is described in the first embodiment;

And (3) performing query execution, namely performing similarity upper bound pruning by using the federal index by using the central server, filtering out equipment which does not meet query conditions, and then dynamically calculating the similarity upper bound by using a dynamic pruning algorithm and returning a final query result, wherein the detailed flow of the dynamic pruning algorithm is described in the second embodiment.

The invention will be described in more detail below by means of a number of examples.

In the first embodiment, the federation index is created, the specific flowchart is shown in fig. 2, the index structure is shown in fig. 3, and the detailed steps are as follows:

And S201, the central server sends grid parameters to all mobile devices and initializes the root node.

The federal index consists of a root node and a plurality of interval tree indexes recorded by the root node. Fig. 4 is a root node structure diagram of a federal index, which is composed of index header information and child node lists, wherein the index header information includes grid parameter information, namely a grid range R, a grid side length delta _d in the grid and a grid total number m×n in the grid, and the child node list includes a plurality of list items, wherein each list item stores a corresponding grid id and a pointer pointing to a corresponding interval tree root node. The internal nodes of the interval tree store interval information, and the leaf nodes store id numbers of corresponding mobile devices in addition to the interval information.

And S202, the mobile equipment receives the grid parameters, performs anonymization processing and uploads the parameters to the central server.

The mobile equipment traverses the local track data, and uploads the corresponding footprint information to the central server after the position anonymization of the stay points in the local track data.

And S203, the central server receives the converted track data and updates the federation index.

And if the node information of the corresponding interval tree does not exist in the root node, a new interval tree is established and the root node is recorded in a root node pointer of the federation index. And finding the inner node of the lowest layer containing the footprint, adding a leaf node into the inner node, wherein the leaf node records the equipment id and the time interval corresponding to the footprint. When the track data processing of all the mobile devices is completed, the federation index is constructed.

In the second embodiment, the anonymization of the track data position is shown in fig. 5, and the specific steps are as follows:

Step one, the mobile equipment receives grid parameters and determines grid division.

After the mobile device receives the grid parameters, grid division in the longitude and latitude directions is calculated respectively. The spatial range in fig. 5 is divided into 4*4 squares, each of the squares is coded according to a scan line type, and the squares corresponding to each interest point in the trace stop points are calculated, and trace data in fig. 5 contains 5 stop points, and total 5 interest points ρ ₁-ρ₅ correspond to 5 squares c ₂、c₆、c₇、c₁₂、c₁₆ respectively.

Traversing the track data, and realizing position anonymization by using the space grid.

The basic idea of location anonymization is to replace the exact location information with a spatial extent to hide the true spatial location. In fig. 5, the positions of the interest points in the track stay points are replaced by the square obtained in the step one, the time interval is kept unchanged, the converted track is obtained, and the mobile device sends the converted track to the central server side to construct the federation index.

The third embodiment of the dynamic pruning algorithm comprises the following specific steps:

step one, initializing a similarity priority queue.

For the data federation with the number v of the mobile devices, a central server initializes a priority queue L with the same length, and sequentially puts the track similarity upper bound value of each mobile device into the L according to the size of the priority queue L, and the head of the queue L.top stores the maximum similarity upper bound.

And step two, receiving the similarity value sent by the mobile terminal.

After the mobile device calculates the similarity value locally, the computing result Sim _i is sent to the central server, and after the mobile device receives the similarity value, the central server updates the corresponding similarity value in the L by using the similarity value, and dynamically adjusts the maximum similarity upper bound of the first L.top of the team so as to facilitate the subsequent judgment of whether the query termination condition is met.

And step three, judging whether the dynamic pruning condition is met.

When the received similarity value is greater than the maximum similarity upper bound, namely Sim _i > L.top, the method indicates that the subsequently received similarity value cannot be higher than the similarity value of the currently received equipment, and step four is entered, otherwise, whether all the similarity values of all the mobile equipment are received is judged, if yes, the query is completed, step four is entered, and otherwise, step two is entered to continue the query operation.

And step four, returning a final query result.

After the inquiry is completed, the central server side sends inquiry end information to all the mobile devices, and returns the last k most similar devices to the user.

The specific flow of the dynamic pruning algorithm in the third embodiment is shown in fig. 6.

The foregoing is only a specific embodiment of the application to enable those skilled in the art to understand or practice the application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims

1. A trajectory similarity query method based on data federation, characterized by comprising the following steps:

Numbering mobile devices within the data federation;

Build a federated index based on a spatial grid through a central server;

Based on the federated index and number, a dynamic branch reduction algorithm is used to find a preset number of mobile devices that are most similar to the query trajectory.

The method uses a dynamic branch reduction algorithm based on the federated index and number to find a preset number of mobile devices that are most similar to the trajectory to be queried. The specific steps include:

The central server initializes a priority queue with a similarity upper bound, whose size is the total number of mobile devices in the data federation. It is used to record the upper bound of the similarity between the data trajectory of each mobile device and the trajectory to be queried and sort them from large to small.

The central server uses the federated index to calculate the upper bound of similarity of the mobile device trajectory data and adds all the upper bound values of similarity to the priority queue;

Remove the devices with a similarity upper limit of 0 in the priority queue, and the remaining mobile devices calculate the similarity value with the trajectory to be queried locally. The calculation method is:

in, is the similarity, is the trajectory to be queried, is the local trajectory of the mobile device, For the trajectory The A stopover point, For the trajectory Middle A stopover point, is the time spent at the stop point, There are many points of interest corresponding to the stop points. For the trajectory The The duration of stay at a stop point, For the trajectory Middle The duration of stay at a stop point, The time spent together at the same point of interest;

The central server receives the similarity value sent by the mobile terminal and updates the upper similarity limit value in the priority list to the similarity value of the corresponding device. Based on the received similarity value:

If the received similarity value is greater than the maximum similarity upper limit, the query ends and the preset number of most similar mobile devices are returned;

If the received similarity value is not greater than the similarity upper bound, it is determined whether the similarity values of all mobile devices have been received. If so, the query is terminated and a preset number of most similar devices are returned. If not, the execution returns to the central server to receive the similarity value sent by the mobile terminal, and the similarity upper bound value in the priority list is updated to the similarity value of the corresponding device.

2. The trajectory similarity query method based on data federation according to claim 1, wherein the numbering of mobile devices in the data federation comprises the following steps:

The mobile device sends registration information to the central server, and the central server numbers all mobile devices in the order in which they are received.

3. The trajectory similarity query method based on data federation according to claim 1, wherein the step of constructing a federated index based on a spatial grid through a central server comprises:

The central server sends the spatial grid parameters to all mobile devices;

After receiving the grid parameters, the mobile device anonymizes the spatial position in the trajectory data using the spatial grid and sends the anonymized trajectory information to the central server;

The central server receives anonymized trajectory information and builds a time index on each square of the spatial grid to record the stay information of each device in that square.

4. The trajectory similarity query method based on data federation according to claim 3, characterized in that:

The spatial grid parameters include the spatial extent of the grid and the side lengths of the cells within the grid.

5. The method for trajectory similarity query based on data federation according to claim 1, wherein the central server calculates the upper similarity bound of mobile device trajectory data using the federated index and adds all similarity upper bound values to a priority queue, specifically comprising the following steps:

The central server maintains a list of similarity upper bound values;

Traverse all the stop points in the query trajectory, calculate the grid where each stop point is located, and find the corresponding interval tree in the federated index;

According to the interval information of the stay point, find all the leaf nodes that intersect with it, and update the upper limit of similarity on the corresponding device based on each leaf node. The upper limit of similarity is calculated as follows:

in, For the The upper bound of similarity of mobile devices, For the stopover point, is the residence time, is the dwell time at the stop point, is a leaf node, is the residence time of the leaf node;

After the traversal is completed, all similarity upper bound values in the similarity upper bound value list are added to the priority list.

6. The method for trajectory similarity query based on data federation according to claim 1, characterized in that after removing the devices with a similarity upper limit of 0 in the priority queue, the method further comprises:

According to the trajectory data characteristics of the mobile device, irrelevant parts of the query data are removed to improve the efficiency of similarity calculation of the mobile device.

7. The trajectory similarity query method based on data federation according to claim 4, characterized in that after the central server sends the spatial grid parameters to all mobile devices, it further comprises:

The central server initializes the root node of the federated index, which stores pointers, each pointer points to an interval tree, and The calculation method is:

in, is the spatial extent of the grid, spatial extent Expressed by longitude and latitude, is the minimum value of latitude, is the maximum value of latitude, is the maximum latitude of the grid space, is the minimum latitude of the grid space, is the minimum value of longitude, is the maximum value of longitude, is the maximum longitude of the grid space, is the minimum longitude of the grid space, is the side length of a square.

8. The method for trajectory similarity query based on data federation according to claim 4, wherein the mobile device receives grid parameters and anonymizes the spatial positions in the trajectory data using the grid, and the specific steps include:

The mobile device receives the grid parameters and determines the grid division;

Traverse all the stop points in the local trajectory data and convert them into corresponding footprint information in the grid, that is, , the conversion is as follows:

in, For the stopover point, is the residence time, For stopping point Points of interest, is the spatial grid range, is the side length of a cell in the spatial grid, is the longitude, Points of interest Longitude, is the latitude, Points of interest Latitude, is the offset of the point of interest in longitude, is the latitude offset of the point of interest, is the spatial grid number, For the number The square, is the total number of mobile devices in the data federation.

9. A trajectory similarity query device based on data federation using the trajectory similarity query method based on data federation according to any one of claims 1 to 8, characterized by comprising:

An execution module, configured to number mobile devices within the data federation and construct a federation index based on a spatial grid;

The query module is configured to query a preset number of mobile devices that are most similar to the trajectory to be queried using a dynamic branch pruning algorithm based on the federated index constructed by the execution module.