CN112307025B - Distributed index construction method and device - Google Patents
Distributed index construction method and device Download PDFInfo
- Publication number
- CN112307025B CN112307025B CN202011183760.2A CN202011183760A CN112307025B CN 112307025 B CN112307025 B CN 112307025B CN 202011183760 A CN202011183760 A CN 202011183760A CN 112307025 B CN112307025 B CN 112307025B
- Authority
- CN
- China
- Prior art keywords
- division precision
- map range
- combination
- division
- stored
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Remote Sensing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the invention provides a method for constructing a distributed index, and relates to the technical field of computers. The method comprises the following steps: determining a number of respective alternative levels for the map range when storing the spatiotemporal data; constructing at least one division precision combination corresponding to each alternative level number by utilizing each division precision related to the map range according to each alternative level number; selecting a target division precision combination meeting preset conditions from the constructed multiple division precision combinations; wherein, the predetermined condition is: after each sample spatiotemporal data is stored in the map range according to the division precision combination, the number of the sample spatiotemporal data stored in the designated range of the map range is minimum; and constructing a distributed index for storing the space-time data for the map range according to the target division precision combination and the corresponding target alternative level number. Compared with the prior art, the scheme provided by the embodiment of the invention can improve the query performance of the established distributed index.
Description
Technical Field
The present invention relates to the field of data storage technologies, and in particular, to a method and an apparatus for constructing a distributed index.
Background
Currently, in many scenarios, users need to store spatiotemporal data. The spatio-temporal data refers to: data with temporal and spatial dimensions. For example, each moving track of the target object in a certain geographic area is a time space data.
With the increasing amount of spatiotemporal data to be stored, the efficiency of data query and data storage is low due to the effective single-machine performance of the traditional database, so that users often choose to store a large amount of spatiotemporal data by using a distributed index.
In the related art, the construction method of the distributed index is as follows: dividing a map range utilized when storing space-time data according to a manually specified level value and division precision corresponding to different levels indicated by the level value to obtain a plurality of storage grids of the map range under each level; and determining the number numbers of the storage grids by utilizing a preset coding mode aiming at a plurality of storage grids of the map range under each level to obtain the number numbers of the storage grids so as to complete the construction of the distributed index. Wherein each storage grid is used for storing spatiotemporal data located within a spatial range characterized by the storage grid.
However, in the above-mentioned related art, since the manually specified hierarchical values and the division precision corresponding to different hierarchies are determined according to user experience, the distribution of the spatio-temporal data to be stored in the map range to be used may be ignored, resulting in that more invalid data may be queried when the data query is performed using the established distributed index, and further, the query performance of the established distributed index may be poor.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for constructing a distributed index and electronic equipment, so that the distribution condition of data to be stored in a map range is considered in the process of constructing the distributed index for storing space-time data in the map range, the quantity of invalid data obtained by query when the index is used for data query in the established distribution is reduced, and the query performance of the established distributed index is improved.
The specific technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides a method for constructing a distributed index, where the method includes:
Determining a number of respective alternative levels for the map range when storing the spatiotemporal data;
Constructing at least one division precision combination corresponding to each alternative level number by utilizing each division precision related to the map range; the number of the division precision in each division precision combination is the same as the number of the alternative levels, and the division precision in different division precision combinations is not completely the same;
Selecting a target division precision combination meeting preset conditions from the constructed multiple division precision combinations; wherein the predetermined condition is: after each sample spatiotemporal data is stored in the map range according to the division precision combination, the quantity of the sample spatiotemporal data stored in the designated range of the map range is minimum;
And constructing a distributed index for storing space-time data for the map range according to the target division precision combination and the corresponding target alternative level number.
Optionally, in a specific implementation manner, the step of selecting the target division precision combination meeting the predetermined condition from the constructed multiple division precision combinations includes:
For each constructed division precision combination, storing each sample space-time data into the map range according to the division precision combination, and determining the number of the sample space-time data stored into the designated range of the map range after storing, wherein the number corresponds to the division precision combination;
And obtaining the partition precision combination corresponding to the minimum number as a target partition precision combination.
Optionally, in a specific implementation manner, the step of selecting the target division precision combination meeting the predetermined condition from the constructed multiple division precision combinations includes:
Selecting one division precision combination from the constructed multiple division precision combinations as a reference combination;
for the reference combination, storing each sample space-time data into the map range according to the reference combination, determining the quantity of the sample space-time data stored into the designated range of the map range after storing, and recording the determined quantity;
For each of the division precision combinations other than the reference combination, storing the respective sample spatiotemporal data into the map range in accordance with the division precision combination, and after storing, determining the number of sample spatiotemporal data stored into a specified range of the map range; when the determined number is smaller than the current recorded number, replacing the current recorded number with the determined number; otherwise, discarding the determined number;
After traversing all the constructed division precision combinations, determining the division precision combination corresponding to the number recorded currently as a target division precision combination.
Optionally, in a specific implementation manner, the step of constructing a distributed index for storing spatio-temporal data for the map range according to the target division precision combination and the corresponding target alternative level number includes:
Determining the corresponding division precision of each level in the target alternative level number according to the target division precision combination and the corresponding target alternative level number;
dividing the map range according to the dividing precision of each level to obtain each storage grid of the map range under the level;
and determining a digital code for each divided storage grid according to a preset coding mode to obtain a distributed index for storing space-time data for the map range.
Optionally, in a specific implementation manner, the step of constructing, for each candidate level, at least one partition precision combination corresponding to the candidate level by using each partition precision related to the map range includes:
For each candidate level number, all division precision combinations corresponding to the candidate level number are constructed by utilizing the division precision related to the map range.
Optionally, in a specific implementation manner, the map range is: normalizing the real map range; sample spatiotemporal data are: and carrying out space-time data normalized on the space dimension.
In a second aspect, an embodiment of the present invention provides a data storage method of a distributed index constructed based on any one of the methods provided in the first aspect, where the method includes:
Acquiring space-time data to be stored;
determining a storage grid to which the space-time data to be stored belongs according to the space dimension of the space-time data to be stored; the storage grid is obtained by dividing a preset map range according to the target division precision combination and the corresponding target alternative level number;
storing the space-time data to be stored into the storage grid, and taking the digital code of the storage grid as a distributed index of the space-time data to be stored.
In a third aspect, an embodiment of the present invention provides an apparatus for constructing a distributed index, where the apparatus includes:
the quantity acquisition module is used for determining the quantity of each alternative level aiming at the map range when the space-time data is stored;
The combination determining module is used for constructing at least one division precision combination corresponding to each alternative level number by utilizing each division precision related to the map range; the number of the division precision in each division precision combination is the same as the number of the alternative levels, and the division precision in different division precision combinations is not completely the same;
The combination selection module is used for selecting a target division precision combination meeting preset conditions from the constructed multiple division precision combinations; wherein the predetermined condition is: after each sample spatiotemporal data is stored in the map range according to the division precision combination, the quantity of the sample spatiotemporal data stored in the designated range of the map range is minimum;
and the index construction module is used for constructing a distributed index for storing the space-time data for the map range according to the target division precision combination and the corresponding target alternative level number.
Optionally, in a specific implementation manner, the combination selecting module is specifically configured to:
For each constructed division precision combination, storing each sample space-time data into the map range according to the division precision combination, and determining the number of the sample space-time data stored into the designated range of the map range after storing, wherein the number corresponds to the division precision combination;
And obtaining the partition precision combination corresponding to the minimum number as a target partition precision combination.
Optionally, in a specific implementation manner, the combination selecting module is specifically configured to:
Selecting one division precision combination from the constructed multiple division precision combinations as a reference combination;
for the reference combination, storing each sample space-time data into the map range according to the reference combination, determining the quantity of the sample space-time data stored into the designated range of the map range after storing, and recording the determined quantity;
For each of the division precision combinations other than the reference combination, storing the respective sample spatiotemporal data into the map range in accordance with the division precision combination, and after storing, determining the number of sample spatiotemporal data stored into a specified range of the map range; when the determined number is smaller than the current recorded number, replacing the current recorded number with the determined number; otherwise, discarding the determined number;
After traversing all the constructed division precision combinations, determining the division precision combination corresponding to the number recorded currently as a target division precision combination.
Optionally, in a specific implementation manner, the index building module is specifically configured to:
Determining the corresponding division precision of each level in the target alternative level number according to the target division precision combination and the corresponding target alternative level number;
dividing the map range according to the dividing precision of each level to obtain each storage grid of the map range under the level;
and determining a digital code for each divided storage grid according to a preset coding mode to obtain a distributed index for storing space-time data for the map range.
Optionally, in a specific implementation manner, the combination determining module is specifically configured to:
For each candidate level number, all division precision combinations corresponding to the candidate level number are constructed by utilizing the division precision related to the map range.
Optionally, in a specific implementation manner, the map range is: normalizing the real map range; sample spatiotemporal data are: and carrying out space-time data normalized on the space dimension.
In a fourth aspect, an embodiment of the present invention provides a data storage device of a distributed index constructed based on any of the methods provided in the first aspect, the device comprising:
the data acquisition module is used for acquiring space-time data to be stored;
The grid determining module is used for determining a storage grid to which the space-time data to be stored belongs according to the space dimension of the space-time data to be stored; the storage grid is obtained by dividing a preset map range according to the target division precision combination and the corresponding target alternative level number;
And the data storage module is used for storing the space-time data to be stored into the storage grid and taking the digital code of the storage grid as a distributed index of the space-time data to be stored.
In a fifth aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
A memory for storing a computer program;
and a processor, configured to implement the steps of any one of the methods for constructing a distributed index provided in the first aspect and/or the steps of a data storage method provided in the second aspect when executing a program stored in a memory.
In a sixth aspect, embodiments of the present invention provide a computer readable storage medium having stored therein a computer program which, when executed by a processor, implements the steps of any one of the distributed index building methods provided in the first aspect and/or the steps of a data storage method provided in the second aspect.
In a seventh aspect, embodiments of the present invention provide a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of any one of the distributed index building methods provided in the first aspect and/or the steps of a data storage method provided in the second aspect.
The embodiment of the invention has the beneficial effects that:
In the above, when the scheme provided by the embodiment of the invention is applied to construct a distributed index for storing space-time data for a map range, the number of each alternative hierarchy level for the map range is first determined, and then, for each alternative hierarchy level, at least one division precision combination corresponding to the number of alternative hierarchy levels is constructed by utilizing each division precision about the map range. Then, a target division precision combination meeting a predetermined condition can be selected from the constructed plurality of division precision combinations. The selected target division precision combination is as follows: after each sample spatiotemporal data is stored in the map range according to each division precision combination, the division precision combination with the least number of sample spatiotemporal data stored in the designated range of the map range is used, so that a distributed index for storing the spatiotemporal data for the map range can be constructed according to the target division precision combination and the corresponding target alternative level number.
Based on this, in the solution provided in the embodiment of the present invention, the target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing the spatiotemporal data for the map range are determined based on the distribution situation of the sample data in the map range, and the distribution situation of the sample data in the map range may reflect the distribution situation of the spatiotemporal data to be stored in the map range. That is, the above-described target division precision combination and the corresponding target candidate hierarchy number can be regarded as being determined based on the distribution situation of the spatio-temporal data to be stored in the map range.
In this way, the above-determined target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing the spatio-temporal data for the map range are determined in consideration of the distribution of the data to be stored within the map range, that is, in consideration of the distribution of the data to be stored within the map range in constructing the distributed index for storing the spatio-temporal data for the map range, so that the number of invalid data obtained by querying when querying the index with the established distribution can be reduced, and thus, the query performance of the established distributed index can be improved.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for constructing a distributed index according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a data storage method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a distributed index building apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a data storage device according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the related art, the construction method of the distributed index is as follows: dividing a map range utilized when storing space-time data according to a manually specified level value and division precision corresponding to different levels indicated by the level value to obtain a plurality of storage grids of the map range under each level; and determining the number numbers of the storage grids by utilizing a preset coding mode aiming at a plurality of storage grids of the map range under each level to obtain the number numbers of the storage grids so as to complete the construction of the distributed index. Wherein each storage grid is used for storing spatiotemporal data located within a spatial range characterized by the storage grid.
However, in the above-mentioned related art, since the manually specified hierarchical values and the division precision corresponding to different hierarchies are determined according to user experience, the distribution of the spatio-temporal data to be stored in the map range to be used may be ignored, resulting in that more invalid data may be queried when the data query is performed using the established distributed index, and further, the query performance of the established distributed index may be poor.
In order to solve the technical problems, the embodiment of the invention provides a method for constructing a distributed index.
It should be noted that the construction method may be applicable to any scenario in which time-space data is stored by using a distributed index, for example, storing a moving track of each target in a certain area; the construction method can be applied to any type of electronic equipment, such as a notebook computer, a desktop computer and the like, wherein in order to improve the construction speed, a plurality of electronic equipment in a distributed equipment cluster can be used for construction at the same time. In this regard, the embodiment of the present invention does not specifically limit the application scenario and execution subject of the construction method.
The method for constructing the distributed index provided by the embodiment of the invention can comprise the following steps:
Determining a number of respective alternative levels for the map range when storing the spatiotemporal data;
Constructing at least one division precision combination corresponding to each alternative level number by utilizing each division precision related to the map range; the number of the division precision in each division precision combination is the same as the number of the alternative levels, and the division precision in different division precision combinations is not completely the same;
Selecting a target division precision combination meeting preset conditions from the constructed multiple division precision combinations; wherein the predetermined condition is: after each sample spatiotemporal data is stored in the map range according to the division precision combination, the quantity of the sample spatiotemporal data stored in the designated range of the map range is minimum;
And constructing a distributed index for storing space-time data for the map range according to the target division precision combination and the corresponding target alternative level number.
In the above, when the scheme provided by the embodiment of the invention is applied to construct a distributed index for storing space-time data for a map range, the number of each alternative hierarchy level for the map range is first determined, and then, for each alternative hierarchy level, at least one division precision combination corresponding to the number of alternative hierarchy levels is constructed by utilizing each division precision about the map range. Then, a target division precision combination meeting a predetermined condition can be selected from the constructed plurality of division precision combinations. The selected target division precision combination is as follows: after each sample spatiotemporal data is stored in the map range according to each division precision combination, the division precision combination with the least number of sample spatiotemporal data stored in the designated range of the map range is used, so that a distributed index for storing the spatiotemporal data for the map range can be constructed according to the target division precision combination and the corresponding target alternative level number.
Based on this, in the solution provided in the embodiment of the present invention, the target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing the spatiotemporal data for the map range are determined based on the distribution situation of the sample data in the map range, and the distribution situation of the sample data in the map range may reflect the distribution situation of the spatiotemporal data to be stored in the map range. That is, the above-described target division precision combination and the corresponding target candidate hierarchy number can be regarded as being determined based on the distribution situation of the spatio-temporal data to be stored in the map range.
In this way, the above-determined target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing the spatio-temporal data for the map range are determined in consideration of the distribution of the data to be stored within the map range, that is, in consideration of the distribution of the data to be stored within the map range in constructing the distributed index for storing the spatio-temporal data for the map range, so that the number of invalid data obtained by querying when querying the index with the established distribution can be reduced, and thus, the query performance of the established distributed index can be improved.
The following describes a method for constructing a distributed index according to an embodiment of the present invention with reference to the accompanying drawings.
Fig. 1 is a flow chart of a method for constructing a distributed index according to an embodiment of the present invention. As shown in fig. 1, the construction method may include the steps of:
s101: determining a number of respective alternative levels for the map range when storing the spatiotemporal data;
In constructing a distributed index for storing spatiotemporal data, since spatiotemporal data is data having a spatial dimension, the spatial dimension of spatiotemporal data to be stored is within a certain map range. Furthermore, the map range corresponding to the established distributed index for storing the spatio-temporal data needs to cover the map range in which the spatial dimension of the data to be stored is located.
That is, the map range may be determined based on the spatial dimension of the spatio-temporal data to be stored.
In the method for constructing a distributed index according to the embodiment of the present invention, the sample spatiotemporal data may be used for construction, so that the map range may be determined based on the spatial dimension of the sample spatiotemporal data.
Further, after the above map range is determined, the number of individual candidate levels for the map range when storing spatiotemporal data may be further determined.
Wherein, alternatively, specific values of the number of each alternative hierarchy may be determined directly. For example, the number of each alternative tier may be determined directly to be 1,2, and 3, respectively.
Alternatively, the maximum number of alternative levels may be determined, and in turn, each positive integer not greater than the maximum number of alternative levels may be determined as the above-described number of alternative levels for the map range. For example, the maximum number of candidate levels may be determined to be 3, and then the respective number of candidate levels may be determined to be 1,2, and 3, respectively.
Further, optionally, the specific value of the number of the candidate layers or the maximum number of candidate layers may be input by the user to the electronic device.
Alternatively, a correspondence relationship between the map range and the number of each candidate level may be pre-established, so that after determining the map range, the electronic device may directly determine the number of each candidate level for the map range according to the correspondence relationship.
It should be noted that, optionally, in a specific implementation manner, the map range is a map range that normalizes a real map range; correspondingly, in the specific implementation manner, the sample space-time data is: and carrying out space-time data normalized on the space dimension.
That is, in this specific implementation manner, for convenience of subsequent calculation, the spatial dimensions of the real map range and the sample spatio-temporal data may be normalized, so as to obtain the spatio-temporal data normalized by using the normalized map range and the spatial dimensions, and establish a distributed index for storing the spatio-temporal data for the normalized map range.
S102: constructing at least one division precision combination corresponding to each alternative level number by utilizing each division precision related to the map range according to each alternative level number;
the number of the division precision in each division precision combination is the same as the number of the alternative levels, and the division precision in different division precision combinations is not completely the same;
After determining the respective number of alternative levels for the map range, at least one division precision combination of the number of alternative levels for one may be constructed for each alternative level using the respective division precision for the map range.
The important links for constructing the distributed index for storing the space-time data for the map range are as follows: and dividing the map range utilized when storing the space-time data according to the specific level value and the division precision corresponding to different levels indicated by the level value to obtain a plurality of storage grids of the map range under each level.
Therefore, in the method for constructing a distributed index according to the embodiment of the present invention, it is necessary to determine the number of candidate levels that are ultimately used to divide the map range and the division precision corresponding to different levels indicated by the number of candidate levels. The division precision corresponding to each level refers to: at the hierarchy, the map range is partitioned into at least one grid of a grid size indicated by the partitioning accuracy.
Furthermore, since each level indicated by the number of alternative levels corresponds to a division precision, and the division precision corresponding to different levels is different, the number of division precision in each constructed division precision combination is the same as the number of alternative levels for each number of alternative levels, but the division precision included in different division precision combinations for the same number of alternative levels is not exactly the same. And, the mesh size indicated by the maximum precision in each of the constructed division precision combinations cannot exceed the above map range.
Optionally, a minimum division precision may be determined, and then the minimum division precision is accumulated one by one on the minimum precision, and the division precision obtained after each accumulation is used as a division precision of the map range until the grid size indicated by the division precision obtained after accumulation does not exceed the map range and the difference between the grid size and the map range is the minimum. Thus, the above-described minimized division accuracy and the accumulated division accuracy are the division accuracy with respect to the map range.
Further, for each candidate level number, the division precision of the candidate level number may be selected from the division precision regarding the map range, and thus, the selected division precision is one division precision combination corresponding to the candidate level number.
For example, if the map range is [0,1] and the minimum division accuracy is 0.1, the respective division accuracy regarding the map range can be obtained including: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1.
When the number of the alternative levels is 1, a division precision can be selected from the division precision as a division precision combination corresponding to the number of the alternative levels. Illustratively, a combination of division accuracies is obtained as: 0.5; when the number of the alternative levels is 2, two division precision combinations corresponding to the number of the alternative levels can be selected from the division precision combinations. Illustratively, a combination of division accuracies is obtained as: 0.5 and 0.2.
Optionally, when at least one division precision combination corresponding to the number of each candidate hierarchy is constructed, after each division precision about the map range is determined, for each candidate hierarchy number, the division precision of the number of candidate hierarchies may be selected from each division precision about the map range, so as to obtain one division precision combination corresponding to the number of candidate hierarchies. When a plurality of partition precision combinations corresponding to the number of the alternative levels need to be constructed, the partition precision of the number of the alternative levels selected each time is selected without repetition, namely the partition precision of the number of the alternative levels selected each time is not identical, so that the partition precision in different partition precision combinations corresponding to the number of the obtained alternative levels is not identical.
Optionally, in a specific implementation manner, the step S102 may include the following step 11:
Step 11: for each candidate level number, all division precision combinations corresponding to the candidate level number are constructed by using the division precision of the map range.
In this specific implementation manner, for each alternative level number, all the division precision combinations corresponding to the alternative level number can be constructed by using each division precision about the map range, so that the finally constructed distributed index for storing the spatio-temporal data for the map range can be determined from all the obtained distributed indexes for storing the spatio-temporal data for the map range, so that the finally constructed distributed index can better match the distribution condition of the spatio-temporal data to be stored in the map range, and the query performance of the constructed distributed index is greatly improved.
In one embodiment of this specific implementation manner, for each number of alternative levels, constructing all division precision combinations corresponding to the number of alternative levels may be: the number of combinations of the division accuracies, which are different from each other, of the number of the alternative hierarchical levels is determined for each division accuracy of the map range, so that the number of combinations is extracted for a plurality of times when the division accuracies of the division accuracy combinations extracted each time are not identical, and the division accuracy combination obtained by each extraction is one division accuracy combination corresponding to the number of the alternative hierarchical levels, and further, after the number of combinations is extracted for a plurality of times, all the division accuracy combinations corresponding to the number of the alternative hierarchical levels can be constructed.
For example, the map range is [0,1], and each division precision concerning the map range includes: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1. When the number of the alternative levels is 2, the number of combinations of 2 different division precision is 45, and when the division precision of each extraction is not identical to the division precision combination, the extraction is performed 45 times in total, so that all the division precision combinations corresponding to the number of the alternative levels 2 can be obtained.
In addition, in another embodiment of the present specific implementation, for each candidate level number, constructing all the division precision combinations corresponding to the candidate level number may be: determining the division precision under each level except the first level in each level indicated by the number of alternative levels as a fixed value, and further successively changing the division precision under the first level until all possible division precision under the first level is traversed under the condition that the division precision under each level is a fixed value, thereby obtaining a plurality of division precision combinations corresponding to the number of alternative levels; then, changing the division precision under at least one of the other layers, keeping the division precision under the changed other layers unchanged, and successively changing the division precision under the first layer again until all possible division precision under the first layer is obtained when the division precision under the other layers is traversed to be under a changed fixed value, and obtaining a plurality of division precision combinations corresponding to the number of the alternative layers again; and the same is repeated until all the division precision combinations corresponding to the number of the alternative levels are obtained through traversal.
For example, the map range is [0,1], and each division precision concerning the map range includes: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1. Setting the division precision at the first level indicated by the number of alternative levels 2 to be 0.1 and keeping unchanged when the number of alternative levels is 2, and successively changing the division precision at the second level indicated by the number of alternative levels 2 from 0.2 in order from small to large until the division precision at the second level indicated by the number of alternative levels 2 is 1; thus, 9 division precision combinations corresponding to the number of alternative levels 2 can be obtained.
Then, changing the division precision under the first level indicated by the number of alternative levels 2 to 0.2 and keeping unchanged, and sequentially changing the division precision under the second level indicated by the number of alternative levels 2 from 0.3 in order from small to large until the division precision under the second level indicated by the number of alternative levels 2 is 1; thus, 8 division precision combinations corresponding to the number of alternative levels 2 can be obtained.
Then, changing the division precision under the first level indicated by the number of alternative levels 2 to 0.3 and keeping unchanged, and sequentially changing the division precision under the second level indicated by the number of alternative levels 2 from 0.4 in order from small to large until the division precision under the second level indicated by the number of alternative levels 2 is 1; thus, 7 division precision combinations corresponding to the number of alternative levels 2 can be obtained.
And the method is analogically performed until the division precision of the first level indicated by the number 2 of the alternative levels is 0.9, the division precision of the second level indicated by the number 2 of the alternative levels is 1, and all 45 division precision combinations corresponding to the number 2 of the alternative levels can be traversed.
For another example, the map range is [0,1], and each of the division accuracies concerning the map range includes: 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1. Setting the division precision at the first level indicated by the number of alternative levels 3 to be 0.1 and keeping unchanged, and setting the division precision at the second level indicated by the number of alternative levels 3 to be 0.2 and keeping unchanged when the number of alternative levels is 3, and sequentially changing the division precision at the third level indicated by the number of alternative levels 3 from 0.3 until the division precision at the third level indicated by the number of alternative levels 3 is 1 according to the order from small to large; thus, 8 division precision combinations corresponding to the number 3 of the alternative layers can be obtained.
Then, the division precision under the first level indicated by the number of alternative levels 3 is set to 0.1 and kept unchanged, and the division precision under the second level indicated by the number of alternative levels 3 is changed to 0.3 and kept unchanged, then the division precision under the third level indicated by the number of alternative levels 3 is successively changed from 0.4 until the division precision under the third level indicated by the number of alternative levels 3 is 1 in order from small to large; thus, 7 division precision combinations corresponding to the number 3 of the alternative levels can be obtained.
And so on until the division precision of the first level indicated by the number 3 of the alternative levels is 0.1, the division precision of the second level indicated by the number 3 of the alternative levels is 0.9, and the division precision combination of the third level indicated by the number 3 of the alternative levels is 1, so that 36 division precision combinations are obtained under the condition that the division precision of the first level indicated by the number 3 of the alternative levels is 0.1 and kept unchanged;
Then, changing the division precision under the first level indicated by the number of alternative levels 3 to 0.2 and keeping unchanged, setting the division precision under the second level indicated by the number of alternative levels 3 to 0.3 again and keeping unchanged, and successively changing the division precision under the third level indicated by the number of alternative levels 3 from 0.4 until the division precision under the third level indicated by the number of alternative levels 3 is 1; thus, 7 division precision combinations corresponding to the number 3 of the alternative layers can be obtained;
Then, the division precision under the first level indicated by the number of alternative levels 3 is set to 0.2 and kept unchanged, and the division precision under the second level indicated by the number of alternative levels 3 is changed to 0.4 and kept unchanged, then the division precision under the third level indicated by the number of alternative levels 3 is successively changed from 0.5 until the division precision under the third level indicated by the number of alternative levels 3 is 1 in order from small to large; thus, 6 division precision combinations corresponding to the number 3 of the alternative layers can be obtained;
and so on until the division precision of the first level indicated by the number 3 of the alternative levels is 0.2, the division precision of the second level indicated by the number 3 of the alternative levels is 0.9, the division precision of the third level indicated by the number 3 of the alternative levels is a division precision combination of 1, and 28 division precision combinations are obtained under the condition that the division precision of the first level indicated by the number 3 of the alternative levels is 0.2 and kept unchanged;
Further, the above-mentioned changing process is circularly executed until the division precision under the first level indicated by the number of the alternative levels 3 is 0.8, the division precision under the second level indicated by the number of the alternative levels 3 is 0.9, and the division precision under the third level indicated by the number of the alternative levels 3 is 1, so that all 120 division precision combinations corresponding to the number of the alternative levels 3 can be traversed.
S103: selecting a target division precision combination meeting preset conditions from the constructed multiple division precision combinations;
Wherein, the predetermined condition is: after each sample spatiotemporal data is stored in the map range according to the division precision combination, the number of the sample spatiotemporal data stored in the designated range of the map range is minimum;
after at least one division precision combination corresponding to the number of each alternative hierarchy is constructed, a plurality of constructed division precision combinations can be obtained, and then a target division precision combination meeting a preset condition can be selected from the plurality of constructed division precision combinations.
Wherein, since the predetermined condition is that the number of sample spatiotemporal data stored in the specified range of the map range is minimum after each sample spatiotemporal data is stored in the map range in accordance with the division precision combination, the number of sample time data stored in the specified range of the map range is minimum when each sample time data is stored in the map range.
In this way, when the above-mentioned target division precision combination and the corresponding target alternative level number are utilized to construct a distributed index for storing space-time data for a map range, then, when the data query is performed by utilizing the established distributed index, the number of invalid data obtained by the query can be reduced to a greater extent, and further, the query performance of the established distributed index is improved.
The storage of the sample time data in the map range means that: for each division precision combination, the division precision of each level indicated by the number of alternative levels corresponding to the division precision combination can be determined, wherein the division precision of each level is one division precision in the division precision combination, and the division precision of different levels is different. Thus, the map range can be divided according to the division precision of each level, and a plurality of storage grids of the map range under the level can be obtained. Further, for each sample spatiotemporal data, a minimum square grid in which the sample spatiotemporal data is located and a position of the minimum square grid within a map range are determined according to a spatial dimension of the sample spatiotemporal data, and further, a minimum storage grid capable of covering the minimum square range is determined from a plurality of storage grids under the obtained plurality of levels. Thus, the sample time-space data is stored in the determined minimum storage grid, and the action of storing the sample time-space data in the map range can be completed.
For example, the map range is 2, and one division precision combination includes: 0.1 and 0.2, the number of alternative levels corresponding to the division precision combination is 2; then it may be determined that the division precision of the first hierarchy indicated by the number of alternative hierarchies 2 is 0.1 and the division precision of the second hierarchy indicated by the number of alternative hierarchies 2 is 0.2; furthermore, according to the division precision of the first level of 0.1, the map range can be divided into storage grids with the size of 0.1 x 0.1, so that 100 storage grids of the map range under the first level can be obtained; the map range can be divided into storage grids with the size of 0.2 x 0.2 according to the division precision of 0.2 of the second level, so that 25 storage grids of the map range under the second level can be obtained.
The size of the smallest square grid where one sample of space-time data is located is 0.1 x 0.1, and the position of the smallest square grid in the map range is: the top left vertex of the minimum square grid coincides with a point (0.3 ) in the map range, and the bottom right vertex coincides with a point (0.4 ) in the map range, so that among 100 storage grids with the size of 0.1 x 0.1 and 25 storage grids with the size of 0.2 x 0.2, the storage grids capable of covering the minimum square grid are: the upper left vertex at the first level is (0.3 ), the lower right vertex is (0.4 ) and the size is 0.1 x 0.1, and then the sample space-time data can be stored in the determined storage grid.
Accordingly, the number of sample spatiotemporal data stored in the specified range of the map range mentioned above means: after each sample space-time data is stored in each storage grid obtained by dividing, the storage grids included in the designated range of the map range under each level are determined, then, for each level, the number of sample space-time data included in each storage grid included in the designated range under the level is determined, and then, the number of sample space-time data stored in the designated range of the map range can be obtained by adding the determined numbers under each level.
The storage grid included in the specified range of the map range under each level includes: a storage grid completely located within the specified range and a storage grid partially located within the specified range at each level; that is, the storage grid included in the specified range of the map range at each level includes: at each level, there is an intersection of each storage grid with the specified range.
Optionally, in a specific implementation manner, the step S103 may include the following steps 21 to 22:
Step 21: for each constructed division precision combination, storing each sample space-time data into a map range according to the division precision combination, and determining the number of the sample space-time data stored into a specified range of the map range after storing, wherein the number is corresponding to the division precision combination;
Step 22: and obtaining the partition precision combination corresponding to the minimum number as a target partition precision combination.
In this specific implementation manner, for each constructed division precision combination, each sample space-time data may be stored in the map range according to the division precision combination, and after the storing, the number of sample space-time data stored in the specified range of the map range may be determined as the number corresponding to the division precision combination. Thus, the number corresponding to each division precision combination in all the division precision combinations can be obtained, namely, a plurality of numbers can be obtained.
Further, the minimum number can be determined from the number corresponding to each of the obtained division precision combinations, and further, the division precision combination corresponding to the minimum number can be obtained as the target division precision combination.
Alternatively, in another specific implementation manner, the step S103 may include the following steps 31-34:
Step 31: selecting one division precision combination from the constructed multiple division precision combinations as a reference combination;
Step 32: for the reference combination, storing each sample space-time data into a map range according to the reference combination, determining the number of the sample space-time data stored into a specified range of the map range after storing, and recording the determined number;
step 33: for each of the division precision combinations other than the reference combination, storing the respective sample spatiotemporal data into the map range in accordance with the division precision combination, and determining the number of the sample spatiotemporal data stored into the specified range of the map range after the storing; when the determined number is smaller than the current recorded number, replacing the current recorded number with the determined number; otherwise, discarding the determined number;
Step 34: after traversing all the constructed division precision combinations, determining the division precision combination corresponding to the number recorded currently as a target division precision combination.
In the specific implementation manner, after a plurality of division precision combinations are constructed, one division precision combination is selected as a reference combination; further, for the reference combination, each sample spatiotemporal data may be stored in the map range according to the reference combination, and after the storage, the number of sample spatiotemporal data stored in the specified range of the map range may be determined, and the determined number may be recorded.
Further, one division precision combination other than the reference combination is acquired, each sample spatiotemporal data is stored in the map range according to the division precision combination, and after the storage, the number of sample spatiotemporal data stored in the specified range of the map range is determined. In this way, the size of the determined number and the number of current recordings determined for the reference combination can be calculated. Thus, when the determined number is smaller than the number of current records determined for the reference combination, the determined number may be taken as the number of current records, and thus the number of records determined for the reference combination is deleted. Accordingly, when the determined number is not less than the current recorded number determined for the reference combination, the current recorded number determined for the reference combination may be retained, and the determined number may be discarded.
And then, continuously selecting one partition precision combination from unused partition precision combinations, storing each sample space-time data into a map range according to the partition precision combination, and determining the number of the sample space-time data stored into a specified range of the map range after storing. In this way, the size of the determined number and the number of current records can be calculated. Thus, when the determined number is smaller than the current recorded number, the current recorded number may be replaced with the determined number. Accordingly, when the determined number is not less than the number of current records determined for the reference combination, the number of current records may be reserved and discarded.
Based on this, that is, for each of the division precision combinations other than the reference combination, each sample spatiotemporal data is stored in the map range in accordance with the division precision combination, and after the storage, the number of sample spatiotemporal data stored in the specified range of the map range is determined; when the determined number is smaller than the current recorded number, replacing the current recorded number with the determined number; otherwise, the determined number is discarded.
Thus, after traversing all the constructed division precision combinations, the number recorded at present is: after each sample space-time data is stored in the map range according to each division precision combination, the minimum number of the sample space-time data stored in the designated range of the map range can be used as the target division precision combination.
Optionally, the first division precision combination constructed may be determined as the reference combination, and then, after each subsequent division precision combination is constructed, the step 23 may be executed immediately, until after the last division precision combination is constructed, the step 23 is executed last time, and after the last execution of the step 23, the step 23 is executed, so as to obtain the target division precision combination.
S104: and constructing a distributed index for storing the space-time data for the map range according to the target division precision combination and the corresponding target alternative level number.
After the target division precision combination is obtained, since the number of division precision included in the division precision combination is the same as the number of corresponding alternative levels, the corresponding number of target alternative levels can be determined according to the target division precision combination.
Further, the division precision of each level indicated by the number of target candidate levels may be determined among the division precision included in the target division precision combination, and the map range may be divided to obtain a plurality of storage grids of the map range at each level indicated by the number of target candidate levels.
After the plurality of storage grids are obtained, the digital codes of the storage grids can be determined according to the position information such as the longitude and latitude of the map corresponding to the grid range for each storage grid by utilizing a preset coding mode, so that the distributed index of the storage grid is obtained.
Based on this, after the digital code of each of the plurality of storage grids is obtained, the constructed distributed index for storing the spatio-temporal data for the map range can be obtained.
As can be seen from the above, the above-determined target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing spatiotemporal data for the map range are determined in consideration of the distribution of the data to be stored in the map range, that is, in consideration of the distribution of the data to be stored in the map range in the process of constructing the distributed index for storing spatiotemporal data for the map range, so that the number of invalid data obtained by querying when querying the index with the established distribution can be reduced, and the query performance of the established distributed index can be improved.
Corresponding to the method for constructing the distributed index provided by the embodiment of the invention, the embodiment of the invention also provides a data storage method based on the constructed distributed index, and the distributed index is constructed by using any one of the method for constructing the distributed index provided by the embodiment of the invention. For convenience of description, hereinafter, a data storage method is abbreviated.
That is, the data storage method provided by the embodiment of the present invention is implemented on the basis of the method for constructing a distributed index provided by the embodiment of the present invention. Based on this, after the above-mentioned method for constructing a distributed index provided by the embodiment of the present invention is used to construct a distributed index for storing spatiotemporal data for a map range, the constructed distributed index may be used to execute the method for storing spatiotemporal data to be stored provided by the embodiment of the present invention to store the spatiotemporal data to be stored in the map range, that is, the method for constructing a distributed index provided by the embodiment of the present invention is used to construct a distributed index for storing spatiotemporal data for a map range to store the spatiotemporal data, and further determine the distributed index of the stored temporal data to be stored.
Furthermore, the storage method can be suitable for any scene for storing time data by using a distributed index, for example, the moving track of each target in a certain area is stored; the storage method can be applied to any type of electronic equipment, such as a notebook computer, a desktop computer and the like, wherein in order to improve the construction speed, a plurality of electronic equipment in a distributed equipment cluster can be used for data storage at the same time. In this regard, the embodiment of the present invention does not specifically limit the application scenario and execution subject of the storage method.
The data storage method provided by the embodiment of the invention can comprise the following steps:
Acquiring space-time data to be stored;
determining a storage grid to which the space-time data to be stored belongs according to the space dimension of the space-time data to be stored; the storage grid is obtained by dividing a preset map range according to the target division precision combination and the corresponding target alternative level number;
storing the space-time data to be stored into the storage grid, and taking the digital code of the storage grid as a distributed index of the space-time data to be stored.
As can be seen from the above, the above-determined target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing the spatio-temporal data for the map range are determined in consideration of the distribution of the data to be stored in the map range, that is, in consideration of the distribution of the data to be stored in the map range in the process of constructing the distributed index for storing the spatio-temporal data for the map range, so that, after the data to be stored is stored by using the above-described distributed index, when the data query is performed, the number of invalid data obtained by the query when the data query is performed by using the established distribution time index can be reduced, and further, the query performance of the established distributed index is improved.
The following describes a data storage method according to an embodiment of the present invention in detail with reference to the accompanying drawings.
Fig. 2 is a flow chart of a data storage method according to an embodiment of the present invention. As shown in fig. 2, the data storage method may include the steps of:
S201: acquiring space-time data to be stored;
s202: determining a storage grid to which the space-time data to be stored belongs according to the space dimension of the space-time data to be stored;
the storage grid is obtained by dividing a preset map range according to the target division precision combination and the corresponding target alternative level number;
S203: storing the space-time data to be stored into a storage grid, and taking the digital code of the storage grid as a distributed index of the space-time data to be stored.
Firstly, acquiring space-time data to be stored, and further, determining the minimum square grid where the space-time data to be stored is located and the position of the minimum square grid in a map range according to the space dimension of the space-time data to be stored.
In this way, since the preset map range is divided according to the target division precision combination and the corresponding target alternative level number to obtain a plurality of storage grids, the storage grid to which the space-time data to be stored belongs can be determined from the obtained plurality of storage grids according to the determined minimum square grid to which the space-time data to be stored belongs and the position of the minimum square grid within the map range, namely, the minimum storage grid capable of covering the minimum square grid to which the space-time data to be stored exists is determined from the obtained plurality of storage grids.
Furthermore, the space-time data to be stored can be stored in the determined storage grid to which the space-time data to be stored belongs, namely, the space-time data to be stored is stored. Further, the digital code of the storage grid can be used as a distributed index of the space-time data to be stored.
As can be seen from the above, the above-determined target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing the spatio-temporal data for the map range are determined in consideration of the distribution of the data to be stored in the map range, that is, in consideration of the distribution of the data to be stored in the map range in the process of constructing the distributed index for storing the spatio-temporal data for the map range, so that, after the data to be stored is stored by using the above-described distributed index, when the data query is performed, the number of invalid data obtained by the query when the data query is performed by using the established distribution time index can be reduced, and further, the query performance of the established distributed index is improved.
Compared with the method for constructing the distributed index provided by the embodiment of the invention, the method for constructing the distributed index is provided. The embodiment of the invention also provides a device for constructing the distributed index.
Fig. 3 is a schematic structural diagram of a device for constructing a distributed index according to an embodiment of the present invention, where, as shown in fig. 3, the device may include the following modules:
A number acquisition module 310 for determining the number of individual alternative levels for the map range when storing the spatiotemporal data;
A combination determining module 320, configured to construct, for each candidate level number, at least one division precision combination corresponding to the candidate level number, using respective division precision regarding the map range; the number of the division precision in each division precision combination is the same as the number of the alternative levels, and the division precision in different division precision combinations is not completely the same;
A combination selecting module 330, configured to select a target division precision combination that meets a predetermined condition from the constructed multiple division precision combinations; wherein the predetermined condition is: after each sample spatiotemporal data is stored in the map range according to the division precision combination, the quantity of the sample spatiotemporal data stored in the designated range of the map range is minimum;
The index construction module 340 is configured to construct a distributed index for storing spatiotemporal data for the map range according to the target division precision combination and the corresponding target candidate hierarchy number.
In the above, when the scheme provided by the embodiment of the invention is applied to construct a distributed index for storing space-time data for a map range, the number of each alternative hierarchy level for the map range is first determined, and then, for each alternative hierarchy level, at least one division precision combination corresponding to the number of alternative hierarchy levels is constructed by utilizing each division precision about the map range. Then, a target division precision combination meeting a predetermined condition can be selected from the constructed plurality of division precision combinations. The selected target division precision combination is as follows: after each sample spatiotemporal data is stored in the map range according to each division precision combination, the division precision combination with the least number of sample spatiotemporal data stored in the designated range of the map range is used, so that a distributed index for storing the spatiotemporal data for the map range can be constructed according to the target division precision combination and the corresponding target alternative level number.
Based on this, in the solution provided in the embodiment of the present invention, the target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing the spatiotemporal data for the map range are determined based on the distribution situation of the sample data in the map range, and the distribution situation of the sample data in the map range may reflect the distribution situation of the spatiotemporal data to be stored in the map range. That is, the above-described target division precision combination and the corresponding target candidate hierarchy number can be regarded as being determined based on the distribution situation of the spatio-temporal data to be stored in the map range.
In this way, the above-determined target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing the spatio-temporal data for the map range are determined in consideration of the distribution of the data to be stored within the map range, that is, in consideration of the distribution of the data to be stored within the map range in constructing the distributed index for storing the spatio-temporal data for the map range, so that the number of invalid data obtained by querying when querying the index with the established distribution can be reduced, and thus, the query performance of the established distributed index can be improved.
Optionally, in a specific implementation manner, the combination selecting module 330 is specifically configured to:
For each constructed division precision combination, storing each sample space-time data into the map range according to the division precision combination, and determining the number of the sample space-time data stored into the designated range of the map range after storing, wherein the number corresponds to the division precision combination;
And obtaining the partition precision combination corresponding to the minimum number as a target partition precision combination.
Optionally, in a specific implementation manner, the combination selecting module 330 is specifically configured to:
Selecting one division precision combination from the constructed multiple division precision combinations as a reference combination;
for the reference combination, storing each sample space-time data into the map range according to the reference combination, determining the quantity of the sample space-time data stored into the designated range of the map range after storing, and recording the determined quantity;
For each of the division precision combinations other than the reference combination, storing the respective sample spatiotemporal data into the map range in accordance with the division precision combination, and after storing, determining the number of sample spatiotemporal data stored into a specified range of the map range; when the determined number is smaller than the current recorded number, replacing the current recorded number with the determined number; otherwise, discarding the determined number;
After traversing all the constructed division precision combinations, determining the division precision combination corresponding to the number recorded currently as a target division precision combination.
Optionally, in a specific implementation manner, the index building module 340 is specifically configured to:
Determining the corresponding division precision of each level in the target alternative level number according to the target division precision combination and the corresponding target alternative level number;
dividing the map range according to the dividing precision of each level to obtain each storage grid of the map range under the level;
and determining a digital code for each divided storage grid according to a preset coding mode to obtain a distributed index for storing space-time data for the map range.
Optionally, in a specific implementation manner, the combination determining module 320 is specifically configured to:
For each candidate level number, all division precision combinations corresponding to the candidate level number are constructed by utilizing the division precision related to the map range.
Optionally, in a specific implementation manner, the map range is: normalizing the real map range; sample spatiotemporal data are: and carrying out space-time data normalized on the space dimension.
Compared with the data storage method provided by the embodiment of the invention, the embodiment of the invention also provides a data storage device. The data storage device is a distributed index constructed based on any one of the distributed index construction methods provided by the embodiment of the invention.
Fig. 4 is a schematic structural diagram of a data storage device according to an embodiment of the present invention, and as shown in fig. 4, the device may include the following modules:
a data acquisition module 410, configured to acquire spatiotemporal data to be stored;
the grid determining module 420 is configured to determine a storage grid to which the spatiotemporal data to be stored belongs according to the spatial dimension of the spatiotemporal data to be stored; the storage grid is obtained by dividing a preset map range according to the target division precision combination and the corresponding target alternative level number;
And the data storage module 430 is configured to store the spatiotemporal data to be stored in the storage grid, and use the digital code of the storage grid as a distributed index of the spatiotemporal data to be stored.
As can be seen from the above, the above-determined target division precision combination and the corresponding target candidate level number for constructing the distributed index for storing the spatio-temporal data for the map range are determined in consideration of the distribution of the data to be stored in the map range, that is, in consideration of the distribution of the data to be stored in the map range in the process of constructing the distributed index for storing the spatio-temporal data for the map range, so that, after the data to be stored is stored by using the above-described distributed index, when the data query is performed, the number of invalid data obtained by the query when the data query is performed by using the established distribution time index can be reduced, and further, the query performance of the established distributed index is improved.
The embodiment of the invention also provides an electronic device, as shown in fig. 5, which comprises a processor 501, a communication interface 502, a memory 503 and a communication bus 504, wherein the processor 501, the communication interface 502 and the memory 503 complete communication with each other through the communication bus 504,
A memory 503 for storing a computer program;
The processor 501 is configured to implement any one of the above-described method for constructing a distributed index according to the embodiment of the present invention and/or the above-described method for storing data according to the embodiment of the present invention when executing a program stored in the memory 503.
The communication bus mentioned above for the electronic device may be a peripheral component interconnect standard (PERIPHERAL COMPONENT INTERCONNECT, PCI) bus or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, etc. The communication bus may be classified as an address bus, a data bus, a control bus, or the like. For ease of illustration, the figures are shown with only one bold line, but not with only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The Memory may include random access Memory (Random Access Memory, RAM) or may include Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; but may also be a digital signal processor (DIGITAL SIGNAL Processing, DSP), application SPECIFIC INTEGRATED Circuit (ASIC), field-Programmable gate array (Field-Programmable GATE ARRAY, FPGA) or other Programmable logic device, discrete gate or transistor logic device, discrete hardware components.
In still another embodiment of the present invention, a computer readable storage medium is provided, where a computer program is stored, where the computer program, when executed by a processor, implements a method for constructing any of the distributed indexes provided in the foregoing embodiment of the present invention and/or a method for storing data provided in the foregoing embodiment of the present invention.
In yet another embodiment of the present invention, a computer program product containing instructions that, when executed on a computer, cause the computer to perform any of the above-described method for constructing a distributed index according to the embodiment of the present invention and/or a method for storing data according to the embodiment of the present invention is provided.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, produces a flow or function in accordance with embodiments of the present invention, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in or transmitted from one computer-readable storage medium to another, for example, by wired (e.g., coaxial cable, optical fiber, digital Subscriber Line (DSL)), or wireless (e.g., infrared, wireless, microwave, etc.). The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk Solid STATE DISK (SSD)), etc.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, each embodiment is described in a related manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for the apparatus embodiments, the electronic device embodiments, the computer-readable storage medium embodiments, and the computer program product embodiments, the description is relatively simple, as relevant to the description of the method embodiments in part, since they are substantially similar to the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.
Claims (11)
1. A method for constructing a distributed index, the method comprising:
Determining a number of respective alternative levels for the map range when storing the spatiotemporal data;
Constructing at least one division precision combination corresponding to each alternative level number by utilizing each division precision related to the map range; the number of the division precision in each division precision combination is the same as the number of the alternative levels, and the division precision in different division precision combinations is not completely the same, wherein the division precision is used for indicating the size of a grid for dividing the map range;
Selecting a target division precision combination meeting preset conditions from the constructed multiple division precision combinations; wherein the predetermined condition is: after each sample spatiotemporal data is stored in the map range according to the division precision combination, the quantity of the sample spatiotemporal data stored in the designated range of the map range is minimum;
And constructing a distributed index for storing space-time data for the map range according to the target division precision combination and the corresponding target alternative level number.
2. The method of claim 1, wherein the step of selecting a target division precision combination satisfying a predetermined condition from the constructed plurality of division precision combinations comprises:
For each constructed division precision combination, storing each sample space-time data into the map range according to the division precision combination, and determining the number of the sample space-time data stored into the designated range of the map range after storing, wherein the number corresponds to the division precision combination;
And obtaining the partition precision combination corresponding to the minimum number as a target partition precision combination.
3. The method of claim 1, wherein the step of selecting a target division precision combination satisfying a predetermined condition from the constructed plurality of division precision combinations comprises:
Selecting one division precision combination from the constructed multiple division precision combinations as a reference combination;
for the reference combination, storing each sample space-time data into the map range according to the reference combination, determining the quantity of the sample space-time data stored into the designated range of the map range after storing, and recording the determined quantity;
For each of the division precision combinations other than the reference combination, storing the respective sample spatiotemporal data into the map range in accordance with the division precision combination, and after storing, determining the number of sample spatiotemporal data stored into a specified range of the map range; when the determined number is smaller than the current recorded number, replacing the current recorded number with the determined number; otherwise, discarding the determined number;
After traversing all the constructed division precision combinations, determining the division precision combination corresponding to the number recorded currently as a target division precision combination.
4. The method of claim 1, wherein the step of constructing a distributed index for the map range for storing spatiotemporal data in accordance with the target division precision combination and the corresponding target alternative hierarchy number comprises:
Determining the corresponding division precision of each level in the target alternative level number according to the target division precision combination and the corresponding target alternative level number;
dividing the map range according to the dividing precision of each level to obtain each storage grid of the map range under the level;
and determining a digital code for each divided storage grid according to a preset coding mode to obtain a distributed index for storing space-time data for the map range.
5. The method according to any one of claims 1-4, wherein for each candidate level, the step of constructing at least one combination of division accuracies corresponding to that candidate level using the respective division accuracies for the map range comprises:
For each candidate level number, all division precision combinations corresponding to the candidate level number are constructed by utilizing the division precision related to the map range.
6. The method of any one of claims 1-4, wherein the map range is: normalizing the real map range; sample spatiotemporal data are: and carrying out space-time data normalized on the space dimension.
7. A method of data storage of a distributed index constructed based on the method of any one of claims 1-6, the method comprising:
Acquiring space-time data to be stored;
determining a storage grid to which the space-time data to be stored belongs according to the space dimension of the space-time data to be stored; the storage grid is obtained by dividing a preset map range according to the target division precision combination and the corresponding target alternative level number;
storing the space-time data to be stored into the storage grid, and taking the digital code of the storage grid as a distributed index of the space-time data to be stored.
8. A distributed index building apparatus, the apparatus comprising:
the quantity acquisition module is used for determining the quantity of each alternative level aiming at the map range when the space-time data is stored;
The combination determining module is used for constructing at least one division precision combination corresponding to each alternative level number by utilizing each division precision related to the map range; the number of the division precision in each division precision combination is the same as the number of the alternative levels, and the division precision in different division precision combinations is not completely the same, wherein the division precision is used for indicating the size of a grid for dividing the map range;
The combination selection module is used for selecting a target division precision combination meeting preset conditions from the constructed multiple division precision combinations; wherein the predetermined condition is: after each sample spatiotemporal data is stored in the map range according to the division precision combination, the quantity of the sample spatiotemporal data stored in the designated range of the map range is minimum;
and the index construction module is used for constructing a distributed index for storing the space-time data for the map range according to the target division precision combination and the corresponding target alternative level number.
9. A data storage device of a distributed index constructed based on the method of any one of claims 1-6, the device comprising:
the data acquisition module is used for acquiring space-time data to be stored;
The grid determining module is used for determining a storage grid to which the space-time data to be stored belongs according to the space dimension of the space-time data to be stored; the storage grid is obtained by dividing a preset map range according to the target division precision combination and the corresponding target alternative level number;
And the data storage module is used for storing the space-time data to be stored into the storage grid and taking the digital code of the storage grid as a distributed index of the space-time data to be stored.
10. The electronic equipment is characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
A memory for storing a computer program;
A processor for carrying out the method steps of any one of claims 1-6 and/or the method steps of claim 7 when executing a program stored on a memory.
11. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-6 and/or the method steps of claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011183760.2A CN112307025B (en) | 2020-10-29 | 2020-10-29 | Distributed index construction method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011183760.2A CN112307025B (en) | 2020-10-29 | 2020-10-29 | Distributed index construction method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112307025A CN112307025A (en) | 2021-02-02 |
CN112307025B true CN112307025B (en) | 2024-06-04 |
Family
ID=74332184
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011183760.2A Active CN112307025B (en) | 2020-10-29 | 2020-10-29 | Distributed index construction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112307025B (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11174954A (en) * | 1997-12-17 | 1999-07-02 | Mitsubishi Electric Corp | Map data management method, route search device, and storage medium |
WO2003098578A1 (en) * | 2002-05-17 | 2003-11-27 | Xanavi Informatics Corporation | Map data product, map data processing program product, map data processing method, and map data processing device |
WO2004008073A1 (en) * | 2002-07-17 | 2004-01-22 | Xanavi Informatics Corporation | Navigation method, processing method for navigation system, map data management device, map data management program, and computer program |
CN101370025A (en) * | 2007-08-17 | 2009-02-18 | 北京灵图软件技术有限公司 | Storing method, scheduling method and management system for geographic information data |
WO2009042271A1 (en) * | 2007-09-28 | 2009-04-02 | Hewlett-Packard Development Company, L.P. | Method and system for visualizing distributed systems |
JP2012058836A (en) * | 2010-09-06 | 2012-03-22 | Yahoo Japan Corp | Distributed processing system and distributed processing method |
JP2013130909A (en) * | 2011-12-20 | 2013-07-04 | Yahoo Japan Corp | Information processor and information processing method |
CN103365948A (en) * | 2012-03-30 | 2013-10-23 | 富士通株式会社 | Data management apparatus and data management method |
WO2014057524A1 (en) * | 2012-10-09 | 2014-04-17 | 三菱電機株式会社 | Map data storage device and map display device |
CN104182453A (en) * | 2014-06-20 | 2014-12-03 | 银江股份有限公司 | Distributed map matching method for massive historical floating car data |
CN109299060A (en) * | 2018-11-22 | 2019-02-01 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Distributed file method and system based on image rectangular block |
CN110399446A (en) * | 2019-07-26 | 2019-11-01 | 广州市城市规划勘测设计研究院 | Visualization method, device, equipment and storage medium for large-scale spatio-temporal data |
CN110597935A (en) * | 2019-08-05 | 2019-12-20 | 北京云和时空科技有限公司 | A method and device for spatial analysis |
CN111190987A (en) * | 2019-12-31 | 2020-05-22 | 武汉中海庭数据技术有限公司 | Map data distributed storage system based on administrative division |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8982130B2 (en) * | 2011-07-15 | 2015-03-17 | Green Charge Networks | Cluster mapping to highlight areas of electrical congestion |
US20190188305A1 (en) * | 2017-12-15 | 2019-06-20 | Conduce, Inc. | Spatial and temporal data storage and retrieval |
-
2020
- 2020-10-29 CN CN202011183760.2A patent/CN112307025B/en active Active
Patent Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH11174954A (en) * | 1997-12-17 | 1999-07-02 | Mitsubishi Electric Corp | Map data management method, route search device, and storage medium |
WO2003098578A1 (en) * | 2002-05-17 | 2003-11-27 | Xanavi Informatics Corporation | Map data product, map data processing program product, map data processing method, and map data processing device |
WO2004008073A1 (en) * | 2002-07-17 | 2004-01-22 | Xanavi Informatics Corporation | Navigation method, processing method for navigation system, map data management device, map data management program, and computer program |
CN101370025A (en) * | 2007-08-17 | 2009-02-18 | 北京灵图软件技术有限公司 | Storing method, scheduling method and management system for geographic information data |
WO2009042271A1 (en) * | 2007-09-28 | 2009-04-02 | Hewlett-Packard Development Company, L.P. | Method and system for visualizing distributed systems |
JP2012058836A (en) * | 2010-09-06 | 2012-03-22 | Yahoo Japan Corp | Distributed processing system and distributed processing method |
JP2013130909A (en) * | 2011-12-20 | 2013-07-04 | Yahoo Japan Corp | Information processor and information processing method |
CN103365948A (en) * | 2012-03-30 | 2013-10-23 | 富士通株式会社 | Data management apparatus and data management method |
WO2014057524A1 (en) * | 2012-10-09 | 2014-04-17 | 三菱電機株式会社 | Map data storage device and map display device |
CN104182453A (en) * | 2014-06-20 | 2014-12-03 | 银江股份有限公司 | Distributed map matching method for massive historical floating car data |
CN109299060A (en) * | 2018-11-22 | 2019-02-01 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Distributed file method and system based on image rectangular block |
CN110399446A (en) * | 2019-07-26 | 2019-11-01 | 广州市城市规划勘测设计研究院 | Visualization method, device, equipment and storage medium for large-scale spatio-temporal data |
CN110597935A (en) * | 2019-08-05 | 2019-12-20 | 北京云和时空科技有限公司 | A method and device for spatial analysis |
CN111190987A (en) * | 2019-12-31 | 2020-05-22 | 武汉中海庭数据技术有限公司 | Map data distributed storage system based on administrative division |
Non-Patent Citations (3)
Title |
---|
Efficient map/reduce-based dbscan algorithm with optimized data partition;Dai B R, et al;2012 IEEE Fifth international conference on cloud computing;第59-66页 * |
土地利用时空数据管理与分析关键技术研究;郜允兵;中国农业大学;第1-146页 * |
基于多级格网与STR树的混合索引研究;黄志;浙江大学;第1-69页 * |
Also Published As
Publication number | Publication date |
---|---|
CN112307025A (en) | 2021-02-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106407207B (en) | Real-time newly-added data updating method and device | |
CN109522382B (en) | Spatial data gridding statistical method and device | |
US20200057782A1 (en) | Optimized navigable key-value store | |
CN110168532B (en) | Data update method and storage device | |
US11429581B2 (en) | Spatial-temporal query for cognitive IoT contexts | |
CN111552694A (en) | Self-adaptive geographic space grid indexing method | |
CN111858607B (en) | Data processing method, device, electronic equipment and computer readable medium | |
CN113722415A (en) | Point cloud data processing method and device, electronic equipment and storage medium | |
CN109063194A (en) | Data retrieval method and device based on space encoding | |
CN117992562B (en) | Data processing method, data query method, computing device, storage medium and program product | |
CN112307025B (en) | Distributed index construction method and device | |
CN112100175A (en) | Partition data directional transmission method and device | |
CN116466885A (en) | Data access method and data processing system | |
CN113392130B (en) | Data processing method, device and equipment | |
CN112612784B (en) | River basin calculation unit automatic dividing method and device and computer equipment | |
CN109769042A (en) | A kind of localization method and device | |
CN117271480B (en) | Data processing method, device, electronic equipment and medium | |
CN111782588A (en) | File reading method, device, equipment and medium | |
CN116561374B (en) | Resource determination method, device, equipment and medium based on semi-structured storage | |
JP5953262B2 (en) | DATA INDEX DEVICE, DATA INDEX METHOD, AND PROGRAM | |
CN112052268A (en) | Data query method and device and electronic equipment | |
CN116028337B (en) | Application-based data processing method, device, computer equipment, and storage medium | |
CN119166709B (en) | Method, device, electronic equipment and medium for data storage and database access | |
CN111506790B (en) | Method, system, device and storage medium for determining extraction object and refreshing data | |
CN112612415B (en) | Data processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |