CN116303434A - Grid R-tree Hybrid Index Construction Method, Retrieval Method and Device - Google Patents
Grid R-tree Hybrid Index Construction Method, Retrieval Method and Device Download PDFInfo
- Publication number
- CN116303434A CN116303434A CN202310135732.0A CN202310135732A CN116303434A CN 116303434 A CN116303434 A CN 116303434A CN 202310135732 A CN202310135732 A CN 202310135732A CN 116303434 A CN116303434 A CN 116303434A
- Authority
- CN
- China
- Prior art keywords
- grid
- index
- tree
- feature element
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2246—Trees, e.g. B+trees
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Remote Sensing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
技术领域technical field
本发明涉及信息技术领域,具体涉及一种基于网格R树混合索引构建方法、检索方法和装置。The invention relates to the field of information technology, in particular to a method for constructing a hybrid index based on a grid R tree, a retrieval method and a device.
背景技术Background technique
近年来,随着地理信息系统在电力行业、交通运输、气象预警、水文监测、车辆物流、防灾减灾、农林渔业、公共安全等领域得到广泛应用,空间数据信息的价值在各行各业体现尤为明显,针对日积月累的空间大数据的统一存储、集中管理、高效访问成为了研究地理信息系统的关键处理技术。In recent years, as geographic information systems have been widely used in the power industry, transportation, meteorological early warning, hydrological monitoring, vehicle logistics, disaster prevention and reduction, agriculture, forestry and fishery, public safety and other fields, the value of spatial data information is particularly reflected in all walks of life. Obviously, the unified storage, centralized management, and efficient access to the accumulated spatial big data have become the key processing technologies for the study of geographic information systems.
当前,由于空间数据信息的数据量规模庞大,且这些空间数据信息具有多样性和复杂性,因而在实际运用过程中对于空间数据的查询性能要求极高,并将空间查询性能作为衡量空间数据操作的重要指标。At present, due to the large scale of spatial data information, and the diversity and complexity of these spatial data information, the query performance of spatial data is extremely high in the actual application process, and the spatial query performance is used as a measure of spatial data operations. important indicators.
需要指出的是,空间数据查询技术主要是构建各类空间索引,以实现空间数据查询操作,从而满足空间数据高性能访问要求。其中,空间索引主要包括:网格索引、KD树索引、四叉树索引、R树索引等类型。It should be pointed out that the spatial data query technology is mainly to build various spatial indexes to realize the spatial data query operation, so as to meet the high-performance access requirements of spatial data. Among them, the spatial index mainly includes: grid index, KD tree index, quadtree index, R tree index and other types.
然而,发明人研究发现,传统空间索引主要采用单一性内存索引结构进行空间数据管理,其只能满足数据量规模较小的空间数据访问,且空间索引数据结构全量加载到内存中,这种技术手段的内存开销较大,构建索引速度慢,查询效率低,实际应用效果不佳。However, the inventor found that the traditional spatial index mainly uses a single memory index structure for spatial data management, which can only meet the spatial data access with a small data volume, and the spatial index data structure is fully loaded into the memory. The memory overhead of the method is large, the index building speed is slow, the query efficiency is low, and the actual application effect is not good.
由此,当前传统的空间索引手段存在以下几个缺点:Therefore, the current traditional spatial indexing methods have the following disadvantages:
(1)传统单一内存空间索引仅支持数据规模较小的空间数据。(1) The traditional single-memory spatial index only supports spatial data with a small data size.
(2)传统单一内存空间索引查询效率低。空间数据检索效率作为空间索引的重要指标,由于大规模多样化无规则空间矢量数据,空间分布没有规律,采用单一二维树索引结构,将会导致索引树高度变高,空间数据检索代价较大,查询变慢。(2) The query efficiency of the traditional single memory space index is low. Spatial data retrieval efficiency is an important indicator of spatial indexing. Due to large-scale and diverse irregular spatial vector data, the spatial distribution is irregular. Using a single two-dimensional tree index structure will lead to a higher index tree height and higher spatial data retrieval costs. Larger, the query will be slower.
(3)传统单一内存空间索引,将空间矢量数据全量加载到内存中,对内存开销较大。(3) The traditional single-memory spatial index loads all the space vector data into the memory, which has a large memory overhead.
(4)传统单一内存空间索引顺序构建速度慢,其采用R树、四叉树、KD树索引等二维树索引结构,构建大规模空间数据速度慢,难于进行并行化构建。(4) The sequential construction speed of the traditional single memory space index is slow. It uses two-dimensional tree index structures such as R tree, quad tree, and KD tree index. The speed of building large-scale spatial data is slow, and it is difficult to carry out parallel construction.
(5)复杂几何对象构成的点集合总数比较大,一般达到几万个点数甚至更多,采用传统单一内存空间索引对复杂几何对象精确空间几何关系判断需要消耗较多cpu运算时间。(5) The total number of point sets composed of complex geometric objects is relatively large, generally reaching tens of thousands of points or even more. Using a traditional single memory space index to judge the precise spatial geometric relationship of complex geometric objects requires a lot of CPU computing time.
发明内容Contents of the invention
为了克服上述现有技术的缺陷,本发明所要解决的技术问题是:提供一种基于网格R树混合索引构建方法、检索方法和装置,该方案设计简单,其通过采用混合索引构建方式,可以支持超大规模的空间矢量数据索引构建,同时对复杂几何对象进行了三角剖分优化构建流程,以在降低内存使用开销同时,提升数据检索性能,并能够根据检索范围进行高效空间数据检索。In order to overcome the above-mentioned defects in the prior art, the technical problem to be solved by the present invention is to provide a method for building a hybrid index based on a grid R tree, a retrieval method and a device. It supports the construction of ultra-large-scale spatial vector data indexes, and optimizes the construction process of complex geometric objects by triangulation to reduce memory usage overhead while improving data retrieval performance and enabling efficient spatial data retrieval based on the retrieval range.
为了解决上述技术问题,本发明采用的第一个技术方案为:一种基于网格R树混合索引构建方法,其包括步骤:In order to solve the above-mentioned technical problems, the first technical solution adopted by the present invention is: a method for constructing a hybrid index based on a grid R tree, which includes the steps of:
S1:利用网格初始化参数初始化网格混合索引数据结构;S1: Initialize the grid hybrid index data structure by using grid initialization parameters;
S2:生成全局网格索引分区映射文件,并在所述全局网格索引分区映射文件中存储记录每个地物要素的地物要素顺序编号和地物要素唯一编号;S2: Generate a global grid index partition mapping file, and store and record the sequential number of the feature element and the unique number of the feature element in the global grid index partition mapping file;
S3:生成地物要素存储文件,并在所述地物要素存储文件中存储记录每个地物要素的地物要素信息;S3: Generate a feature element storage file, and store and record feature element information of each feature element in the feature element storage file;
S4:对每个地物要素的地物要素信息中的几何范围参数、构成点数、构成的多面孔洞个数进行分析判断,若所述几何范围参数、构成点数、构成的多面孔洞个数超过设定阈值,则判断该地物要素为复杂几何对象,并采用三角剖分算法将该地物要素的复杂几何对象拆分成若干个地物要素剖分三角形;S4: Analyze and judge the geometric range parameters, the number of constituent points, and the number of multifaceted holes in the feature element information of each feature element, if the geometric range parameters, constituent points, and the number of multifaceted holes exceed the set If the threshold is fixed, then it is judged that the feature element is a complex geometric object, and the complex geometric object of the feature element is split into several triangles of the feature element by using a triangulation algorithm;
S5:结合地物要素和/或地物要素剖分三角形的最小外接矩形及所述网格初始化参数计算获得地物要素的网格分区集合;S5: Combining the ground feature elements and/or the minimum circumscribed rectangle of the triangular subdivision of the ground feature elements and the grid initialization parameters to calculate and obtain the grid partition set of the ground feature elements;
S6:遍历所述地物要素的网格分区集合,为每个网格分区对应建立R树索引。S6: Traverse the set of grid partitions of the feature elements, and establish an R-tree index corresponding to each grid partition.
在本发明所设计的这种基于网格R树混合索引构建方法中,采用分而治之的思路进行存储优化,通过将超大规模的空间矢量数据按照网格一级索引分区,每个分区对应一个存储文件,以使每个分区文件对应一棵R树索引。本发明创造性地将网格索引与R树索引结合,有效发挥两个索引的优点,混合索引优势互补。其中,在分区后的R树索引降低了只有唯一R树的树高度,减小了查询矩形范围与更多层次的节点矩形范围进行几何相交判断,从而提高查询效率。In the grid-based R-tree hybrid index construction method designed by the present invention, the idea of divide and conquer is adopted for storage optimization, and the ultra-large-scale space vector data is partitioned according to the grid first-level index, and each partition corresponds to a storage file , so that each partition file corresponds to an R-tree index. The invention creatively combines the grid index and the R-tree index, effectively exerts the advantages of the two indexes, and the advantages of the mixed index are complementary. Among them, the R-tree index after partition reduces the tree height of only the unique R-tree, reduces the geometric intersection judgment between the query rectangle range and the node rectangle range of more levels, thereby improving the query efficiency.
此外,在本发明中,通过将复杂几何对象进行三角剖分,能够将复杂几何对象化解为若干简化的地物要素剖分三角形,以提高复杂对象精确化几何运算效率,提高查询效率,降低cpu资源使用率。In addition, in the present invention, by triangulating the complex geometric object, the complex geometric object can be decomposed into a number of simplified triangulations of ground feature elements, so as to improve the precision geometric operation efficiency of the complex object, improve the query efficiency, and reduce the cpu Resource usage.
将本发明所设计的技术方案与传统空间索引采用单一内存索引结构进行空间数据管理的技术方案进行对比不难看出:Comparing the technical solution designed by the present invention with the technical solution in which the traditional spatial index uses a single memory index structure for spatial data management, it is not difficult to see that:
(1)传统单一内存空间索引仅支持数据规模较小的空间数据。而本发明所提出的这种基于网格R树混合索引方案,能够将海量数据先通过网格进行分区,再在每个分区上对应独立构建一棵R树索引,以此满足海量空间矢量数据的存储组织管理。(1) The traditional single-memory spatial index only supports spatial data with a small data size. The grid-based R-tree hybrid index scheme proposed by the present invention can partition massive data through grids, and then independently construct an R-tree index on each partition, so as to meet the requirements of massive space vector data. storage organization management.
(2)传统单一内存空间索引查询效率低。而本发明提出的这种基于网格R树混合索引方案,通过网格划分出多棵R树索引,以使得每棵R树的索引矩形范围控制在一定的范围,减小了单棵树的几何对象数据量,降低了树高度,其在检索查询时的树路径较小,可快速获得查询结果集,从而提高检索查询效率。(2) The query efficiency of the traditional single memory space index is low. However, this grid-based R-tree hybrid index scheme proposed by the present invention divides multiple R-tree indexes through grids, so that the index rectangle range of each R-tree is controlled within a certain range, reducing the cost of a single tree. The amount of geometric object data reduces the height of the tree, and its tree path is small during the retrieval query, which can quickly obtain the query result set, thereby improving the efficiency of the retrieval query.
(3)传统单一内存空间索引,将空间矢量数据全量加载到内存中,对内存开销较大。而本发明提出的这种基于网格R树混合索引方案,首先将海量空间矢量数据进行分区管理,减小每个分区的空间数据量,同时构建一棵基于磁盘的R树索引,以支持按需从磁盘中访问相关空间数据。(3) The traditional single-memory spatial index loads all the space vector data into the memory, which has a large memory overhead. However, the grid-based R-tree hybrid indexing scheme proposed by the present invention first partitions massive space vector data to reduce the amount of spatial data in each partition, and at the same time builds a disk-based R-tree index to support Relevant spatial data needs to be accessed from disk.
(4)传统单一内存空间索引顺序构建速度慢。而本发明提出的这种基于网格R树混合索引方案,采用一级网格索引进行分区,将数据规模划小,并且每个网格二级索引都相互独立,可开展独立索引构建,其易于实现并行化多棵树索引结构同时构建,能够有效缩短整体空间索引构建时间。(4) The sequential construction speed of the traditional single memory space index is slow. However, the grid R-tree-based hybrid index scheme proposed by the present invention adopts the first-level grid index for partitioning, and the data size is reduced, and each grid second-level index is independent of each other, and independent index construction can be carried out. It is easy to implement parallel construction of multiple tree index structures at the same time, which can effectively shorten the overall spatial index construction time.
(5)传统内存空间索引针对复杂几何对象的查询精确化几何关系运算耗时。而本发明提出的这种基于网格R树混合索引方案,特地针对复杂几何对象进行了索引构建处理,其将复杂对象进行三角剖分处理,然后再构建索引。当精确匹配时,由于复杂对象由若干个简单的地物要素剖分三角形组成,在大部分情况下查询范围只需要跟剖分几何对象进行精细化匹配,其能够有效提高查询效率,并减少cpu的运算消耗。(5) Traditional in-memory spatial indexes are time-consuming for querying complex geometric objects to refine geometric relations. However, the grid R-tree-based hybrid indexing scheme proposed by the present invention specifically performs index construction processing for complex geometric objects, which performs triangulation processing on complex objects, and then constructs indexes. When matching accurately, since complex objects are composed of several simple subdivision triangles of ground features, in most cases the query range only needs to be finely matched with subdivision geometric objects, which can effectively improve query efficiency and reduce cpu computing consumption.
进一步的,在本发明所述的基于网格R树混合索引构建方法中,在步骤S6中,为每个网格分区对应建立R树索引,具体包括如下步骤:Further, in the grid R-tree-based hybrid index construction method of the present invention, in step S6, an R-tree index is correspondingly established for each grid partition, specifically including the following steps:
S61:初始化R树索引,创建R树索引文件,并设定R树节点的最大节点数量N;S61: Initialize the R-tree index, create an R-tree index file, and set the maximum node number N of R-tree nodes;
S62:在插入R树索引文件时,首先从根节点出发选择地物要素索引项插入的合适路径,将地物要素的最小外接矩形与每个节点的矩形范围进行几何关系运算,找出一条从根节点到叶子节点的插入路径;S62: When inserting the R-tree index file, firstly select the appropriate path for inserting the feature element index item starting from the root node, perform geometric relationship calculation on the minimum circumscribing rectangle of the feature element and the rectangular range of each node, and find a path from The insertion path from the root node to the leaf node;
S63:将所述地物要素索引项以追加的方式插入到所述插入路径的叶子节点;S63: Insert the feature element index item into the leaf node of the insertion path in an additional manner;
S64:若所述叶子节点的节点数量已满,即超过了R树节点的最大节点数量N,则该叶子节点将分裂为两个节点,整个过程自底向上沿着所述插入路径由叶子节点向根节点传播;若根节点进行分裂,则整棵R树索引将增加一层。S64: If the number of nodes of the leaf node is full, that is, exceeds the maximum number of nodes N of the R tree node, the leaf node will be split into two nodes, and the whole process is followed by the leaf node along the insertion path from bottom to top Propagate to the root node; if the root node splits, the entire R-tree index will increase by one layer.
进一步的,在本发明所述的基于网格R树混合索引构建方法中,在步骤S1中,根据网格最小经度grid_minx、网格最小纬度grid_miny、网格最大经度grid_maxx、网格最大纬度grid_maxy、网格大小grid_cellsize、网格行数grid_rows、网格列数grid_columns初始化网格混合索引数据结构。Further, in the grid R tree-based hybrid index construction method according to the present invention, in step S1, according to grid minimum longitude grid_minx, grid minimum latitude grid_miny, grid maximum longitude grid_maxx, grid maximum latitude grid_maxy, The grid size grid_cellsize, the number of grid rows grid_rows, and the number of grid columns grid_columns initialize the grid hybrid index data structure.
进一步的,在本发明所述的基于网格R树混合索引构建方法中,在步骤S2中,所述地物要素顺序编号在地物要素索引构建时按照整数序列号从1开始依次递增生成,所述地物要素唯一编号为地物要素唯一标识的字符串。Further, in the method for constructing a hybrid index based on a grid R-tree according to the present invention, in step S2, the sequence numbers of the feature elements are incrementally generated according to the integer sequence numbers starting from 1 when the feature element index is constructed, The unique number of the feature element is a character string uniquely identifying the feature element.
进一步的,在本发明所述的基于网格R树混合索引构建方法中,在步骤S3中,所述地物要素信息包括地物要素空间属性信息和地物要素实体属性信息,所述地物要素空间属性信息包括地物要素几何对象形状,所述地物要素实体属性信息包括地物要素唯一编号标识、地物要素名称。Further, in the method for constructing a hybrid index based on a grid R tree according to the present invention, in step S3, the feature element information includes feature element spatial attribute information and feature element entity attribute information, and the feature element The spatial attribute information of the feature includes the shape of the geometric object of the feature element, and the entity attribute information of the feature element includes the unique number identification of the feature element and the name of the feature element.
为了解决上述技术问题,本发明采用的第二个技术方案为:一种基于网格R树混合索引检索方法,所述网格R树混合索引是根据上述任一项所述的基于网格R树混合索引构建方法构建,本发明所述的基于网格R树混合索引检索方法具体包括步骤:In order to solve the above technical problems, the second technical solution adopted by the present invention is: a retrieval method based on a grid R tree hybrid index, the grid R tree hybrid index is based on the grid R tree described in any one of the above The tree hybrid index construction method is constructed, and the grid R tree hybrid index retrieval method based on the present invention specifically includes steps:
100:根据输入几何对象形状,计算所述几何对象的最小外接矩形作为空间数据的查询矩形范围;100: According to the shape of the input geometric object, calculate the smallest circumscribed rectangle of the geometric object as the query rectangular range of the spatial data;
200:结合网格初始化参数计算所述查询矩形范围所在网格分区集合;200: Combining with the grid initialization parameters, calculate the grid partition set where the query rectangle range is located;
300:遍历所述网格分区集合,并加载对应网格分区的R树索引,根据所述查询矩形范围并行对所述对应网格分区的R树索引开展空间检索操作,以收集每个R树索引的结果集,并去除结果集中重复的地物要素顺序编号,合并成候选地物要素编号结果集;300: Traverse the set of grid partitions, load the R-tree index corresponding to the grid partition, and perform a spatial search operation on the R-tree index corresponding to the grid partition in parallel according to the scope of the query rectangle, so as to collect each R-tree The result set of the index, and remove the sequence numbers of the repeated feature elements in the result set, and merge them into a result set of candidate feature element numbers;
400:遍历候选地物要素编号结果集,从全局网格索引分区映射文件的关系中获得每个地物要素的地物要素唯一编号,然后再从地物要素存储文件中获取对应地物要素的地物要素信息,以构成完整的候选地物要素结果集;400: Traversing the result set of candidate feature element numbers, obtaining the unique number of each feature element from the relationship of the global grid index partition mapping file, and then obtaining the corresponding feature element number from the feature element storage file Feature information to form a complete result set of candidate feature features;
500:遍历候选地物要素结果集,获取地物要素剖分三角形筛选出的候选地物要素,并将其他的地物要素几何对象形状和输入几何对象形状进行精确几何相交判断,以筛选出相交的地物要素结果集,将所述地物要素剖分三角形筛选出的候选地物要素直接加入所述相交的地物要素结果集,以生成最终地物要素结果集。500: Traverse the result set of candidate feature elements, obtain the candidate feature elements screened out by triangulation of the feature elements, and perform precise geometric intersection judgments on the geometric object shapes of other feature elements and the shape of the input geometric object to filter out the intersection The result set of feature elements, the candidate feature elements screened out by the triangulation of the feature elements are directly added to the result set of intersecting feature elements, so as to generate the final result set of feature elements.
进一步的,在本发明所述的基于网格R树混合索引检索方法中,在上述步骤100中,所述几何对象的最小外接矩形由左下角点和右上角点信息组成,其包括最小经度query_minX、最小纬度query_minY、最大经度query_maxX和最大纬度query_maxY。Further, in the grid R tree-based hybrid index retrieval method of the present invention, in the above step 100, the minimum circumscribed rectangle of the geometric object is composed of the lower left corner point and the upper right corner point information, which includes the minimum longitude query_minX , minimum latitude query_minY, maximum longitude query_maxX and maximum latitude query_maxY.
进一步的,在本发明所述的基于网格R树混合索引检索方法中,在上述步骤200中,通过已计算出的所述查询矩形范围跨越的网格区域的最小网格X方向编号cell_minx、最小网格Y方向编号cell_miny、最大网格X方向编号cell_maxx、最大网格Y方向编号cell_maxy,获得网格区域的网格分区集合。Further, in the grid R tree-based hybrid index retrieval method of the present invention, in the above step 200, the calculated minimum grid X-direction number cell_minx, The minimum grid number cell_miny in the Y direction, the maximum grid number cell_maxx in the X direction, and the maximum grid number cell_maxy in the Y direction obtain the grid partition set of the grid area.
进一步的,在本发明所述的基于网格R树混合索引检索方法中,在上述步骤300中,根据所述查询矩形范围并行对所述对应网格分区的R树索引开展空间检索操作,以收集每个R树索引的结果集,具体包括如下步骤301-302:Further, in the grid R-tree-based hybrid index retrieval method of the present invention, in the above step 300, the spatial retrieval operation is performed on the R-tree index corresponding to the grid partition in parallel according to the scope of the query rectangle, so as to Collecting the result set of each R-tree index specifically includes the following steps 301-302:
301:从R树索引的根节点开始查询匹配,将所述查询矩形范围与所述根节点进行几何相交判断,若所述查询矩形范围与途经的根节点有交集,则需要与该根节点的所有子节点进行几何相交判断;301: Start to query and match from the root node of the R-tree index, and determine the geometric intersection between the query rectangle range and the root node. All child nodes perform geometric intersection judgment;
302:按照深度优先算法遍历R树索引,迭代遍历R树索引各个分支直至叶子节点,并收集有交集的叶子节点结果集;302: Traverse the R-tree index according to the depth-first algorithm, iteratively traverse each branch of the R-tree index until the leaf nodes, and collect the leaf node result sets with intersections;
303:将检索获取的所有结果集去除重复后,合并成对应R树的结果集。303: After removing duplications from all the result sets acquired by the retrieval, merge them into a result set corresponding to the R tree.
为了解决上述技术问题,本发明采用的第三个技术方案为:一种基于网格R树混合索引构建检索装置,其包括:In order to solve the above-mentioned technical problems, the third technical solution adopted by the present invention is: a retrieval device based on grid R-tree hybrid index construction, which includes:
索引元数据模块,所述索引元数据模块用以记录网格R树混合索引的基础数据信息,所述基础数据信息包括网格初始化参数信息和R树参数信息;An index metadata module, the index metadata module is used to record the basic data information of the grid R-tree hybrid index, and the basic data information includes grid initialization parameter information and R-tree parameter information;
三角剖分模块,所述三角剖分模块用以判断地物要素是否为复杂几何对象,并针对复杂几何对象进行三角剖分,以将地物要素的复杂几何对象剖分为若干个地物要素剖分三角形;A triangulation module, the triangulation module is used to judge whether the feature element is a complex geometric object, and perform triangulation on the complex geometric object, so as to divide the complex geometric object of the feature element into several feature elements Divide the triangle;
网格索引模块,所述网格索引模块用于计算地物要素网格索引,并根据索引元数据信息生成网格索引结构;A grid index module, the grid index module is used to calculate the grid index of feature elements, and generate a grid index structure according to the index metadata information;
R树索引模块,所述R树索引模块用以实现R树索引文件的生成和加载,并提供R树索引的数据节点插入算法、R树空间查询算法;R tree index module, the R tree index module is used to realize the generation and loading of R tree index files, and provide the data node insertion algorithm and R tree space query algorithm of R tree index;
全局网格索引分区模块,所述全局网格索引分区模块用以实现全局网格索引分区映射文件生成和加载,并提供地物要素顺序编号与地物要素唯一编号的映射关系建立功能;A global grid index partition module, the global grid index partition module is used to realize the generation and loading of the global grid index partition mapping file, and provide the function of establishing the mapping relationship between the sequence number of the feature element and the unique number of the feature element;
地物要素存储模块,所述地物要素存储模块用以实现地物要素存储文件的生成和加载,提供对地物要素的基础属性信息存储,地物要素空间几何对象的序列化和地物要素相关属性信息的序列化存储功能;The feature element storage module, the feature element storage module is used to realize the generation and loading of the feature element storage file, provide the basic attribute information storage of the feature element, the serialization of the spatial geometric object of the feature element and the feature element Serialized storage function of related attribute information;
查询数据预处理模块,所述查询数据预处理模块用以实现对输入几何对象的数据预处理,根据几何对象要素类型生成对应的最小外接矩形,调用网格索引模块和R树索引模块实现空间检索;Query data preprocessing module, the query data preprocessing module is used to realize the data preprocessing of the input geometric object, generate the corresponding minimum circumscribed rectangle according to the element type of the geometric object, and call the grid index module and the R tree index module to realize the spatial retrieval ;
查询结果集生成模块,所述查询结果集生成模块能够对查询候选地物要素结果集的合并,并对查询输入几何对象和候选结果集进行精确几何匹配生成最终地物要素结果集。A query result set generation module, the query result set generation module is capable of merging query candidate feature element result sets, and performing precise geometric matching on query input geometric objects and candidate result sets to generate a final feature element result set.
本发明的有益效果在于:本发明所设计的这种基于网格R树混合索引构建方法、检索方法和装置的方案设计简单,其通过为大规模空间矢量数据构建一级网格索引分区,再对每个网格索引分区构建基于磁盘的二级R树索引,以实现对海量空间矢量数据索引存储,并提供高效的混合索引空间查询方法,能够满足高性能空间矢量数据访问要求。The beneficial effects of the present invention are: the scheme design of the grid R-tree-based hybrid index construction method, retrieval method and device designed by the present invention is simple, and it constructs a first-level grid index partition for large-scale space vector data, and then Construct a disk-based secondary R-tree index for each grid index partition to achieve index storage of massive spatial vector data, and provide an efficient hybrid index spatial query method, which can meet the high-performance spatial vector data access requirements.
本发明创造性地将网格索引与R树索引结合,有效发挥两个索引的优点,混合索引优势互补,以利用混合索引构建方式,支持超大规模的空间矢量数据索引构建;此外,本发明还针对复杂几何对象进行了三角剖分优化构建流程,其基于内外存结合的方式能够根据检索范围进行高效空间数据检索,以在降低内存使用开销的同时,提升数据检索性能。The present invention creatively combines the grid index and the R-tree index to effectively utilize the advantages of the two indexes, and the advantages of the hybrid index are complementary, so as to use the hybrid index construction method to support the construction of a super-large-scale space vector data index; in addition, the present invention also aims at Complex geometric objects are triangulated to optimize the construction process. Based on the combination of internal and external memory, it can perform efficient spatial data retrieval according to the retrieval range, so as to reduce memory usage overhead and improve data retrieval performance.
附图说明Description of drawings
图1为本发明所述的一种基于网格R树混合索引构建方法在一种实施方式下的步骤流程图;Fig. 1 is a flow chart of steps in an embodiment of a method for constructing a hybrid index based on a grid R tree according to the present invention;
图2为本发明所述的网格R树混合索引数据结构的结构示意图。FIG. 2 is a schematic structural diagram of a grid R-tree hybrid index data structure according to the present invention.
图3为本发明所述的一种基于网格R树混合索引检索方法在一种实施方式下的步骤流程图;Fig. 3 is a flow chart of steps in an embodiment of a grid-based R-tree hybrid index retrieval method according to the present invention;
图4为本发明所述的基于网格R树混合索引构建检索装置在一种实施方式下的功能模块示意图。Fig. 4 is a schematic diagram of the functional modules of the retrieval device based on the grid R-tree hybrid index construction in an implementation manner according to the present invention.
具体实施方式Detailed ways
为详细说明本发明的技术内容、所实现目的及效果,以下结合实施方式并配合附图予以说明。In order to describe the technical content, achieved goals and effects of the present invention in detail, the following descriptions will be made in conjunction with the embodiments and accompanying drawings.
请参照图1和图2所示,在本实施方式中,本发明设计了一种基于网格R树混合索引构建方法,其包括步骤:Please refer to Fig. 1 and Fig. 2, in this embodiment, the present invention designs a method for constructing a hybrid index based on a grid R tree, which includes steps:
S1:利用网格初始化参数初始化网格混合索引数据结构;S1: Initialize the grid hybrid index data structure by using grid initialization parameters;
S2:生成全局网格索引分区映射文件,并在所述全局网格索引分区映射文件中存储记录每个地物要素的地物要素顺序编号和地物要素唯一编号;S2: Generate a global grid index partition mapping file, and store and record the sequential number of the feature element and the unique number of the feature element in the global grid index partition mapping file;
S3:生成地物要素存储文件,并在所述地物要素存储文件中存储记录每个地物要素的地物要素信息;S3: Generate a feature element storage file, and store and record feature element information of each feature element in the feature element storage file;
S4:对每个地物要素的地物要素信息中的几何范围参数、构成点数、构成的多面孔洞个数进行分析判断,若所述几何范围参数、构成点数、构成的多面孔洞个数超过设定阈值,则判断该地物要素为复杂几何对象,并采用三角剖分算法将该地物要素的复杂几何对象拆分成若干个地物要素剖分三角形;S4: Analyze and judge the geometric range parameters, the number of constituent points, and the number of multifaceted holes in the feature element information of each feature element, if the geometric range parameters, constituent points, and the number of multifaceted holes exceed the set If the threshold is fixed, then it is judged that the feature element is a complex geometric object, and the complex geometric object of the feature element is split into several triangles of the feature element by using a triangulation algorithm;
S5:结合地物要素和/或地物要素剖分三角形的最小外接矩形及所述网格初始化参数计算获得地物要素的网格分区集合;S5: Combining the ground feature elements and/or the minimum circumscribed rectangle of the triangular subdivision of the ground feature elements and the grid initialization parameters to calculate and obtain the grid partition set of the ground feature elements;
S6:遍历所述地物要素的网格分区集合,为每个网格分区对应建立R树索引。S6: Traverse the set of grid partitions of the feature elements, and establish an R-tree index corresponding to each grid partition.
参阅图1,并结合图2所示的这种网格R树混合索引数据结构不难看出,本发明最关键的构思在于:通过为大规模空间矢量数据构建一级网格索引分区,再对每个网格索引分区构建基于磁盘的二级R树索引,实现对海量空间矢量数据索引存储,以提供高效的混合索引空间查询方法,并能够满足高性能空间矢量数据访问要求。Referring to Fig. 1, combined with the grid R-tree hybrid index data structure shown in Fig. 2, it is not difficult to see that the most critical idea of the present invention is: by constructing a first-level grid index partition for large-scale space vector data, and then Each grid index partition builds a disk-based secondary R-tree index to realize the index storage of massive spatial vector data, so as to provide an efficient hybrid index spatial query method and meet the high-performance spatial vector data access requirements.
如图1所示,发明人为地物要素和/或地物要素剖分三角形创建一级网格索引分区,并将地物要素和/或地物要素剖分三角形写入相关关联网格索引文件,使每个分区对应一个存储文件,以使每个分区存储文件对应一棵R树索引,而后再将地物要素索引项写入二级R树索引文件,以对每个网格索引分区构建基于磁盘的二级R树索引。As shown in Figure 1, the inventor creates a first-level grid index partition for the feature elements and/or the triangulation of the feature elements, and writes the feature elements and/or the triangulation triangles of the feature elements into the relevant associated grid index file , so that each partition corresponds to a storage file, so that each partition storage file corresponds to an R-tree index, and then write the feature element index items into the secondary R-tree index file to construct each grid index partition Disk-based secondary R-tree index.
此外,在本发明中,通过将复杂几何对象进行三角剖分,能够将复杂几何对象化解为若干简化的地物要素剖分三角形,以提高复杂对象精确化几何运算效率,提高查询效率,降低cpu资源使用率。其中,在步骤S4中,基于地物要素的复杂几何对象拆分成的若干个地物要素剖分三角形能够用于后续查询运算优化,所有的地物要素剖分三角形能够成为索引对象,并与原地物要素相关联。In addition, in the present invention, by triangulating the complex geometric object, the complex geometric object can be decomposed into a number of simplified triangulations of ground feature elements, so as to improve the precision geometric operation efficiency of the complex object, improve the query efficiency, and reduce the cpu Resource usage. Wherein, in step S4, the several subdivided triangles of feature elements based on the complex geometric object of feature elements can be used for subsequent query operation optimization, and all the subdivided triangles of feature elements can become index objects, and can be combined with Associated with original features.
另外,还需要注意的是,本发明所设计的这种基于网格R树混合索引构建方法还可以包括步骤S7:将所有地物要素的地物要素信息写入R树索引文件后,刷新保存文件,并持久化写入本地磁盘。In addition, it should also be noted that the grid R-tree-based hybrid index construction method designed by the present invention may also include step S7: After writing the feature element information of all feature elements into the R-tree index file, refresh and save file, and write it persistently to the local disk.
在具体实施应用时,每个R树索引文件能够以X方向和Y方向网格编号命名,上述步骤S7中保存文件可以具体命名为R{cell_row}_{cell_column}.idx;其中,cell_row为网格Y方向行编号,cell_column为网格X方向列编号。When implementing the application, each R-tree index file can be named after the X-direction and Y-direction grid numbers, and the file saved in the above step S7 can be specifically named as R{cell_row}_{cell_column}.idx; wherein, cell_row is the grid number The row number in the Y direction of the grid, and cell_column is the column number in the X direction of the grid.
进一步的,在步骤S6中,为每个网格分区对应建立R树索引,具体包括如下步骤:Further, in step S6, an R-tree index is correspondingly established for each grid partition, specifically including the following steps:
S61:初始化R树索引,创建R树索引文件,并设定R树节点的最大节点数量N;S61: Initialize the R-tree index, create an R-tree index file, and set the maximum node number N of R-tree nodes;
S62:在插入R树索引文件时,首先从根节点出发选择地物要素索引项插入的合适路径,将地物要素的最小外接矩形与每个节点的矩形范围进行几何关系运算,找出一条从根节点到叶子节点的插入路径;S62: When inserting the R-tree index file, firstly select the appropriate path for inserting the feature element index item starting from the root node, perform geometric relationship calculation on the minimum circumscribing rectangle of the feature element and the rectangular range of each node, and find a path from The insertion path from the root node to the leaf node;
S63:将所述地物要素索引项以追加的方式插入到所述插入路径的叶子节点;S63: Insert the feature element index item into the leaf node of the insertion path in an additional manner;
S64:若所述叶子节点的节点数量已满,即超过了R树节点的最大节点数量N,则该叶子节点将分裂为两个节点,整个过程自底向上沿着所述插入路径由叶子节点向根节点传播;若根节点进行分裂,则整棵R树索引将增加一层。S64: If the number of nodes of the leaf node is full, that is, exceeds the maximum number of nodes N of the R tree node, the leaf node will be split into two nodes, and the whole process is followed by the leaf node along the insertion path from bottom to top Propagate to the root node; if the root node splits, the entire R-tree index will increase by one layer.
在本发明所设计的步骤S6中,需要遍历地物要素的网格分区集合,为每个网格分区对应建立R树索引。其中,可以具体针对地物要素和/或地物要素剖分三角形数据根据网格分区开展并行构建R树索引,以首先对未初始化的网格分区R树索引进行数据结构初始化准备,然后根据R树插入算法将地物要素插入基于磁盘的R树索引文件。In step S6 designed by the present invention, it is necessary to traverse the grid partition set of feature elements, and establish an R-tree index for each grid partition. Among them, the R-tree index can be constructed in parallel according to the grid partition specifically for the surface feature element and/or the triangular data of the surface feature element, so as to first prepare the data structure initialization for the uninitialized grid partition R-tree index, and then according to the R The tree insertion algorithm inserts feature features into disk-based R-tree index files.
进一步的,在步骤S1中,根据网格最小经度grid_minx、网格最小纬度grid_miny、网格最大经度grid_maxx、网格最大纬度grid_maxy、网格大小grid_cellsize、网格行数grid_rows、网格列数grid_columns初始化网格混合索引数据结构。Further, in step S1, according to grid minimum longitude grid_minx, grid minimum latitude grid_miny, grid maximum longitude grid_maxx, grid maximum latitude grid_maxy, grid size grid_cellsize, grid row number grid_rows, grid column number grid_columns initialization Grid hybrid index data structure.
需要说明的是,在本实施方式中,在本发明上述的步骤S1中,可以具体根据网格最小经度grid_minx、网格最小纬度grid_miny、网格最大经度grid_maxx、网格最大纬度grid_maxy、网格大小grid_cellsize、网格行数grid_rows、网格列数grid_columns初始化网格索引结构,以将给定地理空间范围进行网格划分,划分成大小相同的网格;其中,每个网格数据结构可以具体包括:外包矩形、X方向网格编号、Y方向网格编号和网格关联二级索引R树数据结构。It should be noted that, in this embodiment, in the above-mentioned step S1 of the present invention, the grid minimum longitude grid_minx, the grid minimum latitude grid_miny, the grid maximum longitude grid_maxx, the grid maximum latitude grid_maxy, and the grid size can be specified. grid_cellsize, the number of grid rows grid_rows, and the number of grid columns grid_columns initialize the grid index structure to divide the given geographic space into grids of the same size; each grid data structure can specifically include : R-tree data structure of enclosing rectangle, X-direction grid number, Y-direction grid number, and grid-associated secondary index.
进一步的,在步骤S2中,所述地物要素顺序编号在地物要素索引构建时按照整数序列号从1开始依次递增生成,所述地物要素唯一编号为地物要素唯一标识的字符串。Further, in step S2, the sequence number of the feature element is generated incrementally according to the integer serial number starting from 1 when the feature element index is constructed, and the unique number of the feature element is a character string uniquely identified by the feature element.
在本发明所述的步骤S2中,可以具体通过LSM算法将地物要素顺序编号和地物要素唯一编号写入全局网格索引分区映射文件,在全局网格索引分区映射文件中可以具体以地物要素顺序编号为键值,以地物要素唯一编号信息为存储内容。In the step S2 of the present invention, the sequential numbering of the feature elements and the unique numbering of the feature elements can be written into the global grid index partition mapping file through the LSM algorithm. The sequence number of the object element is the key value, and the unique number information of the object element is the storage content.
进一步的,在步骤S3中,所述地物要素信息包括地物要素空间属性信息和地物要素实体属性信息,所述地物要素空间属性信息包括地物要素几何对象形状,所述地物要素实体属性信息包括地物要素唯一编号标识、地物要素名称。Further, in step S3, the feature element information includes the feature element spatial attribute information and the feature element entity attribute information, the feature element spatial attribute information includes the feature element geometric object shape, and the feature element Entity attribute information includes the unique number identification of feature elements and the name of feature elements.
在本发明所设计的这种基于网格R树混合索引构建方法中,空间属性信息中所包括的地物要素几何对象形状,地物要素几何对象形状可以具体包括:点形状、线形状、面形状三类。在本发明中,基于地物要素几何对象形状可以进一步计算获得对应地物要素的最小外接矩形。In the grid R-tree-based hybrid index construction method designed by the present invention, the geometric object shape of feature elements included in the spatial attribute information, the geometric object shape of feature elements can specifically include: point shape, line shape, surface Three types of shapes. In the present invention, the minimum circumscribed rectangle of the corresponding feature element can be further calculated based on the shape of the geometric object of the feature element.
需要说明的是,在本实施方式中,在本发明所述的步骤S3中,地物要素信息可以具体通过LSM算法写入地物要素存储文件,在地物要素存储文件中,能够以地物要素唯一编号为键值,地物要素存储文件的存储内容可以具体包括:地物要素唯一编号、地物要素名称、地物要素几何对象形状和地物要素最小外接矩形这些信息。It should be noted that, in this embodiment, in step S3 of the present invention, the feature element information can be specifically written into the feature element storage file through the LSM algorithm, and in the feature element storage file, the feature element information can be The unique number of the feature is the key value, and the storage content of the feature feature storage file can specifically include: the unique number of the feature feature, the name of the feature feature, the shape of the geometric object of the feature feature, and the minimum circumscribed rectangle of the feature feature.
此外,需要注意的是,在本实施方式中,在上述步骤S5中,所述地物要素或者剖分三角形的最小外接矩形由左下角点和右上角点信息组成,其可以具体包括最小经度minX、最小纬度minY、最大经度maxX和最大纬度maxY。通过上述信息即可计算地物要素的最小网格X方向编号cell_minx,最小网格Y方向编号cell_miny,最大网格X方向编号cell_maxx,最大网格Y方向编号cell_maxy。In addition, it should be noted that in this embodiment, in the above step S5, the minimum circumscribing rectangle of the feature element or the subdivided triangle is composed of the lower left corner point and the upper right corner point information, which may specifically include the minimum longitude minX , minimum latitude minY, maximum longitude maxX and maximum latitude maxY. The minimum grid X-direction number cell_minx, the minimum grid Y-direction number cell_miny, the maximum grid X-direction number cell_maxx, and the maximum grid Y-direction number cell_maxy can be calculated based on the above information.
在具体计算过程中,计算公式中需要用到向上求整数学函数math.Ceil,各项数据的计算公式如下所述:In the specific calculation process, the mathematical function math.Ceil needs to be used in the calculation formula. The calculation formula of each data is as follows:
cell_minx=math.Ceil(minX-grid_minx)/grid_cellsize-1cell_minx=math.Ceil(minX-grid_minx)/grid_cellsize-1
计算最小网格Y方向编号公式:Calculate the minimum grid Y direction number formula:
cell_miny=math.Ceil(minY-grid_miny)/grid_cellsize-1cell_miny=math.Ceil(minY-grid_miny)/grid_cellsize-1
计算最大网格X方向编号公式:The formula for calculating the maximum grid number in the X direction:
cell_maxx=math.Ceil(maxX-grid_maxx)/grid_cellsize-1cell_maxx=math.Ceil(maxX-grid_maxx)/grid_cellsize-1
计算最大网格Y方向编号公式:The formula for calculating the maximum grid number in the Y direction:
cell_maxy=math.Ceil(maxY-grid_maxy)/grid_cellsize-1cell_maxy=math.Ceil(maxY-grid_maxy)/grid_cellsize-1
通过已计算出的地物要素或者剖分三角形跨越的网格区域的最小网格X方向编号cell_minx,最小网格Y方向编号cell_miny,最大网格X方向编号cell_maxx,最大网格Y方向编号cell_maxy,即可获得对应的网格分区集合。The minimum grid X-direction number cell_minx, the minimum grid Y-direction number cell_miny, the maximum grid X-direction number cell_maxx, and the maximum grid Y-direction number cell_maxy of the grid area spanned by the calculated ground object elements or subdivision triangles, The corresponding grid partition set can be obtained.
相应地,请参照图3所示,本发明设计了一种基于网格R树混合索引检索方法,该基于网格R树混合索引检索方法中所采用的网格R树混合索引是根据本发明上述的基于网格R树混合索引构建方法构建的。Correspondingly, referring to Fig. 3, the present invention designs a retrieval method based on grid R tree hybrid index, the grid R tree hybrid index used in the grid R tree hybrid index retrieval method is according to the present invention The above-mentioned grid R-tree-based hybrid index construction method is constructed.
在本发明中,本发明所述的基于网格R树混合索引检索方法具体包括步骤:In the present invention, the hybrid index retrieval method based on the grid R tree described in the present invention specifically includes steps:
100:根据输入几何对象形状,计算所述几何对象的最小外接矩形作为空间数据的查询矩形范围;100: According to the shape of the input geometric object, calculate the smallest circumscribed rectangle of the geometric object as the query rectangular range of the spatial data;
200:结合网格初始化参数计算所述查询矩形范围所在网格分区集合;200: Combining with the grid initialization parameters, calculate the grid partition set where the query rectangle range is located;
300:遍历所述网格分区集合,并加载对应网格分区的R树索引,根据所述查询矩形范围并行对所述对应网格分区的R树索引开展空间检索操作,以收集每个R树索引的结果集,并去除结果集中重复的地物要素顺序编号,合并成候选地物要素编号结果集;300: Traverse the set of grid partitions, load the R-tree index corresponding to the grid partition, and perform a spatial search operation on the R-tree index corresponding to the grid partition in parallel according to the scope of the query rectangle, so as to collect each R-tree The result set of the index, and remove the sequence numbers of the repeated feature elements in the result set, and merge them into a result set of candidate feature element numbers;
400:遍历候选地物要素编号结果集,从全局网格索引分区映射文件的关系中获得每个地物要素的地物要素唯一编号,然后再从地物要素存储文件中获取对应地物要素的地物要素信息,以构成完整的候选地物要素结果集;400: Traverse the result set of candidate feature element numbers, obtain the unique number of each feature element from the relationship of the global grid index partition mapping file, and then obtain the corresponding feature element number from the feature element storage file Feature information to form a complete result set of candidate feature features;
500:遍历候选地物要素结果集,获取地物要素剖分三角形筛选出的候选地物要素,并将其他的地物要素几何对象形状和输入几何对象形状进行精确几何相交判断,以筛选出相交的地物要素结果集,将所述地物要素剖分三角形筛选出的候选地物要素直接加入所述相交的地物要素结果集,以生成最终地物要素结果集。500: Traverse the result set of candidate feature elements, obtain the candidate feature elements screened out by triangulation of the feature elements, and perform precise geometric intersection judgments on the geometric object shapes of other feature elements and the shape of the input geometric object to filter out the intersection The result set of feature elements, the candidate feature elements screened out by the triangulation of the feature elements are directly added to the result set of intersecting feature elements, so as to generate the final result set of feature elements.
需要说明的是,在本实施方式中,在上述步骤300中,遍历网格分区集合,可以根据索引文件名R{cell_row}_{cell_column}.idx规则加载对应网格分区的R索引树。其中,根据输入的查询矩形范围并行对网格分区R树索引开展空间检索操作。R树索引过程中会对地物要素(即判断该地物要素不为复杂几何对象)或者地物要素剖分三角形(即判断该地物要素为复杂几何对象,并采用三角剖分算法将该地物要素的复杂几何对象拆分成若干个地物要素剖分三角形)进行检索,通过剖分三角形获得的结果进行标识,针对同一个地物要素剖分的三角形,只要存在一个三角形和查询矩形范围有几何交集,则将对应的地物要素加入查询结果集并标识。It should be noted that, in this embodiment, in the above step 300, the grid partition set is traversed, and the R index tree corresponding to the grid partition can be loaded according to the index file name R{cell_row}_{cell_column}.idx rule. Among them, the spatial retrieval operation is performed on the grid partition R-tree index in parallel according to the input query rectangle range. In the process of R-tree indexing, the feature element (that is, it is judged that the feature element is not a complex geometric object) or the feature element is divided into triangles (that is, it is judged that the feature element is a complex geometric object, and the triangulation algorithm is used to divide the triangular element into a triangle). The complex geometric object of the ground feature element is split into several ground feature elements and divided into triangles) for retrieval, and the results obtained by dividing the triangle are identified. For the triangle divided by the same ground feature element, as long as there is a triangle and a query rectangle If there is a geometric intersection in the range, the corresponding feature elements will be added to the query result set and identified.
在本发明上述的基于网格R树混合索引检索方法中,步骤400中所提及的地物要素信息可以具体包括地物要素空间属性信息和地物要素实体属性信息。In the above-mentioned retrieval method based on grid R-tree hybrid index of the present invention, the feature element information mentioned in step 400 may specifically include feature element spatial attribute information and feature element entity attribute information.
进一步的,在上述步骤100中,所述几何对象的最小外接矩形由左下角点和右上角点信息组成,其包括最小经度query_minX、最小纬度query_minY、最大经度query_maxX和最大纬度query_maxY。Further, in the above step 100, the minimum circumscribed rectangle of the geometric object is composed of lower left corner point and upper right corner point information, which includes minimum longitude query_minX, minimum latitude query_minY, maximum longitude query_maxX and maximum latitude query_maxY.
进一步的,在上述步骤200中,通过已计算出的所述查询矩形范围跨越的网格区域的最小网格X方向编号cell_minx、最小网格Y方向编号cell_miny、最大网格X方向编号cell_maxx、最大网格Y方向编号cell_maxy,获得网格区域的网格分区集合。Further, in the above step 200, the calculated minimum grid X-direction number cell_minx, minimum grid Y-direction number cell_miny, maximum grid X-direction number cell_maxx, maximum grid X-direction number cell_maxx, and maximum Cell_maxy is the cell_maxy number in the Y direction of the grid, and the grid partition set of the grid area is obtained.
需要说明的是,在上述步骤200中,在具体计算所述查询矩形范围跨越的网格区域的最小网格X方向编号cell_minx、最小网格Y方向编号cell_miny、最大网格X方向编号cell_maxx、最大网格Y方向编号cell_maxy时,计算公式中需要用到向上求整数学函数math.Ceil,各项数据的计算公式如下所述:It should be noted that, in the above step 200, the minimum grid X-direction number cell_minx, the minimum grid Y-direction number cell_miny, the maximum grid X-direction number cell_maxx, the maximum When cell_maxy is numbered in the Y direction of the grid, the math function math.Ceil needs to be used in the calculation formula. The calculation formula of each data is as follows:
计算最小网格X方向编号公式:Calculate the minimum grid X direction number formula:
cell_minx=math.Ceil(query_minX-grid_minx)/grid_cellsize-1cell_minx=math.Ceil(query_minX-grid_minx)/grid_cellsize-1
计算最小网格Y方向编号公式:Calculate the minimum grid Y direction number formula:
cell_miny=math.Ceil(query_minY-grid_miny)/grid_cellsize-1cell_miny=math.Ceil(query_minY-grid_miny)/grid_cellsize-1
计算最大网格X方向编号公式:The formula for calculating the maximum grid number in the X direction:
cell_maxx=math.Ceil(query_maxX-grid_maxx)/grid_cellsize-1cell_maxx=math.Ceil(query_maxX-grid_maxx)/grid_cellsize-1
计算最大网格Y方向编号公式:The formula for calculating the maximum grid number in the Y direction:
cell_maxy=math.Ceil(query_maxY-grid_maxy)/grid_cellsize-1cell_maxy=math.Ceil(query_maxY-grid_maxy)/grid_cellsize-1
进一步的,在上述步骤300中,根据所述查询矩形范围并行对所述对应网格分区的R树索引开展空间检索操作,以收集每个R树索引的结果集,具体包括如下步骤301-303:Further, in the above step 300, according to the scope of the query rectangle, the spatial retrieval operation is carried out on the R-tree index of the corresponding grid partition in parallel to collect the result set of each R-tree index, specifically including the following steps 301-303 :
301:从R树索引的根节点开始查询匹配,将所述查询矩形范围与所述根节点进行几何相交判断,若所述查询矩形范围与途经的根节点有交集,则需要与该根节点的所有子节点进行几何相交判断;301: Start to query and match from the root node of the R-tree index, and determine the geometric intersection between the query rectangle range and the root node. All child nodes perform geometric intersection judgment;
302:按照深度优先算法遍历R树索引,迭代遍历R树索引各个分支直至叶子节点,并收集有交集的叶子节点结果集;302: Traverse the R-tree index according to the depth-first algorithm, iteratively traverse each branch of the R-tree index until the leaf nodes, and collect the leaf node result sets with intersections;
303:将检索获取的所有结果集去除重复后,合并成对应R树的结果集。303: After removing duplications from all the result sets acquired by the retrieval, merge them into a result set corresponding to the R tree.
此外,进一步参阅图4可以看出,本发明还设计了一种基于网格R树混合索引构建检索装置,该检索装置用于实施上述基于网格R树混合索引检索方法。In addition, further referring to FIG. 4, it can be seen that the present invention also designs a retrieval device based on a grid R-tree hybrid index, which is used to implement the above retrieval method based on a grid R-tree hybrid index.
如图4所示,在本实施方式中,本发明所设计的基于网格R树混合索引构建检索装置可以具体包括:As shown in Figure 4, in this embodiment, the retrieval device based on the grid R-tree hybrid index designed by the present invention may specifically include:
索引元数据模块,所述索引元数据模块用以记录网格R树混合索引的基础数据信息,所述基础数据信息包括网格初始化参数信息和R树参数信息;其中,网格初始化参数信息可以包括网格最小经度grid_minx、网格最小纬度grid_miny、网格最大经度grid_maxx、网格最大纬度grid_maxy、网格大小grid_cellsize、网格行数grid_rows、网格列数grid_columns;R树参数信息可以包括R树文件名、R树节点的最大节点数量N;An index metadata module, the index metadata module is used to record the basic data information of the grid R-tree hybrid index, the basic data information includes grid initialization parameter information and R-tree parameter information; wherein, the grid initialization parameter information can be Including grid minimum longitude grid_minx, grid minimum latitude grid_miny, grid maximum longitude grid_maxx, grid maximum latitude grid_maxy, grid size grid_cellsize, grid row number grid_rows, grid column number grid_columns; R tree parameter information can include R tree File name, the maximum number of nodes N of R tree nodes;
三角剖分模块,所述三角剖分模块用以判断地物要素是否为复杂几何对象,并针对复杂几何对象进行三角剖分,以将地物要素的复杂几何对象剖分为若干个地物要素剖分三角形;A triangulation module, the triangulation module is used to judge whether the feature element is a complex geometric object, and perform triangulation on the complex geometric object, so as to divide the complex geometric object of the feature element into several feature elements Divide the triangle;
网格索引模块,所述网格索引模块用于计算地物要素网格索引,并根据索引元数据信息生成网格索引结构;A grid index module, the grid index module is used to calculate the grid index of feature elements, and generate a grid index structure according to the index metadata information;
R树索引模块,所述R树索引模块用以实现R树索引文件的生成和加载,并提供R树索引的数据节点插入算法、R树空间查询算法;R tree index module, the R tree index module is used to realize the generation and loading of R tree index files, and provide the data node insertion algorithm and R tree space query algorithm of R tree index;
全局网格索引分区模块,所述全局网格索引分区模块用以实现全局网格索引分区映射文件生成和加载,并提供地物要素顺序编号与地物要素唯一编号的映射关系建立功能;A global grid index partition module, the global grid index partition module is used to realize the generation and loading of the global grid index partition mapping file, and provide the function of establishing the mapping relationship between the sequence number of the feature element and the unique number of the feature element;
地物要素存储模块,所述地物要素存储模块用以实现地物要素存储文件的生成和加载,提供对地物要素的基础属性信息存储,地物要素空间几何对象的序列化和地物要素相关属性信息的序列化存储功能;The feature element storage module, the feature element storage module is used to realize the generation and loading of the feature element storage file, provide the basic attribute information storage of the feature element, the serialization of the spatial geometric object of the feature element and the feature element Serialized storage function of related attribute information;
查询数据预处理模块,所述查询数据预处理模块用以实现对输入几何对象的数据预处理,根据几何对象要素类型生成对应的最小外接矩形,调用网格索引模块和R树索引模块实现空间检索;Query data preprocessing module, the query data preprocessing module is used to realize the data preprocessing of the input geometric object, generate the corresponding minimum circumscribed rectangle according to the element type of the geometric object, and call the grid index module and the R tree index module to realize the spatial retrieval ;
查询结果集生成模块,所述查询结果集生成模块能够对查询候选地物要素结果集的合并,并对查询输入几何对象和候选结果集进行精确几何匹配生成最终地物要素结果集。A query result set generation module, the query result set generation module is capable of merging query candidate feature element result sets, and performing precise geometric matching on query input geometric objects and candidate result sets to generate a final feature element result set.
综上所述,在本发明中,发明人采用分而治之的思路进行存储优化,通过将超大规模的空间矢量数据按照网格一级索引分区,每个分区对应一个存储文件,以使每个分区文件对应一棵R树索引。发明人创造性地将网格索引与R树索引结合,有效发挥两个索引的优点,混合索引优势互补。其中,在分区后的R树索引降低了只有唯一R树的树高度,减小了查询矩形范围与更多层次的节点矩形范围进行几何相交判断,从而提高查询效率。To sum up, in the present invention, the inventor adopts the idea of divide and conquer for storage optimization. By partitioning the ultra-large-scale space vector data according to the first-level grid index, each partition corresponds to a storage file, so that each partition file Corresponds to an R-tree index. The inventor creatively combines the grid index and the R-tree index to effectively utilize the advantages of the two indexes, and the advantages of the hybrid index are complementary. Among them, the R-tree index after partition reduces the tree height of only the unique R-tree, reduces the geometric intersection judgment between the query rectangle range and the node rectangle range of more levels, thereby improving the query efficiency.
此外,在本发明中,通过将复杂几何对象进行三角剖分,能够将复杂几何对象化解为若干简化的地物要素剖分三角形,以提高复杂对象精确化几何运算效率,提高查询效率,降低cpu资源使用率。In addition, in the present invention, by triangulating the complex geometric object, the complex geometric object can be decomposed into a number of simplified triangulations of ground feature elements, so as to improve the precision geometric operation efficiency of the complex object, improve the query efficiency, and reduce the cpu Resource usage.
以上所述仅为本发明的实施例,并非因此限制本发明的专利范围,凡是利用本发明说明书及附图内容所作的等同变换,或直接或间接运用在相关的技术领域,均同理包括在本发明的专利保护范围内。The above description is only an embodiment of the present invention, and does not limit the patent scope of the present invention. All equivalent transformations made by using the description of the present invention and the contents of the accompanying drawings, or directly or indirectly used in related technical fields, are all included in the same principle. Within the scope of patent protection of the present invention.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310135732.0A CN116303434A (en) | 2023-02-20 | 2023-02-20 | Grid R-tree Hybrid Index Construction Method, Retrieval Method and Device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310135732.0A CN116303434A (en) | 2023-02-20 | 2023-02-20 | Grid R-tree Hybrid Index Construction Method, Retrieval Method and Device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116303434A true CN116303434A (en) | 2023-06-23 |
Family
ID=86817800
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310135732.0A Pending CN116303434A (en) | 2023-02-20 | 2023-02-20 | Grid R-tree Hybrid Index Construction Method, Retrieval Method and Device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116303434A (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070198568A1 (en) * | 2006-02-20 | 2007-08-23 | Linkage Technology Co., Ltd. | "Grid Plus T Tree" Index Method for Quick Orientation in Massive Memory Database |
| WO2015062540A1 (en) * | 2013-10-31 | 2015-05-07 | 中国移动通信集团公司 | Driving amount model event-based storage and index methods and system |
| CN108009265A (en) * | 2017-12-15 | 2018-05-08 | 中国公路工程咨询集团有限公司 | A kind of space data index method under cloud computing environment |
| CN113901156A (en) * | 2021-09-08 | 2022-01-07 | 燕山大学 | 3D Adaptive Grid R+Tree Hybrid Index Construction, Maintenance and Query Method |
| CN114254537A (en) * | 2021-12-17 | 2022-03-29 | 西安前沿动力软件开发有限责任公司 | Multi-scale component model finite element grid generation method and device and storage medium |
-
2023
- 2023-02-20 CN CN202310135732.0A patent/CN116303434A/en active Pending
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070198568A1 (en) * | 2006-02-20 | 2007-08-23 | Linkage Technology Co., Ltd. | "Grid Plus T Tree" Index Method for Quick Orientation in Massive Memory Database |
| WO2015062540A1 (en) * | 2013-10-31 | 2015-05-07 | 中国移动通信集团公司 | Driving amount model event-based storage and index methods and system |
| CN108009265A (en) * | 2017-12-15 | 2018-05-08 | 中国公路工程咨询集团有限公司 | A kind of space data index method under cloud computing environment |
| CN113901156A (en) * | 2021-09-08 | 2022-01-07 | 燕山大学 | 3D Adaptive Grid R+Tree Hybrid Index Construction, Maintenance and Query Method |
| CN114254537A (en) * | 2021-12-17 | 2022-03-29 | 西安前沿动力软件开发有限责任公司 | Multi-scale component model finite element grid generation method and device and storage medium |
Non-Patent Citations (1)
| Title |
|---|
| 周小军;王光霞;夏青;王富强;: "基于OpenGL技术的多幅电子地图数据快速组织与表达", 地球信息科学学报, no. 04, 15 August 2013 (2013-08-15), pages 1 - 7 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP6032467B2 (en) | Spatio-temporal data management system, spatio-temporal data management method, and program thereof | |
| Pandey et al. | The case for learned spatial indexes | |
| CN104199986B (en) | Vector data space index method based on hbase and geohash | |
| CN103092926B (en) | The three-dimensional space index method of multi-level mixing | |
| Tauheed et al. | Accelerating range queries for brain simulations | |
| CN106327577B (en) | Dimensional topography optimal method based on local curvature's entropy and quad-tree structure | |
| CN110597935A (en) | A method and device for spatial analysis | |
| CN105160706B (en) | Landform parallel constructing method is constrained under a kind of unit multi-core environment | |
| CN103544249B (en) | A kind of method for indexing scattered point cloud space of historic building | |
| CN103116610A (en) | Vector space big data storage method based on HBase | |
| Mao et al. | Comprehensive comparison of LSM architectures for spatial data | |
| CN113449052A (en) | Method for establishing spatial index, method and device for querying spatial region | |
| CN116303434A (en) | Grid R-tree Hybrid Index Construction Method, Retrieval Method and Device | |
| CN115708079A (en) | Spatial index query method, device, equipment and computer readable storage medium | |
| CN104008147A (en) | Multi-body index construction method for three-dimensional geologic structure model | |
| CN115952250A (en) | Space sub-table method based on space dynamic grid | |
| Rslan et al. | Spatial R-tree index based on grid division for query processing | |
| CN101826098A (en) | AB column diagram-based method for estimating spatial query selection rate | |
| CN102254093A (en) | Connected domain statistical correlation algorithm based on Thiessen polygon | |
| CN108052778B (en) | An Efficient Double Search Method for Neighboring Particles for Meshless Particle Simulation Techniques | |
| CN116090395A (en) | Data processing method, data structure generating method and query method | |
| Weng et al. | Servicing range queries on multidimensional datasets with partial replicas | |
| CN114240727B (en) | Polygon topology generation method and device based on GPU acceleration | |
| CN116756139B (en) | Data indexing method, system, storage medium and electronic equipment | |
| CN111538725A (en) | Method and system for nearest quick search for ten-million-level dot-shaped elements |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination |