CN105405070A

CN105405070A - A method for constructing a distributed in-memory power grid system

Info

Publication number: CN105405070A
Application number: CN201510881897.8A
Authority: CN
Inventors: 张春平; 林峰; 胡牧; 杨志; 刘铭; 张琦
Original assignee: NARI Group Corp; NARI Information and Communication Technology Co; State Grid Corp of China SGCC
Current assignee: NARI Group Corp; NARI Information and Communication Technology Co; State Grid Corp of China SGCC
Priority date: 2015-12-03
Filing date: 2015-12-03
Publication date: 2016-03-16

Abstract

The invention discloses a method for constructing a distributed in-memory power grid system, comprising the following steps: (1) building an in-memory grid resource model: resource modeling follows the Common Information Model (CIM) specification of the IEC 61970 standard and uses object-oriented encapsulation, inheritance and object application to establish a tree-structured in-memory grid resource model; (2) determining the in-memory storage structure: all attribute data of a resource are stored in an array, and a mapping between attribute name identifiers and array subscripts determines the position of each attribute in the array, enabling data access; (3) horizontally splitting the grid resource data; (4) loading and caching the grid resource data; (5) synchronizing the in-memory grid data; (6) determining the in-memory grid sections; (7) determining the in-memory grid access interface. The invention minimizes data transmission between distributed nodes during resource access and computation, greatly improving efficiency.

Description

A method for constructing a distributed in-memory power grid system

Technical field

The invention relates to a method for constructing a distributed in-memory power grid system and belongs to the technical field of power grids.

Background

To address the storage, computation and analysis of big data, Google proposed a distributed file system, a column-oriented distributed database and the MapReduce distributed programming model. Google's distributed storage and computing technologies provide a solution for storing and analyzing large volumes of data in large Internet site systems and improve performance. For real-time big data analysis, SAP launched the HANA in-memory computing platform, which combines in-memory computing with an integrated hardware and software appliance to deliver high-performance data query and analysis and meet users' real-time requirements for big data processing. For large-capacity, high-speed storage, Oracle launched the Exadata database machine, which uses high-performance hardware, high-speed network interfaces, and technologies such as smart scan, smart storage, smart indexing and hybrid columnar compression to improve processing efficiency in big data and highly concurrent scenarios. Exadata can improve the performance of existing systems based on the Oracle database by a factor of about 10.

In addition, there are in-memory database technologies, such as: Oracle TimesTen, which provides applications with instant response and high throughput; IBM SolidDB, which tightly integrates a memory- and disk-based fully transactional database engine with highly available data replication; eXtremeDB, used where high performance, a small footprint, compact storage and zero memory allocation are required; SQLite, a lightweight in-memory database with low resource usage that can be combined with many programming languages; and the open-source Redis in-memory database, which uses high-performance key-value storage and in-memory data sets. All of these offer real-time data storage schemes that avoid frequent disk I/O operations and improve data access efficiency.

In distributed computing, common analysis methods use the Hadoop framework. Because of the large number of disk I/O operations and the complexity of the MapReduce process, system performance and operating efficiency are low and real-time requirements cannot be met, so this approach is suitable only for non-real-time data analysis. In distributed caching, key-value storage and data partitioning based on a hash of the key are generally used. With this approach, cache nodes must exchange large amounts of data when join queries span several kinds of data, so performance is poor; it is generally used for data with a simple structure and has difficulty handling grid resource data with a complex model. Because current mainstream distributed and in-memory computing products are designed for generality, they cannot realize their full advantage when faced with complex grid resource models and massive resource data, and some engineering application problems remain unsolvable.

Summary of the invention

In view of the deficiencies of the prior art, the purpose of the invention is to provide a method for constructing a distributed in-memory power grid system that minimizes data transmission between distributed nodes during resource access and computation and greatly improves efficiency.

To achieve this purpose, the invention adopts the following technical solution:

A method for constructing a distributed in-memory power grid system according to the invention comprises the following steps:

(1) Build the in-memory grid resource model: in-memory grid resource modeling follows the Common Information Model (CIM) specification of the IEC 61970 standard and uses object-oriented encapsulation, inheritance and object application to establish a tree-structured in-memory grid resource model;

(2) Determine the in-memory storage structure: all attribute data of a resource are stored in an array, and a mapping between attribute name identifiers and array subscripts determines the position of each attribute in the array, enabling data access;

(3) Horizontally split the grid resource data: grid resource data with different business attribute values are mapped to different data blocks, and grid resource data with the same business attribute value are mapped to the same data block;

(4) Load and cache the grid resource data: when the system starts for the first time, the data in the grid resource database are loaded according to the in-memory grid resource model, following the horizontal partitioning of the data, and are then cached in a distributed manner in the memory of multiple servers in the cluster; after loading completes, the grid resource data in the cache are serialized into a binary file that is saved to disk; when the system is started again, the grid data are read directly from the file, and the incremental data accumulated since the last serialization are read from the grid resource database; after the grid resources have been cached in a distributed manner, the cache server IP address, the region attribute value of the grid resources, the voltage level value of the grid resources, the space occupied and the time taken for caching are sent to the grid resource management server for unified management;

(5) Synchronize the in-memory grid data: the resource data in the grid database are synchronized to the in-memory grid, and the resource data of the provincial in-memory grids are synchronized to the headquarters in-memory grid, so that the database stays consistent with the in-memory grid and the headquarters in-memory grid stays consistent with the provincial in-memory grids;

(6) Determine the in-memory grid sections: the in-memory grid sections are supported by the persistence function of the object-oriented parallel computing framework; all section data of each province and of headquarters are stored on their respective in-memory grid servers, and local historical section data are loaded by sharing memory with the in-memory grid server;

(7) Determine the in-memory grid access interface: the in-memory grid interface provides grid business information systems with access to grid resources that is independent of the physical location of the data, and includes a grid section interface, a grid data statistical analysis interface and a data query interface.

In step (1), for the tree representation, the hierarchical relationship of the grid resources in the in-memory grid resource model must first be determined: there must be a root node, and each child node must have exactly one parent node.

In step (2), the in-memory storage structure is determined as follows:

Suppose resource data A has 4 attributes and 3 records; an Object array of length 4 is then allocated for each record:

           Attribute 1   Attribute 2   Attribute 3   Attribute 4
Array 1    Value 11      Value 12      Value 13      Value 14
Array 2    Value 21      Value 22      Value 23      Value 24
Array 3    Value 31      Value 32      Value 33      Value 34

The mapping between attribute identifiers and array subscripts is established in key-value form as follows:

Attribute (key)   Array subscript (value)
Attribute 1       1
Attribute 2       2
Attribute 3       3
Attribute 4       4

To access an attribute value of a record, first locate the array for that record, then look up the array subscript for the attribute in the mapping between attribute identifiers and array subscripts, and finally read the value at that subscript.
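The lookup described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the class name ResourceTable, its methods and the sample values are hypothetical, and Python lists stand in for the Object arrays. The attribute-to-subscript map is held once per resource type, so each record carries only its value array.

```python
# Illustrative sketch only: ResourceTable and its method names are hypothetical.
class ResourceTable:
    """Stores each record as a plain list, with one shared name-to-subscript map."""

    def __init__(self, attribute_names):
        # Shared mapping from attribute identifier to array subscript.
        # The patent's table numbers subscripts from 1; Python lists start at 0.
        self.index_of = {name: i for i, name in enumerate(attribute_names)}
        self.records = {}  # record id -> list of attribute values

    def add(self, record_id, values):
        if len(values) != len(self.index_of):
            raise ValueError("expected one value per attribute")
        self.records[record_id] = list(values)

    def get(self, record_id, attribute):
        # 1) find the record's array, 2) map the name to a subscript, 3) index.
        return self.records[record_id][self.index_of[attribute]]

    def set(self, record_id, attribute, value):
        self.records[record_id][self.index_of[attribute]] = value


table = ResourceTable(["attribute1", "attribute2", "attribute3", "attribute4"])
table.add("record1", ["value11", "value12", "value13", "value14"])
table.add("record2", ["value21", "value22", "value23", "value24"])
table.add("record3", ["value31", "value32", "value33", "value34"])
print(table.get("record2", "attribute3"))  # prints value23
```

Compared with one dictionary per record, this layout stores the attribute names only once, which is consistent with the memory-saving goal stated for this step.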

In step (3), the grid resource data are split horizontally as follows:

(3-1) the computing task object model analyzes the business logic and the data used in order to determine the optimal N attribute fields for data partitioning;

(3-2) the types and value ranges of the N business data object attributes to be partitioned serve as the raw input for data partitioning;

(3-3) the N attributes of an object are treated as the axes of an N-dimensional space, and objects are mapped to regions of this multidimensional space according to their attribute value ranges, forming data hyperplanes in the multidimensional space;

(3-4) each data hyperplane is mapped to the memory of a different computing node in the distributed computing cluster, forming distributed in-memory storage of the data.
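Steps (3-1) to (3-4) can be illustrated with a small sketch for N = 2. Everything concrete here is an assumption made for the example: the two split attributes (region and voltage level), the boundary values, the node names and the round-robin assignment of cells to nodes are not specified by the patent.

```python
from bisect import bisect_right

# Hypothetical split attributes and boundaries; the patent fixes no concrete values.
REGIONS = ["north", "east", "south"]       # categorical axis
VOLTAGE_BOUNDS_KV = [110, 500]             # numeric axis: <110, 110 to <500, >=500
NODES = ["node-0", "node-1", "node-2"]     # hypothetical cluster nodes


def cell_of(resource):
    """Map a resource to its cell (one index per axis) in the attribute space."""
    region_idx = REGIONS.index(resource["region"])
    voltage_idx = bisect_right(VOLTAGE_BOUNDS_KV, resource["voltage_kv"])
    return (region_idx, voltage_idx)


def node_of(cell):
    """Assign a cell to a node: round-robin over the row-major cell number."""
    region_idx, voltage_idx = cell
    cell_number = region_idx * (len(VOLTAGE_BOUNDS_KV) + 1) + voltage_idx
    return NODES[cell_number % len(NODES)]


def partition(resources):
    """Group resource ids by the node that holds their cell."""
    placement = {}
    for resource in resources:
        placement.setdefault(node_of(cell_of(resource)), []).append(resource["id"])
    return placement


resources = [
    {"id": "bus-1", "region": "north", "voltage_kv": 220},
    {"id": "bus-2", "region": "north", "voltage_kv": 220},
    {"id": "line-7", "region": "east", "voltage_kv": 500},
    {"id": "load-3", "region": "south", "voltage_kv": 35},
]
placement = partition(resources)
print(placement)
```

Resources that share the same region and voltage band always land in the same cell and therefore on the same node, so a query restricted to one region and voltage level touches a single node.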

In step (6), the in-memory grid sections are stored permanently and reliably in the HDFS distributed file system, and serialization for data persistence is implemented with the open-source Hessian component.

In step (6), the in-memory grid section function provides section management, section generation and multi-section loading; section management provides querying of section data, scheduling of section generation tasks and issuing of section loading instructions; section generation periodically persists the in-memory grid data according to the scheduled tasks to generate grid data sections; multi-section loading loads multiple historical sections into memory according to section instructions; the in-memory grid provides a section management tool for querying all historical section information, which can also send section generation requests and section loading requests to the data section processor; on receiving such a request, the data section processor executes the section generation or section loading task.

In step (7), the in-memory grid access interface uses an RPC (remote procedure call) protocol, and the data transmitted are binary streams of serialized resource objects.

In step (7), grid section information is queried through the grid section interface, and section loading instructions are issued to load the specified grid section data into computer memory; multi-condition, multi-dimensional statistical analysis results for headquarters and provincial grid resources are obtained through the grid data statistical analysis interface; the data query interface provides business systems with queries over grid resources across the entire network, and business systems need not be concerned with the source or physical location of the data.

The distributed in-memory power grid of the invention takes full account of the characteristics of grid resource data and grid business. It adopts a unified grid resource model and a new storage scheme, so that the model accurately reflects the state of grid resources while being reusable, easy to understand, and efficient for resource storage and access. By exploiting the regional and voltage-level characteristics of grid resources, it solves the horizontal data partitioning problem well, so that resource data access and computation minimize data transmission between distributed nodes and greatly improve efficiency, while still supporting join queries across multiple kinds of resource data. The distributed in-memory power grid is the product of combining distributed technology and in-memory computing technology with grid business; it is a distributed in-memory computing product customized for the power grid and can effectively support fast, efficient processing of grid resources by grid business information systems.

Description of drawings

Fig. 1 is a flowchart of the method for constructing a distributed in-memory power grid system;

Fig. 2 is a schematic diagram of the horizontal splitting of grid resource data;

Fig. 3 is a schematic diagram of the in-memory grid data synchronization scheme;

Fig. 4 is the overall architecture diagram of the data section function;

Fig. 5 is the system architecture diagram of the distributed in-memory power grid;

Fig. 6 is the functional architecture diagram of the distributed in-memory power grid.

Detailed description

To make the technical means, inventive features, objectives and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.

Referring to Fig. 1, the method for constructing the distributed in-memory power grid system of the invention is as follows:

Step 1. Building the grid resource model

In-memory grid resource modeling follows the Common Information Model (CIM) specification of the IEC 61970 standard and uses object-oriented encapsulation, inheritance, object application and the like to establish a tree-structured in-memory grid model. For the tree representation, the hierarchical relationship of the grid resources in the model must first be determined: there must be a root node, and each child node must have exactly one parent node. Power system resources in the grid have their own characteristics; for example, they exhibit physical containment: a sub-control area contains multiple substations, a substation contains multiple voltage levels, and a voltage level in turn contains power equipment such as busbars, switches, disconnectors and loads. To satisfy this containment requirement, in object-oriented modeling a parent-node attribute and a child-node attribute are added to every class that has a parent node and child nodes: the parent-node attribute is set to an object pointer to the parent-class grid resource to which the object belongs, and the child-node attribute is set to a list of the child-class grid equipment it contains.
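The containment hierarchy described above can be sketched with a single node class that holds one parent reference and a list of children. This is an illustrative simplification: the class name GridResource, the kind strings and the sample names are hypothetical, and a real CIM-based model would use a class per resource type.

```python
class GridResource:
    """A node in the containment tree: one parent reference, a list of children."""

    def __init__(self, name, kind):
        self.name = name
        self.kind = kind
        self.parent = None    # reference to the containing resource (None for the root)
        self.children = []    # resources contained by this one

    def add_child(self, child):
        # Enforce the rule that every child node has exactly one parent.
        if child.parent is not None:
            raise ValueError(f"{child.name} already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def path(self):
        """Names from the root down to this node."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


area = GridResource("SubArea-A", "SubControlArea")
station = area.add_child(GridResource("Station-1", "Substation"))
level = station.add_child(GridResource("220kV", "VoltageLevel"))
breaker = level.add_child(GridResource("Breaker-01", "Switch"))
print(breaker.path())  # ['SubArea-A', 'Station-1', '220kV', 'Breaker-01']
```

Holding the parent as a direct reference lets a device reach its substation or control area without a lookup, while the children list supports top-down traversal of the tree.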

Step 2. In-memory storage structure design

The in-memory storage structure is designed with two goals: ease of use and low memory overhead. To reduce memory overhead, common approaches such as key-value pairs or hash tables are not used; instead, all attribute data of a resource are stored in an array. Data access is implemented by defining a mapping between attribute name identifiers and array subscripts, as tabulated below:

Suppose some resource data A has 4 attributes and 3 records; an Object array of length 4 is then allocated for each record.

Attribute 1   Attribute 2   Attribute 3   Attribute 4

Step 3. Horizontal splitting of grid resource data

Business data objects are partitioned along multiple dimensions by their attributes, according to the business logic characteristics inherent in the objects themselves; see Fig. 2 for the principle.

(1) The computing task object model analyzes the business logic and the data used in order to determine the optimal N attribute fields for data partitioning.

(2) The types and value ranges of the N business data object attributes to be partitioned serve as the raw input for data partitioning.

(3) The N attributes of an object are treated as the axes of an N-dimensional space, and objects are mapped to regions of this multidimensional space according to their attribute value ranges, forming data hyperplanes in the multidimensional space.

(4) Each data hyperplane is mapped to the memory of a different computing node in the distributed computing cluster, forming distributed in-memory storage of the data.

Step 4. Grid resource data loading and caching

When the system starts for the first time, the data in the grid resource database are loaded according to the grid resource in-memory model, following the horizontal partitioning of the data, and are then cached in a distributed manner in the memory of multiple servers in the cluster. After loading completes, the grid resource data in the cache are serialized into a binary file that is saved to disk. When the system is started again, the grid data are read directly from the file, and the incremental data accumulated since the last serialization are read from the grid resource database.

After the grid resources have been cached in a distributed manner, information such as the cache server IP address, the region attribute value of the grid resources, the voltage level value of the grid resources, the space occupied and the time taken for caching is sent to the grid resource management server for unified management.
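The start-up behaviour described above (full load on first start, then snapshot plus increments on restart) can be sketched on a single node as follows. This is a sketch under stated assumptions: Python's pickle stands in for whatever binary serialization the system uses, the database is simulated by an ordered change list, and the function names save_snapshot, start, full_load and changes_since are hypothetical.

```python
import os
import pickle
import tempfile


def save_snapshot(path, cache, last_change_id):
    """Serialize the cache and the id of the last change it reflects."""
    with open(path, "wb") as f:
        pickle.dump({"cache": cache, "last_change_id": last_change_id}, f)


def start(path, full_load, changes_since):
    """Return (cache, last_change_id), using the snapshot file when one exists."""
    if not os.path.exists(path):
        cache, last_id = full_load()                  # first start: read everything
    else:
        with open(path, "rb") as f:
            state = pickle.load(f)
        cache, last_id = state["cache"], state["last_change_id"]
        for change_id, key, value in changes_since(last_id):  # restart: delta only
            cache[key] = value
            last_id = change_id
    save_snapshot(path, cache, last_id)
    return cache, last_id


# Stand-in for the grid resource database: an ordered list of (id, key, value).
db_changes = [(1, "bus-1", {"voltage_kv": 220}), (2, "bus-2", {"voltage_kv": 110})]


def full_load():
    return {key: value for _, key, value in db_changes}, db_changes[-1][0]


def changes_since(last_id):
    return [change for change in db_changes if change[0] > last_id]


snapshot_path = os.path.join(tempfile.mkdtemp(), "grid_cache.bin")
cache, last_id = start(snapshot_path, full_load, changes_since)   # first start
db_changes.append((3, "bus-1", {"voltage_kv": 500}))
cache, last_id = start(snapshot_path, full_load, changes_since)   # restart
print(cache, last_id)
```

On restart only the one change recorded after the snapshot is read from the simulated database; everything else comes from the binary file.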

Step 5. In-memory grid data synchronization design

Referring to Fig. 3, data synchronization copies the resource data in the grid database to the in-memory grid, and the resource data of the provincial in-memory grids to the headquarters in-memory grid, so that the database stays consistent with the in-memory grid and the headquarters in-memory grid stays consistent with the provincial in-memory grids.

Data synchronization and comparison comprises full synchronization and incremental synchronization. Full synchronization copies all data of the on-disk relational database, in object form, into the object cache of the in-memory grid, ensuring that the in-memory grid object data are consistent with the source data in the on-disk relational database; it also synchronizes the complete data object models of the regional and provincial in-memory grids to State Grid headquarters, where they are spliced together, keeping the headquarters in-memory grid synchronized with the regional and provincial in-memory grids. For incremental synchronization, a regional or provincial in-memory grid applies incremental updates to its object data according to the metadata in the data change record table. Each update also produces an object data change file, which records the changes to the data model in the regional or provincial in-memory grid and is stored in the change information store of the comparison server. The headquarters in-memory grid reads the object data change files on each regional or provincial comparison server and uploads the changed data object models from the corresponding object server to the headquarters in-memory grid, completing the update of the changed data object models.
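The incremental path described above can be sketched with plain dictionaries. The record layout (key, op, value), the function names apply_changes and pull_changes, and the name province_a are assumptions made for illustration; the patent does not define the format of the change record table or of the change file.

```python
def apply_changes(province_grid, change_records):
    """Apply change records to a provincial store and return change-file entries."""
    change_file = []
    for record in change_records:
        key, op = record["key"], record["op"]
        if op == "delete":
            province_grid.pop(key, None)
        else:  # "insert" or "update"
            province_grid[key] = record["value"]
        change_file.append({"key": key, "op": op})
    return change_file


def pull_changes(hq_grid, province_name, province_grid, change_file):
    """Headquarters reads the change file and refreshes only the listed objects."""
    for entry in change_file:
        hq_key = (province_name, entry["key"])
        if entry["key"] in province_grid:
            hq_grid[hq_key] = province_grid[entry["key"]]   # inserted or updated
        else:
            hq_grid.pop(hq_key, None)                       # deleted at the province


province_a = {"bus-1": {"voltage_kv": 220}, "bus-2": {"voltage_kv": 110}}
hq = {("province_a", k): v for k, v in province_a.items()}  # state after a full sync
changes = [
    {"key": "bus-1", "op": "update", "value": {"voltage_kv": 500}},
    {"key": "bus-2", "op": "delete"},
    {"key": "bus-3", "op": "insert", "value": {"voltage_kv": 35}},
]
change_file = apply_changes(province_a, changes)
pull_changes(hq, "province_a", province_a, change_file)
print(hq)
```

The headquarters side mirrors the current provincial state of each listed key rather than replaying operations, so a key that was updated and later deleted within one change file is still handled correctly.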

In particular, because the cache of the in-memory grid is limited, the full data set cannot be loaded into the object cache at once for comparison. The data comparison module therefore provides block-wise data comparison. Some data tables hold very large amounts of data; loading them entirely for a single comparison would be time-consuming and would tie up the resources of the in-memory grid object cache. First, all data are divided into blocks and assigned priorities. Blocks are defined according to business requirements: one block may contain logically or business-related data from different tables, while unrelated data from the same table are not placed in the same block. Each block is then assigned a priority according to the real-time requirements of the corresponding business, the required data accuracy, and so on. In each comparison run, higher-priority blocks are processed first, and the comparison frequency can even be increased for particular blocks. Second, a data comparison object pool is created in the object server: in each comparison, a designated subset of data blocks is loaded into the object cache and compared; when that is done, the data object models in the comparison object pool are released, and the next designated blocks to be compared are loaded from the on-disk relational database.
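A minimal sketch of the block-wise, priority-ordered comparison follows. The block tuple layout, the pool_size parameter and the function name compare_in_blocks are hypothetical, and a dictionary lookup stands in for loading a block from the relational database.

```python
import heapq


def compare_in_blocks(blocks, load_from_db, cache, pool_size=2):
    """Compare database and cache block by block, highest priority first.

    blocks: list of (priority, block_name, keys); a lower number is more urgent.
    Returns the keys whose cached value differs from the database value.
    """
    heap = list(blocks)
    heapq.heapify(heap)
    mismatches = []
    while heap:
        # Load at most pool_size blocks into the comparison pool at a time.
        batch = [heapq.heappop(heap) for _ in range(min(pool_size, len(heap)))]
        pool = {key: load_from_db(key) for _, _, keys in batch for key in keys}
        mismatches.extend(key for key, value in pool.items() if cache.get(key) != value)
        pool.clear()  # release the pool before loading the next blocks
    return mismatches


database = {"bus-1": 220, "bus-2": 110, "line-7": 500, "load-3": 35}
cache = {"bus-1": 220, "bus-2": 35, "line-7": 500, "load-3": 10}
blocks = [
    (2, "archive", ["load-3"]),
    (1, "realtime", ["bus-1", "bus-2"]),
    (3, "planning", ["line-7"]),
]
mismatches = compare_in_blocks(blocks, database.get, cache)
print(mismatches)  # ['bus-2', 'load-3']
```

Because the pool is cleared after every batch, peak memory is bounded by the size of the largest batch rather than by the size of the full data set.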

Step 6. Grid section design

The data section function provides section management, section generation, multi-section loading and related functions. Section management provides querying of section data, scheduling of section generation tasks, issuing of section loading instructions, and so on. Section generation periodically persists the in-memory grid data according to the scheduled tasks, generating grid data sections. Multi-section loading loads multiple historical sections into memory according to section instructions.

The in-memory grid sections are supported by the persistence function of the object-oriented parallel computing framework; all section data of each province and of headquarters are stored on their respective in-memory grid servers. Local historical section data are loaded by sharing memory with the in-memory grid server. The overall architecture of the data section function is shown in Fig. 4.

The in-memory grid provides a section management tool for querying all historical section information, which can also send section generation requests and section loading requests to the data section processor. On receiving such a request, the data section processor executes the section generation or section loading task.

In particular, grid sections are stored permanently and reliably in the HDFS distributed file system, and serialization for data persistence is implemented with the open-source Hessian component, which offers high serialization efficiency and produces short byte streams.
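Section generation and multi-section loading can be sketched as follows. This is an illustration only: pickle stands in for the Hessian serializer, an in-process dictionary stands in for files on HDFS, and the class name SectionManager and the timestamp labels are hypothetical.

```python
import pickle


class SectionManager:
    """Keeps serialized snapshots (sections) keyed by a label such as a timestamp."""

    def __init__(self):
        self._store = {}  # label -> bytes; stands in for files on HDFS

    def generate(self, label, grid):
        """Persist the current in-memory grid data as a section."""
        self._store[label] = pickle.dumps(grid)

    def list_sections(self):
        """Return the labels of all historical sections."""
        return sorted(self._store)

    def load(self, labels):
        """Load several historical sections into memory at once."""
        return {label: pickle.loads(self._store[label]) for label in labels}


grid = {"bus-1": {"voltage_kv": 220}}
manager = SectionManager()
manager.generate("2015-12-01T00:00", grid)
grid["bus-1"]["voltage_kv"] = 500          # the live grid keeps changing
manager.generate("2015-12-02T00:00", grid)
sections = manager.load(manager.list_sections())
print(sections["2015-12-01T00:00"], sections["2015-12-02T00:00"])
```

Because each section is serialized at generation time, later changes to the live grid do not alter earlier sections, and several sections can be held in memory side by side for comparison.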

Step 7. In-memory grid access interface design

The in-memory grid interface provides grid business information systems with access to grid resources that is independent of the physical location of the data, and includes a grid section interface, a grid data statistical analysis interface and a data query interface. Through the grid section interface, a business system can query grid section information and issue section loading instructions to load the specified grid section data into computer memory. Through the statistical analysis interface, a business system can obtain multi-condition, multi-dimensional statistical analysis results for headquarters and provincial grid resources. The data query interface provides business systems with queries over grid resources across the entire network, and business systems need not be concerned with the source or physical location of the data. The grid resource data access interface uses an RPC (remote procedure call) protocol, and the data transmitted are binary streams of serialized resource objects.
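Location-independent access can be sketched with a client that consults a registry of which cache server holds which (region, voltage level) partition, similar in spirit to the information reported to the grid resource management server in step 4. The class name GridQueryClient, the registry layout, the addresses and the sample data are hypothetical, and in-process dictionaries stand in for remote RPC calls.

```python
class GridQueryClient:
    """Routes queries by (region, voltage level) so callers never name a server."""

    def __init__(self, registry, servers):
        self.registry = registry  # (region, voltage_kv) -> server address
        self.servers = servers    # address -> partition store (stands in for an RPC stub)

    def query(self, region=None, voltage_kv=None):
        """Return resource ids matching the filters, wherever they are stored."""
        results = []
        for (reg, volt), address in sorted(self.registry.items()):
            if region is not None and reg != region:
                continue
            if voltage_kv is not None and volt != voltage_kv:
                continue
            results.extend(self.servers[address].get((reg, volt), []))
        return results


servers = {
    "10.0.0.1": {("north", 220): ["bus-1", "bus-2"]},
    "10.0.0.2": {("east", 500): ["line-7"], ("east", 220): ["bus-9"]},
}
registry = {
    ("north", 220): "10.0.0.1",
    ("east", 500): "10.0.0.2",
    ("east", 220): "10.0.0.2",
}
client = GridQueryClient(registry, servers)
print(client.query(voltage_kv=220))   # ['bus-9', 'bus-1', 'bus-2']
print(client.query(region="east"))    # ['bus-9', 'line-7']
```

The caller supplies only business attributes; the registry decides which servers to contact, which is what keeps the interface independent of where the data physically reside.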

(1) In-memory grid system architecture

The in-memory grid is built on top of the object-oriented parallel computing framework. Data are synchronized and replicated between the headquarters in-memory grid and each provincial in-memory grid, and between each provincial in-memory grid and its provincial database, through the data synchronization and comparison module. The in-memory grid has its own application functions, such as data sections and in-memory grid presentation; the in-memory grid application interface provides OPC-based application programming interfaces for statistical analysis, query and so on to other systems such as PMS2.0; see Fig. 5.

(2) In-memory grid functional architecture

Built on OPC, with the in-memory grid model at its core, the in-memory grid constructs pools of grid resource data objects and task objects, and builds the related management and application functions on top of the grid resource object pool, mainly: data synchronization and comparison, data section management, in-memory grid query and statistical analysis, in-memory grid presentation, and the in-memory grid application interface; see Fig. 6.

The data synchronization and comparison function synchronizes the resource data in the grid database to the in-memory grid, and the resource data of the provincial in-memory grids to the headquarters in-memory grid, so that the database stays consistent with the in-memory grid and the headquarters in-memory grid stays consistent with the provincial in-memory grids. The data section function persists the in-memory grid data at a given moment to generate a data section, and provides query and statistics on section information, section task management, and so on. In-memory grid management covers management of the grid in-memory model and management of grid resource objects. In-memory grid presentation provides a multi-dimensional view of the in-memory grid, with sub-functions for section overview, section data and section comparison. The in-memory grid application interface provides business systems with the grid section interface, the statistical analysis interface and the data query interface.

Therefore, in its specific design, the present invention takes full account of the problems of the prior art. It adopts a new weak-object storage mode and designs the object model by incorporating business logic, so that the model accurately reflects the state of the grid resources while being reusable, easy to understand, and efficient for resource storage and access. In addition, data in the distributed memory grid is distributed by hashing on business attributes, and the attributes used for hashing are optimized and customized for the characteristics of grid data and services. As a result, resource access and computation require as little data transfer between distributed nodes as possible, which greatly improves efficiency.
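Distribution by hashing on business attributes can be sketched as follows. This is a minimal sketch under stated assumptions: the attribute names `region` and `voltage_kv`, the sample values, the node count, and the use of CRC32 as the hash function are all illustrative choices, not details given by the invention.

```python
import zlib
from collections import defaultdict


def node_for(resource, attributes, node_count):
    """Pick a node index from a stable hash of the chosen business attributes."""
    key = "|".join(str(resource[name]) for name in attributes)
    return zlib.crc32(key.encode("utf-8")) % node_count


def distribute(resources, attributes, node_count):
    """Group resources by target node."""
    nodes = defaultdict(list)
    for resource in resources:
        nodes[node_for(resource, attributes, node_count)].append(resource)
    return nodes


# Hypothetical grid resources; region and voltage level act as business attributes.
resources = [
    {"id": "t-1", "region": "Jiangsu", "voltage_kv": 220},
    {"id": "t-2", "region": "Jiangsu", "voltage_kv": 220},
    {"id": "t-3", "region": "Zhejiang", "voltage_kv": 110},
]
placement = distribute(resources, ("region", "voltage_kv"), node_count=4)
```

Because resources that share the same attribute values always hash to the same node, a computation restricted to one region and voltage level can run on a single node without fetching data from the others.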

The basic principles, main features, and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the present invention is not limited by the above embodiments. The embodiments and the description only illustrate the principles of the invention; various changes and improvements may be made without departing from its spirit and scope, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection of the present invention is defined by the appended claims and their equivalents.

Claims

1. A method for constructing a distributed memory grid system, characterized in that it comprises the following steps:

(1) building a memory grid resource model: the memory grid resource modeling follows the Common Information Model specification of the IEC 61970 standard and uses object-oriented encapsulation, inheritance, and object application to establish a tree-structured memory grid resource model;

(2) determining the memory storage structure: all attribute data of a resource are stored in an array, and the position of each attribute value in the array is determined by a defined mapping between attribute name identifiers and array indices, thereby realizing data access;

(3) horizontally splitting the grid resource data: grid resource data with different business attribute values are mapped to different data blocks, and grid resource data with the same business attribute value are mapped to the same data block;

(4) loading and caching the grid resource data: when the system starts for the first time, the data in the grid resource database are loaded according to the memory grid resource model, following the horizontal splitting scheme, and are then cached in a distributed manner in the memory of multiple servers in the cluster; after loading completes, the cached grid resource data are serialized into a binary file and saved to disk; when the system starts again, the grid data are read directly from the file, and the incremental data produced between the last serialization and the present are read from the grid resource database; after the grid resources have been cached in a distributed manner, the cache server IP address, the region attribute values of the grid resources, the voltage level values of the grid resources, the space occupied, and the time taken for caching are sent to the grid resource management server for unified management;

(5) synchronizing the memory grid data: resource data in the grid database are synchronized to the memory grid, and resource data of the provincial memory grids are synchronized to the headquarters memory grid, so that the database is consistent with the memory grid and the headquarters memory grid is consistent with the provincial memory grids;

(6) determining the memory grid section: the memory grid section is supported by the persistence function of the object-oriented parallel computing framework; all section data of each province and of the headquarters are stored on their respective memory grid servers, and local historical section data are loaded by sharing memory with the memory grid server;

(7) determining the memory grid access interface: the memory grid interface provides the grid operation information system with a way of accessing grid resources that is independent of the physical location of the data, and comprises a grid section interface, a grid data statistical analysis interface, and a data query interface.
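The restart behavior in step (4) can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the claim: every database record carries a monotonically increasing `version` number, the serialized cache is kept as bytes in memory rather than written to a binary file on disk, Python's `pickle` stands in for the serialization format, and deletions are ignored. All function and variable names are hypothetical.

```python
import pickle


def first_start(database):
    """Load everything from the database and serialize the cache to bytes."""
    cache = {rid: rec["data"] for rid, rec in database.items()}
    last_version = max((rec["version"] for rec in database.values()), default=0)
    snapshot = pickle.dumps({"cache": cache, "version": last_version})
    return cache, snapshot


def restart(snapshot, database):
    """Restore the cache from the snapshot, then apply newer database records."""
    state = pickle.loads(snapshot)
    cache, since = state["cache"], state["version"]
    increment = {
        rid: rec["data"] for rid, rec in database.items() if rec["version"] > since
    }
    cache.update(increment)
    return cache, sorted(increment)


# Hypothetical database rows with a version counter.
database = {
    "bus-1": {"version": 1, "data": {"voltage_kv": 220}},
    "bus-2": {"version": 2, "data": {"voltage_kv": 110}},
}
cache, snapshot = first_start(database)

# Changes made while the system was stopped.
database["bus-1"] = {"version": 4, "data": {"voltage_kv": 500}}
database["bus-3"] = {"version": 3, "data": {"voltage_kv": 35}}

restored, reloaded = restart(snapshot, database)
```

Only the records changed since the last serialization are read from the database on restart; the rest come from the snapshot.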

2. The method for constructing a distributed memory grid system according to claim 1, characterized in that, in step (1), for tree-structured presentation, the hierarchical relationship of the grid resources in the memory grid resource model is determined first; there must be a root node, and each child node must have a unique parent node.

3. The method for constructing a distributed memory grid system according to claim 1, characterized in that, in step (2), the memory storage structure is determined as follows:

Let the resource data be A, with 4 attributes and 3 records; an Object array of length 4 is then allocated for each record,

The mapping between attribute identifiers and array indices is established in key-value form as follows:

To access an attribute value of a record, the array corresponding to the record is found first; the mapping between attribute identifiers and array indices is then used to find the array index of the attribute, and the attribute value is read at that index.
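The access path described in this claim can be sketched in Python as follows. The sketch assumes a resource with 4 attributes and 3 records, as in the claim, but the attribute names, record identifiers, and values are hypothetical, and a Python list stands in for the Object array.

```python
class ResourceTable:
    """Stores each record as a fixed-length list addressed through a shared index map."""

    def __init__(self, attribute_names):
        # Key-value mapping from attribute identifier to array index.
        self.index_of = {name: i for i, name in enumerate(attribute_names)}
        self.records = {}

    def put(self, record_id, **values):
        # One array per record, with one slot per attribute.
        row = [None] * len(self.index_of)
        for name, value in values.items():
            row[self.index_of[name]] = value
        self.records[record_id] = row

    def get(self, record_id, name):
        # Find the record's array, then the attribute's index, then the value.
        row = self.records[record_id]
        return row[self.index_of[name]]


# Hypothetical resource A with 4 attributes and 3 records.
table = ResourceTable(["id", "name", "voltage_kv", "region"])
table.put("r1", id="r1", name="Line 1", voltage_kv=220, region="east")
table.put("r2", id="r2", name="Line 2", voltage_kv=110, region="west")
table.put("r3", id="r3", name="Line 3", voltage_kv=35, region="east")

value = table.get("r2", "voltage_kv")
```

The index map is stored once per resource type rather than once per record, so each record costs only one array of values.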

4. The method for constructing a distributed memory grid system according to claim 1, characterized in that, in step (3), the grid resource data are split horizontally as follows:

(3-1) based on the business logic of the computing task object model and the data it uses, the optimal N attribute fields for splitting the data are determined by analysis;

(3-2) the types and value ranges of the N attributes of the business data objects to be split are taken as the original input of the data splitting;

(3-3) the N attributes of the object are treated as the axes of an N-dimensional space, and regions of the multidimensional space are mapped out according to the value ranges of the object attributes, forming data hyperplanes in the multidimensional space;

(3-4) each data hyperplane is mapped to the memory of a different computing node in the distributed computing cluster, forming distributed in-memory storage of the data.
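Steps (3-2) to (3-4) can be pictured with a short Python sketch for N = 2. It assumes each axis is described either by a list of categories or by numeric boundaries, and that regions are assigned to nodes by flattening their coordinates and taking the remainder modulo the node count; the axis definitions, the helper names, and the assignment rule are illustrative assumptions rather than details of the claimed method.

```python
from bisect import bisect_right


def axis_cell(value, axis):
    """Locate a value on one axis: by category list or by numeric boundaries."""
    if axis["kind"] == "category":
        return axis["values"].index(value)
    return bisect_right(axis["bounds"], value)


def axis_size(axis):
    """Number of cells along one axis."""
    if axis["kind"] == "category":
        return len(axis["values"])
    return len(axis["bounds"]) + 1


def region_of(resource, axes):
    """Return the N-dimensional cell coordinates of a resource."""
    return tuple(axis_cell(resource[name], axis) for name, axis in axes.items())


def node_of(cell, axes, node_count):
    """Flatten cell coordinates to a single number and assign it to a node."""
    flat = 0
    for coordinate, axis in zip(cell, axes.values()):
        flat = flat * axis_size(axis) + coordinate
    return flat % node_count


# Hypothetical splitting attributes: a categorical region and a numeric voltage level.
axes = {
    "region": {"kind": "category", "values": ["north", "south"]},
    "voltage_kv": {"kind": "range", "bounds": [110, 500]},
}
resource = {"id": "s-1", "region": "south", "voltage_kv": 220}

cell = region_of(resource, axes)
node = node_of(cell, axes, node_count=3)
```

Here the two axes define a 2 by 3 grid of regions, and every resource falling in the same region is stored on the same node.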

5. The method for constructing a distributed memory grid system according to claim 1, characterized in that, in step (6), the memory grid sections are stored permanently and reliably using the HDFS distributed file system, and serialization during data persistence is implemented with the open-source Hessian component.

6. The method for constructing a distributed memory grid system according to claim 1, characterized in that, in step (6), the memory grid section provides section management, section generation, and multi-section loading; section management provides querying of section data, formulation of section generation plan tasks, and issuing of section loading instructions; section generation periodically persists the memory grid data according to the plan tasks to generate grid data sections; multi-section loading loads multiple historical sections into memory according to section instructions; the memory grid provides a section management tool for querying all historical section information, which can also send section generation requests and section loading requests to the data section processor; on receiving such a request, the data section processor performs the section generation or section loading task.
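The division of work between section generation, multi-section loading, and section queries can be sketched in Python. This is a simplified illustration only: deep copies held in dictionaries stand in for persistence to a distributed file system, plan-task scheduling is omitted, and the class name, method names, and section labels are hypothetical.

```python
import copy


class SectionProcessor:
    """Generates sections from live data and keeps loaded historical sections."""

    def __init__(self, live_data):
        self.live_data = live_data
        self.stored = {}  # section label -> persisted copy
        self.loaded = {}  # section label -> copy currently loaded in memory

    def generate(self, label):
        # Freeze the live data at this moment as a section.
        self.stored[label] = copy.deepcopy(self.live_data)

    def load(self, labels):
        # Load several historical sections into memory at once.
        for label in labels:
            self.loaded[label] = copy.deepcopy(self.stored[label])

    def list_sections(self):
        # Query the labels of all historical sections.
        return sorted(self.stored)


# Hypothetical live memory grid data.
live = {"line-1": {"status": "in service"}}
processor = SectionProcessor(live)

processor.generate("2015-01")
live["line-1"]["status"] = "out of service"
processor.generate("2015-02")

processor.load(["2015-01", "2015-02"])
```

Each loaded section keeps the state of the grid at the moment it was generated, so two sections can be compared side by side while the live data continue to change.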

7. The method for constructing a distributed memory grid system according to claim 1, characterized in that, in step (7), the memory grid access interface uses the RPC (remote procedure call) protocol, and the transmitted data are binary streams of serialized resource objects.

8. The method for constructing a distributed memory grid system according to claim 1, characterized in that, in step (7), the grid section interface is used to query grid section information and to issue section loading instructions, which load the specified grid section data into computer memory; the grid data statistical analysis interface is used to obtain multi-condition, multi-dimensional statistical analysis results for headquarters and provincial grid resources; the data query interface provides business systems with queries over the grid resources of the whole network, and the business systems do not need to be concerned with the source or physical location of the data.