
WO2012130489A1 - Method, system, and computer program product for maintaining data consistency between two databases - Google Patents

Info

Publication number
WO2012130489A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
database
destination
consistency
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/EP2012/050721
Other languages
French (fr)
Inventor
Himanshu Kumar SINGH
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens AG
Siemens Corp
Original Assignee
Siemens AG
Siemens Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens AG and Siemens Corp
Publication of WO2012130489A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/273 Asynchronous replication or reconciliation

Definitions

  • the present invention is related to a method, a system and a computer program product for maintaining data consistency between multiple databases.
  • a relational database management system implements a relational database in which data is grouped according to certain common characteristics.
  • the relational grouping of data is manifested in a collection of data tables.
  • Each data table is an array of data records.
  • Each data record is also referred to as a tuple and each tuple has a set of data attributes, each of which is assigned a data value.
  • the data value assigned to a data attribute must satisfy a set of rules and constraints, such as domain constraints and entity and referential integrity constraints, to ensure the integrity of data stored in various data tables in the relational database.
  • the database maintains two distinct kinds of data, namely master data and transactional data.
  • Each computing application requires a set of configuration data in accordance with the desired field of use.
  • Such configuration data is stored as master data in the database.
  • during routine use of the computing application, a large amount of data is generated, which is referred to as transactional data.
  • An enterprise typically implements several distinct instances of a computing application. These distinct instances such as development, testing, training, quality, and production serve different objectives.
  • the production instance is the instance of computing application which is implemented for the actual intended use of the computing application. However, the other instances are equally important as they serve various other important purposes.
  • the development instance is used by the application developers initially to develop the computing application and subsequently to develop new features and new components during various upgrades of the computing application.
  • the testing instance is used to test the computing application using various use cases, whereas the training instance is used to train the potential users in the various features and functionalities of the computing application.
  • the quality instance is used to ensure that the computing application meets all the desired quality-of-service parameters.
  • in order to ensure fault tolerance, the enterprise may maintain a redundant instance of the production instance.
  • Each instance of the computing application implements its own database, and these instances of the computing application exist and are maintained independently.
  • instances are based on a common database schema, which defines the structure of the databases.
  • the master data from the source database is extracted in a spreadsheet and manually entered into the destination database.
  • the underlying idea of the present invention is to provide a declarative specification corresponding to each data entity included in a database schema such that a dependency graph for the data entity is automatically generated from the declarative specification when a data consistency check between a source database and a destination database is required.
  • Various data tables associated with the data entity are included in the dependency graph.
  • the method of the present invention instantiates a source instance and a destination instance of the dependency graph. A first data record in the source instance and a second data record in the destination instance corresponding to the first data record in the source instance are compared to ascertain data
  • the method of maintaining data consistency between a source and a destination database in accordance with the present invention facilitates to eliminate manual
  • the method provides for high accuracy and high reliability in maintaining database consistency between the source and the destination databases.
  • the declarative specification for each data entity further includes information related to a natural key, which is used for identifying the second data record in the destination database corresponding to the first data record in the source database, and a consistency key for performing the
  • each of the natural key and the consistency key is a set of one or more data attributes in one or more data tables in the database schema.
  • the dependency graph comprises a set of nodes, wherein each node represents a data table included in a database schema, and wherein each node is connected to at least one other node in the set of nodes through an edge based on a parent-child relationship between the corresponding data tables.
  • the dependency graph provides information related to a dependency relationship between one or more data tables in the database schema.
  • the step of performing the consistency analysis and the step of managing the database transaction are successively performed corresponding to each data record in accordance with a dependency order starting from a least dependent node in the source instance of the dependency graph.
  • This technical feature facilitates in implementing simultaneous execution of various steps of the method according to the present invention such that the time required for ensuring data consistency is significantly reduced.
  • FIG 1 illustrates a schematic representation of a database system,
  • FIG 2 illustrates a schematic representation of a consistency management system for maintaining data consistency between a source database and a destination database,
  • FIGS 3A-3B illustrate an exemplary representation of a database schema,
  • FIGS 4A-4B illustrate exemplary representations of a declarative specification template and a declarative specification for a data entity,
  • FIG 5 illustrates a schematic representation of a dependency graph,
  • FIG 6 illustrates an exemplary representation of a dependency graph for a data entity,
  • FIGS 7A-7B illustrate an exemplary set of database query templates, and
  • FIG 8 illustrates a flow diagram corresponding to a method for maintaining data consistency between a source database and a destination database.
  • FIG 1 illustrates a schematic representation of a database system 100.
  • the database system 100 includes a consistency management system 102 and a plurality of databases 104a, 104b, 104c, and so on, hereinafter collectively referred to as databases 104, and individually referred to as database 104.
  • Each database 104 represents a database implemented for a specific instance of a computing application. Examples of such instances of a computing application include
  • one or more instances may be implemented in order to ensure fault tolerance.
  • the databases 104 are based on a database schema.
  • the database schema defines a structure of the database 104 in a formal language.
  • the database schema specifies the data tables, the data attributes included in each data table, and various data constraints that are applied to the data values that can be assigned to a data attribute.
  • the consistency management system 102 is adapted to interface with each database 104 and maintain data consistency therein. The detailed operation of the consistency management system 102 is explained in conjunction with FIG 2.
  • consistency management system 102 may be implemented as a central system coordinating with each database 104.
  • consistency management system 102 may be implemented as a federated system with a dedicated
  • the multiple dedicated consistency management systems 102 for each database 104.
  • a database 104 which is known to include at least one recently updated and/or more accurate data record is selected as the source database, and a database 104 which is known to include at least one outdated and/or inaccurate data record is selected as the destination database.
  • although the databases 104 share the database schema as explained above, it is possible that, due to staggered system updates, the database schemas in a source database and a destination database may become inconsistent. In such cases, the database schemas in the source and
  • destination databases are made consistent either before applying the method of the present invention or as an
  • FIG 2 illustrates a schematic representation of the consistency management system 102 for maintaining data consistency between a source database and a destination database.
  • the consistency management system 102 includes a declarative specification module 202, a dependency graph module 204, an instantiation module 206, a consistency analysis module 208, and a transaction management module 210.
  • the declarative specification module 202 provides a
  • the dependency graph module 204 generates a dependency graph for a data entity based on the declarative specification.
  • the instantiation module 206 generates a source instance and a destination instance of the dependency graph.
  • the consistency analysis module 208 performs a consistency analysis for one or more data records included in the source and the
  • the transaction management module 210 performs one or more database transactions to modify one or more data records stored in the destination database.
  • the declarative specification module 202 maintains a
  • the declarative specification corresponding to each data entity included in the database schema defining the source and the destination databases.
  • the declarative specification includes dependency information related to at least one data table corresponding to each data entity.
  • a database architect, a database programmer, or any other user of the database system 100 may define a declarative specification for a data entity. When the database schema is altered, the declarative
  • the declarative specification for each data entity further includes information related to a natural key for identifying a second data record in the destination database
  • Each of the natural key and the consistency key is a set of one or more data attributes in one or more data tables in the database schema.
  • the declarative specification is defined using a suitable rule language as explained in conjunction with FIGS 4A and 4B.
  • the dependency graph module 204 generates a dependency graph for a data entity based on the declarative specification of the data entity stored in the declarative specification module 202.
  • the dependency graph includes a set of nodes interconnected with a plurality of edges. Each node
  • each node represents a data table included in the database schema and each node is connected to at least one other node in the set of nodes through an edge based on a parent-child relationship such that the edge provides information related to inter- dependency and cardinality of relationship between the corresponding data tables.
  • the dependency graph
  • the instantiation module 206 instantiates a source instance of the dependency graph based on a source data filter applied to the source database.
  • the source instance includes one or more data records retrieved from the source database in accordance with the source data filter.
  • the instantiation module 206 also instantiates a destination instance of the dependency graph based on a destination data filter applied to the destination database.
  • the destination instance includes one or more data records retrieved from the
  • the source data filter may be specified by a user of the database system 100. Alternatively, it may be automatically generated through a computer program.
  • the destination data filter is derived from the source data filter using the natural key included in the declarative specification.
  • the consistency analysis module 208 performs a consistency analysis for determining consistency between a first data record in the source instance and a second data record in the destination instance based on a consistency criterion.
  • the consistency analysis module 208 successively selects a first data record in accordance with a dependency order starting from a least dependent node in the source instance of the dependency graph.
  • the consistency analysis module 208 further identifies a second data record in the destination instance corresponding to the first data record based on the natural key included in the declarative specification.
  • the consistency analysis module 208 compares the first data record in the source instance with the second data record in the destination instance to perform the consistency analysis.
  • one or more data attributes in the first and second data records are matched. If the match is successful, the data records are deemed to be consistent; otherwise, the data records are deemed to be inconsistent.
  • the transaction management module 210 manages various database transactions for modifying the second data record in the destination database based on the consistency analysis. In an embodiment of the present invention, if the second data record is determined to be consistent with the first data record, the data values assigned to one or more data
  • the second data record is deleted from the destination database and the first data record is copied from the source database to the
  • the transaction management module 210 successively performs database transactions based on the consistency analysis for each data record retrieved in the source and the destination instances.
  • the instantiation module 206 simultaneously instantiates multiple source instances and multiple destination instances corresponding to the multiple source instances for multiple data entities. Subsequently, each pair of a source instance and a corresponding destination instance is simultaneously processed, such that the time required for ensuring data consistency is significantly reduced. Further, the
  • instantiation module 206 indicates a set of redundant nodes across different pairs of source and destination instances such that the consistency analysis module 208 and the
  • FIGS 3A and 3B illustrate an exemplary representation of a database schema 300.
  • the database schema 300 includes a plurality of data tables such as data table 302, data table 304, data table 306, and so on, and a plurality of
  • Each data table includes a plurality of data attributes.
  • data table 312 includes data attributes such as "Object_ID", "Creation_Time", "Is_Deleted", "Service_Type", "Service_Name", "Common_Definition", and so on.
  • Each relationship 328 is a directed arrow pointing from a child data table to a parent data table.
  • Each child data table includes one or more attributes that have a reference to the parent data table.
  • data table 312 is a child data table with respect to data table 306 as the data attribute "Service_Type" has a reference to data table 306, that is, the data value stored in data attribute
  • “Service_Type” in data table 312 is one of the data values stored in the data attribute "Object_ID” in data table 306.
  • each data table is directly related to one or more other data tables in the database schema 300.
  • a cluster of such related data tables with respect to a given data table is referred to as a catalogue.
  • data table 312, which represents the data entity "Service” is related to data table 304, data table 314, data table 316, data table 318, and data table 306.
  • the data tables 312, 304, 314, 316, 318, and 306 may be logically clustered into a catalogue named as "Service” catalogue.
  • FIGS 4A and 4B illustrate exemplary representations of a declarative specification template 402 and a declarative specification 404 for a data entity, respectively.
  • the declarative specification template 402 is based on a rule language for defining the declarative specification for a data entity. As shown in FIG 4A, the declarative
  • variable fields such as entity, catalogue, data table, primary key(s), foreign key(s), system key(s), natural key(s), and consistency key(s).
  • the declarative specification template 402 indicates that for variable fields such as entity, catalogue, and data table, a desired data value may be provided whereas for other variable fields such as primary key(s), foreign key(s), system key(s), natural key(s), and consistency key(s), a combination of data table and data attribute is specified in accordance with the syntax shown in FIG 4A. It is noteworthy that for variable fields such as primary key(s), foreign key(s), system key(s), natural key(s), and consistency key(s), multiple inputs can be provided. For example, the primary key may be a combination of two or more data
  • the declarative specification 404 corresponds to data entity "Service” illustrated in database schema 300 in FIG 3.
  • the system key is "Service: Object_ID" and the procedure to generate this system key is "Get_Service_Object_ID".
  • the natural key is "Service: Service_Name".
  • the consistency key is "Service: Common_Definition, Common_Definition: Object_ID, Common_Definition: Common_Definition_Name". It is important to emphasize the significance of the natural key, which facilitates identifying a second data record in the destination instance corresponding to a first data record in the source instance. As explained earlier, although the source and the destination databases share the database schema, the databases are completely independent.
  • a primary key which is used to uniquely identify a data record in a data table, may not be same for the corresponding data records in the source and the destination instances.
  • the data table 312 may have data records corresponding to different types of services such as blood test, CT scan, X-ray, and so on. Each service is stored in the data table 312 as a separate data record.
  • the data attribute "Service_Name” is assigned the data value as "Blood Test". This is expected to be same across different databases.
  • the data attribute "Object_ID", which is the primary key for data table 312
  • This numerical reference may vary across different databases and hence cannot be used to establish correspondence between data records in the source instance and the destination instance.
  • the data attribute used as natural key may be a data value directly stored in the data table.
  • the natural key may have a reference to a foreign table as illustrated in FIG 4A.
  • FIG 5 illustrates a schematic representation of a dependency graph 500.
  • the dependency graph 500 includes a plurality of nodes such as 502a, 502b, 502c, and so on, hereinafter collectively referred to as nodes 502 and individually referred to as node 502.
  • the dependency graph 500 includes a plurality of edges 504.
  • the dependency graph 500 includes a set of nodes, in which each node represents a data table included in the database schema. Each node is connected to at least one other node in the set of nodes through an edge based on a parent-child relationship between the corresponding data tables.
  • the dependency graph may be defined as a set {N, E}, where N is a set of nodes 502 and E is a set of edges 504.
  • Each node 502 is defined as a tuple including the following fields: "Table_Name", "Entity_Name", "Is_Catalogue_Parent", and "Records_Set".
  • Entity_Name specifies the data entity corresponding to the node 502
  • value of "Is_Catalogue_Parent” specifies whether the node is a least dependent node with regard to a cluster of nodes grouped in the corresponding catalogue
  • the "Records_Set” is a collection of tuples that satisfy a database filter.
  • the tuple corresponding to node 502 may include a field "Is_Nullable" that specifies whether it is possible to have zero data records in the node 502 in a specific instance of the dependency graph 500.
  • this field is optional, as this information may directly be inferred from a cardinality value associated with the edges 504, as explained below.
  • each edge 504 is a tuple including the following fields: "Parent_Node", "Parent_Cardinality", "Child_Node", and "Child_Cardinality".
  • the value of "Parent_Node" specifies the parent node and "Child_Node" specifies the child node, which are interconnected by the edge 504.
  • the "Parent_Cardinality" and "Child_Cardinality" specify the cardinality relationship between the parent and the child node, such as one-to-many (1:n), many-to-one (n:1), many-to-many (n:n), and so on.
  • node 502a is a parent node for nodes 502b and 502c, which are referred to as child nodes.
  • the dependency graph 500 may have one or more least dependent nodes, which may be defined as nodes that have the least number of parent nodes in the dependency graph 500. As shown in FIG 5, node 502a is a least dependent node in the
  • although the dependency graph 500 depicts only one least dependent node, it is to be noted that it is possible to have multiple least dependent nodes in a dependency graph.
  • a node may be defined as a least dependent node in the context of a catalogue as well. In case multiple least dependent nodes exist, a suitable priority order may be assigned to each least dependent node to perform the method of the present invention.
  • FIG 6 illustrates an exemplary representation of a dependency graph 600 for the data entity "Service”.
  • the dependency graph 600 includes node 602 corresponding to data table 304, node 604 corresponding to data table 306, node 606 corresponding to data table 312, and node 608 corresponding to data table 318.
  • the dependency graph module 204 retrieves the declarative specification of data entity “Service” from the declarative specification module 202 and generates the dependency graph 600.
  • the dependency graph 600 takes into account various constraints such as foreign keys and other dependencies associated with data entity "Service”.
  • the dependency graph 600 is subsequently instantiated to generate a source instance and a destination instance to perform subsequent steps in accordance with various
  • FIGS 7A and 7B illustrate an exemplary set of database query templates such as a load query template 702, a source data filter template 704, a destination data filter template 706, a consistency criterion template 708, and a database transaction query template 710. These templates correspond to data table 312 illustrated in FIG 3 and the corresponding dependency graph 600 illustrated in FIG 6. It should be noted that these templates are exemplary in nature and the specific templates will vary according to the relevant database schema.
  • the load query template 702 provides a template to
  • the value for the source data filter for an entry node may be specified
  • destination data filter is derived from the source data filter based on the natural keys defined in the declarative specification corresponding to the data entity "Service”.
  • the consistency criterion template 708 provides the template to match the data values stored in data attributes
  • the database transaction query template 710 provides the template to perform a database transaction to ensure that a data record in the destination database is consistent with a corresponding data record in the source database.
  • FIG 8 illustrates a flow diagram corresponding to a method for maintaining data consistency between a source database and a destination database.
  • a declarative specification for a data entity is provided. As explained in conjunction with FIG 2, the
  • declarative specification includes information related to at least one data table corresponding to the data entity.
  • a dependency graph for the data entity is generated.
  • the dependency graph includes information related to a dependency relationship between one or more data tables in the database schema based on the declarative specification for the data entity.
  • a source instance of the dependency graph is instantiated.
  • the source instance includes one or more data records retrieved from the source database based on a source data filter.
  • a destination instance of the dependency graph is instantiated.
  • the destination instance includes one or more data records retrieved from the destination database based on a destination data filter.
  • the destination data filter is derived from the source data filter based on the natural key included in the declarative specification of the data entity.
  • a consistency analysis is performed for determining consistency between a first data record in the source instance and a second data record in the destination instance based on a consistency criterion.
  • the second record in the destination database corresponding to the first record in the source database is identified based on a natural key, while the consistency criterion is derived from a consistency key included in the declarative specification.
  • a database transaction is managed for modifying the second data record in the destination database based on the consistency analysis. Therefore, the method, the system, and the computer program product for maintaining data consistency between multiple databases in accordance with the present invention facilitate eliminating manual intervention and reduce time required for ensuring data consistency. Further, the present invention provides for high accuracy and high reliability in
  • the present invention is particularly advantageous for facilitating large-scale data migration from one database to another and data merging between two databases.
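
The node and edge tuples and the dependency order described in the list above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of the invention: the class names, the use of Kahn's algorithm for ordering, the treatment of a "least dependent node" as a node with no parents, and the table names "Service_Type" and "Common_Definition" are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    # Field names mirror the node tuple described in the text.
    table_name: str
    entity_name: str
    is_catalogue_parent: bool = False
    records_set: list = field(default_factory=list)


@dataclass(frozen=True)
class Edge:
    # Field names mirror the edge tuple described in the text.
    parent_node: str
    parent_cardinality: str
    child_node: str
    child_cardinality: str


def dependency_order(nodes, edges):
    """Return table names so that every parent precedes its children.

    Kahn's algorithm is this sketch's choice; the text only requires
    that processing starts from a least dependent node.
    """
    children = {name: [] for name in nodes}
    pending_parents = {name: 0 for name in nodes}
    for edge in edges:
        children[edge.parent_node].append(edge.child_node)
        pending_parents[edge.child_node] += 1

    # Assumption: a least dependent node is one without parents.
    # Sorting keeps the result deterministic.
    ready = deque(sorted(n for n, c in pending_parents.items() if c == 0))
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for child in sorted(children[name]):
            pending_parents[child] -= 1
            if pending_parents[child] == 0:
                ready.append(child)

    if len(order) != len(nodes):
        raise ValueError("dependency graph contains a cycle")
    return order


# Hypothetical graph for the "Service" entity: two parent tables, one child.
tables = ["Service_Type", "Common_Definition", "Service"]
nodes = {name: Node(name, "Service") for name in tables}
edges = [
    Edge("Service_Type", "1", "Service", "n"),
    Edge("Common_Definition", "1", "Service", "n"),
]
print(dependency_order(nodes, edges))
# ['Common_Definition', 'Service_Type', 'Service']
```

Processing tables in this order means a child record is only examined after the parent records it refers to, which is one way to avoid breaking referential integrity in the destination database.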

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a method, a system and a computer program product for maintaining data consistency between multiple databases. A dependency graph for a data entity is generated based on a declarative specification of the data entity. A source instance and a destination instance of the dependency graph are instantiated. Each data record in the source instance and a corresponding data record in the destination instance are compared to ascertain data consistency therein. One or more database transactions are performed based on the result of such consistency analysis.
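
The comparison step summarized above can be illustrated with a minimal Python sketch. It assumes, as the specification describes, that corresponding records are found through a natural key and compared on a consistency key. The function names, the dictionary representation of records, the "insert" and "update" action labels, and the sample data values are hypothetical and serve only as an example.

```python
def natural_key_of(record, natural_key):
    """Build a hashable identity for a record from its natural-key attributes."""
    return tuple(record[attribute] for attribute in natural_key)


def plan_transactions(source_records, destination_records,
                      natural_key, consistency_key):
    """Compare source and destination records and list the changes needed.

    A destination record corresponds to a source record when both carry
    the same natural-key values. The pair is treated as consistent when
    all consistency-key attributes also match.
    """
    destination_by_key = {
        natural_key_of(record, natural_key): record
        for record in destination_records
    }
    plan = []
    for source in source_records:
        key = natural_key_of(source, natural_key)
        destination = destination_by_key.get(key)
        if destination is None:
            # Assumption: a record missing from the destination is inserted.
            plan.append(("insert", key))
        elif any(source[a] != destination[a] for a in consistency_key):
            plan.append(("update", key))
        # Consistent pairs need no database transaction.
    return plan


# Object_ID differs between the databases, so it cannot identify a match.
source = [
    {"Object_ID": 11, "Service_Name": "Blood Test", "Common_Definition": "Lab"},
    {"Object_ID": 12, "Service_Name": "CT Scan", "Common_Definition": "Imaging"},
    {"Object_ID": 13, "Service_Name": "X-ray", "Common_Definition": "Imaging"},
]
destination = [
    {"Object_ID": 501, "Service_Name": "Blood Test", "Common_Definition": "Lab"},
    {"Object_ID": 502, "Service_Name": "CT Scan", "Common_Definition": "Radiology"},
]
print(plan_transactions(source, destination,
                        ["Service_Name"], ["Common_Definition"]))
# [('update', ('CT Scan',)), ('insert', ('X-ray',))]
```

Only the planning of transactions is shown; how each planned action is executed against the destination database is left open here.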

Description

METHOD, SYSTEM, AND COMPUTER PROGRAM PRODUCT FOR MAINTAINING DATA CONSISTENCY BETWEEN TWO DATABASES
The present invention is related to a method, a system and a computer program product for maintaining data consistency between multiple databases.
Various computing applications used in enterprise systems involve management of huge volumes of data. Towards this end, various relational database management systems such as those offered by Oracle®, Microsoft®, and IBM® are commonly used.
A relational database management system implements a relational database in which data is grouped according to certain common characteristics. The relational grouping of data is manifested in a collection of data tables. Each data table is an array of data records. Each data record is also referred to as a tuple and each tuple has a set of data attributes, each of which is assigned a data value. The data value assigned to a data attribute must satisfy a set of rules and constraints, such as domain constraints and entity and referential integrity constraints, to ensure the integrity of data stored in various data tables in the relational database.
Broadly speaking, the database maintains two distinct kinds of data, namely master data and transactional data. Each computing application requires a set of configuration data in accordance with the desired field of use. Such configuration data is stored as master data in the database. During routine use of the computing application, a large amount of data is generated, which is referred to as transactional data.
An enterprise typically implements several distinct instances of a computing application. These distinct instances, such as development, testing, training, quality, and production, serve different objectives. The production instance is the instance of the computing application which is implemented for the actual intended use of the computing application. However, the other instances are equally important as they serve various other important purposes. The development instance is used by the application developers initially to develop the computing application and subsequently to develop new features and new components during various upgrades of the computing application. The testing instance is used to test the computing application using various use cases, whereas the training instance is used to train the potential users in the various features and functionalities of the computing application. The quality instance is used to ensure that the computing application meets all the desired quality-of-service parameters. Further, in order to ensure fault tolerance, the enterprise may maintain a redundant instance of the production instance.
Each instance of the computing application implements its own database, and these instances of the computing application exist and are maintained independently.
It is noteworthy that the databases across different instances are based on a common database schema, which defines the structure of the databases.
Although the databases are based on the common database schema, owing to staggered data creation and updates in the different instances, the data stored in the databases of these instances tends to become inconsistent. Such inconsistency of data, especially the inconsistency of the master data, across different instances of the computing application is highly undesirable. Various systems and methods have been developed in the prior art to address the need for maintaining data consistency between multiple databases implemented in different instances of a computing application. In one of the methods known in the art to ensure data consistency between a source database and a destination database, the master data from the source database is extracted in a spreadsheet and manually entered into the destination database. Evidently, this method is quite time- and labor-intensive and prone to human errors.
In another method known in the art, a set of tables containing the master data in the destination database is removed and a corresponding set of tables is copied from the source database. This approach becomes extremely time-consuming in the case of large databases and, more importantly, it leads to loss of referential integrity in the destination database.
Accordingly, there is a need for a system and a method for maintaining data consistency between multiple databases implemented in different instances of a computing application.
It is an object of the present invention to provide a system, a method, and a computer program product for maintaining data consistency between multiple databases.
The object of the present invention is achieved by a method according to claim 1, a system according to claim 6, and a computer program product according to claim 11. Further embodiments of the present invention are addressed in the dependent claims.
The underlying idea of the present invention is to provide a declarative specification corresponding to each data entity included in a database schema such that a dependency graph for the data entity is automatically generated from the declarative specification when a data consistency check between a source database and a destination database is required. Various data tables associated with the data entity are included in the dependency graph. The method of the present invention instantiates a source instance and a destination instance of the dependency graph. A first data record in the source instance and a second data record in the destination instance corresponding to the first data record in the source instance are compared to ascertain data consistency therein. One or more database transactions are performed based on the result of such consistency analysis. Therefore, the method of maintaining data consistency between a source and a destination database in accordance with the present invention helps eliminate manual intervention and reduce the time required for ensuring data consistency. Further, the method provides for high accuracy and high reliability in maintaining database consistency between the source and the destination databases.
In accordance with an embodiment of the present invention, the declarative specification for each data entity further includes information related to a natural key, which is used for identifying the second data record in the destination database corresponding to the first data record in the source database, and a consistency key for performing the consistency analysis, wherein each of the natural key and the consistency key is a set of one or more data attributes in one or more data tables in the database schema. These technical features facilitate identifying a data record in the destination database corresponding to a data record in the source database, and further provide a consistency criterion to perform a consistency analysis and hence, determine a suitable database transaction that should be performed in a subsequent step.
In accordance with another embodiment of the present invention, the dependency graph comprises a set of nodes, wherein each node represents a data table included in a database schema, and wherein each node is connected to at least one other node in the set of nodes through an edge based on a parent-child relationship between the corresponding data tables. Thus, the dependency graph provides information related to a dependency relationship between one or more data tables in the database schema. This technical feature of the present invention facilitates determining the order in which data records in various data tables associated with a data entity should be processed to ensure data consistency between the source and the destination databases.
In accordance with another embodiment of the present invention, the step of performing the consistency analysis and the step of managing the database transaction are successively performed corresponding to each data record in accordance with a dependency order starting from a least dependent node in the source instance of the dependency graph. This technical feature ensures that a plurality of data records stored in multiple data tables are systematically processed such that various database constraints are complied with while performing various database transactions.
In accordance with another embodiment of the present invention, a plurality of source instances and a corresponding plurality of destination instances are instantiated for a plurality of data entities, and each pair of a source instance and a corresponding destination instance is simultaneously processed. This technical feature facilitates simultaneous execution of various steps of the method according to the present invention such that the time required for ensuring data consistency is significantly reduced.
The present invention is further described hereinafter with reference to illustrated embodiments shown in the accompanying drawings, in which:
FIG 1 illustrates a schematic representation of a database system,
FIG 2 illustrates a schematic representation of a consistency management system for maintaining data consistency between a source database and a destination database,
FIGS 3A-3B illustrate an exemplary representation of a database schema,
FIGS 4A-4B illustrate exemplary representations of a declarative specification template and a declarative specification for a data entity respectively,
FIG 5 illustrates a schematic representation of a dependency graph,
FIG 6 illustrates an exemplary representation of a dependency graph for a data entity,
FIGS 7A-7B illustrate an exemplary set of database query templates, and
FIG 8 illustrates a flow diagram corresponding to a method for maintaining data consistency between a source database and a destination database.
Various embodiments are described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purpose of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more embodiments. It may be evident that such embodiments may be practiced without these specific details.
FIG 1 illustrates a schematic representation of a database system 100. The database system 100 includes a consistency management system 102 and a plurality of databases 104a, 104b, 104c, and so on, hereinafter collectively referred to as databases 104, and individually referred to as database 104. Each database 104 represents a database implemented for a specific instance of a computing application. Examples of such instances of a computing application include development, testing, training, quality, and production. In addition, one or more instances may be implemented in order to ensure fault tolerance.
These databases 104 are based on a database schema. The database schema defines a structure of the database 104 in a formal language. The database schema specifies the data tables, the data attributes included in each data table, and various data constraints that are applied on data values that can be assigned to a data attribute. Thus, there is a correspondence between the data tables and data attributes stored in the databases 104.
The consistency management system 102 is adapted to interface with each database 104 and maintain data consistency therein. The detailed operation of the consistency management system 102 is explained in conjunction with FIG 2.
In various embodiments of the present invention, the consistency management system 102 may be implemented as a central system coordinating with each database 104. Alternatively, the consistency management system 102 may be implemented as a federated system with a dedicated consistency management system 102 for each database 104. The multiple dedicated consistency management systems 102 coordinate with each other through a suitable communication protocol.
The present invention will hereinafter be described with reference to a source database and a destination database. In general, a database 104 which is known to include at least one recently updated and/or more accurate data record is selected as the source database, and a database 104 which is known to include at least one outdated and/or inaccurate data record is selected as the destination database.
It should be noted that although the databases 104 share the database schema as explained above, it is possible that due to staggered system updates, the database schemas in a source database and a destination database may become inconsistent. In such cases, the database schemas in the source and destination databases are made consistent either before applying the method of the present invention or as an intermediate step while executing the steps of the method of the present invention.
FIG 2 illustrates a schematic representation of the consistency management system 102 for maintaining data consistency between a source database and a destination database. The consistency management system 102 includes a declarative specification module 202, a dependency graph module 204, an instantiation module 206, a consistency analysis module 208, and a transaction management module 210.
The declarative specification module 202 provides a declarative specification corresponding to one or more data entities included in the database schema. The dependency graph module 204 generates a dependency graph for a data entity based on the declarative specification. The instantiation module 206 generates a source instance and a destination instance of the dependency graph. The consistency analysis module 208 performs a consistency analysis for one or more data records included in the source and the destination instances. The transaction management module 210 performs one or more database transactions to modify one or more data records stored in the destination database.
The declarative specification module 202 maintains a declarative specification corresponding to each data entity included in the database schema defining the source and the destination databases. The declarative specification includes dependency information related to at least one data table corresponding to each data entity. In accordance with various embodiments of the present invention, a database architect, a database programmer, or any other user of the database system 100 may define a declarative specification for a data entity. When the database schema is altered, the declarative specification is also suitably modified to incorporate one or more changes in the database schema.
The declarative specification for each data entity further includes information related to a natural key for identifying a second data record in the destination database corresponding to a first data record in the source database, and a consistency key for performing the consistency analysis. Each of the natural key and the consistency key is a set of one or more data attributes in one or more data tables in the database schema. The declarative specification is defined using a suitable rule language as explained in conjunction with FIGS 4A and 4B.
The dependency graph module 204 generates a dependency graph for a data entity based on the declarative specification of the data entity stored in the declarative specification module 202. The dependency graph includes a set of nodes interconnected with a plurality of edges. Each node represents a data table included in the database schema and each node is connected to at least one other node in the set of nodes through an edge based on a parent-child relationship such that the edge provides information related to interdependency and cardinality of relationship between the corresponding data tables. Thus, the dependency graph provides information related to a dependency relationship between one or more data tables defined in the database schema. The dependency graph will be explained in more detail in conjunction with FIGS 5 and 6.
The instantiation module 206 instantiates a source instance of the dependency graph based on a source data filter applied to the source database. Thus, the source instance includes one or more data records retrieved from the source database in accordance with the source data filter. The instantiation module 206 also instantiates a destination instance of the dependency graph based on a destination data filter applied to the destination database. Thus, the destination instance includes one or more data records retrieved from the destination database in accordance with the destination data filter.
The source data filter may be specified by a user of the database system 100. Alternatively, it may be automatically generated through a computer program. The destination data filter is derived from the source data filter using the natural key included in the declarative specification.
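The derivation of the destination data filter can be illustrated with a minimal Python sketch. It assumes that the source instance is available as a list of dictionaries and that the filter is expressed as a parameterized SQL WHERE clause; the function name derive_destination_filter and the sample records are hypothetical and are not taken from the drawings.

```python
def derive_destination_filter(source_records, natural_key):
    """Build a parameterized WHERE clause that selects the destination
    records whose natural key matches a record of the source instance.

    Assumption: attribute names come from the declarative specification
    (trusted input), so only the values are passed as parameters.
    """
    keys = sorted({tuple(rec[attr] for attr in natural_key)
                   for rec in source_records})
    if not keys:
        return "1 = 0", []  # empty source instance: select nothing
    one_key = "(" + " AND ".join(f"{attr} = ?" for attr in natural_key) + ")"
    clause = " OR ".join([one_key] * len(keys))
    params = [value for key in keys for value in key]
    return clause, params


# Hypothetical source instance for the data entity "Service".
source_instance = [
    {"Object_ID": "013314", "Service_Name": "Blood Test"},
    {"Object_ID": "013315", "Service_Name": "X-ray"},
]
clause, params = derive_destination_filter(source_instance, ("Service_Name",))
```

Note that the primary key values of the source records do not appear in the derived filter, since they are not expected to match across databases.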
The consistency analysis module 208 performs a consistency analysis for determining consistency between a first data record in the source instance and a second data record in the destination instance based on a consistency criterion.
The consistency analysis module 208 successively selects a first data record in accordance with a dependency order starting from a least dependent node in the source instance of the dependency graph. The consistency analysis module 208 further identifies a second data record in the destination instance corresponding to the first data record based on the natural key included in the declarative specification.
Subsequently, the consistency analysis module 208 compares the first data record in the source instance with the second data record in the destination instance to perform the consistency analysis.
In order to perform the consistency analysis, one or more data attributes in the first and second data records are matched. If the match is successful, the data records are deemed to be consistent; otherwise, the data records are deemed to be inconsistent.
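The matching step may be sketched as follows in Python. The sketch assumes that both data records are dictionaries and that the attributes of the consistency key have already been resolved to values that are comparable across databases (for example, names rather than database-local identifiers); the function name is_consistent and the sample values are hypothetical.

```python
def is_consistent(first_record, second_record, consistency_key):
    """Return True when every attribute of the consistency key holds the
    same value in the source record and the destination record."""
    return all(first_record[attr] == second_record[attr]
               for attr in consistency_key)


# Hypothetical records; the attribute names follow the "Service" example.
consistency_key = ("Common_Definition", "Service_Type")
first = {"Service_Name": "Blood Test",
         "Common_Definition": "Laboratory", "Service_Type": "Diagnostic"}
matching = {"Service_Name": "Blood Test",
            "Common_Definition": "Laboratory", "Service_Type": "Diagnostic"}
differing = {"Service_Name": "Blood Test",
             "Common_Definition": "Laboratory", "Service_Type": "Therapeutic"}
```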
The transaction management module 210 manages various database transactions for modifying the second data record in the destination database based on the consistency analysis. In an embodiment of the present invention, if the second data record is determined to be consistent with the first data record, the data values assigned to one or more data attributes in the second data record are modified to corresponding data values included in the first data record. Alternatively, if the second data record is determined to be inconsistent with the first data record, the second data record is deleted from the destination database and the first data record is copied from the source database to the destination database. The transaction management module 210 successively performs database transactions based on the consistency analysis for each data record retrieved in the source and the destination instances.
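The choice of database transaction described above may be sketched in Python as a small planning function. The operation names, the function name plan_transaction, and the handling of a missing second data record (copying the first data record) are assumptions made for illustration; the description does not prescribe them. The primary key is excluded from updates on the assumption that it is local to each database.

```python
def plan_transaction(first, second, consistent, primary_key=("Object_ID",)):
    """Return the list of operations to apply to the destination database.

    first      -- data record from the source instance (dict)
    second     -- corresponding destination record, or None if absent
    consistent -- result of the consistency analysis
    """
    if second is None:
        # Assumption: a record missing in the destination is simply copied.
        return [("copy", first)]
    if consistent:
        # Modify differing attribute values to those of the source record.
        changes = {attr: value for attr, value in first.items()
                   if attr not in primary_key and second.get(attr) != value}
        return [("update", changes)] if changes else []
    # Inconsistent: delete the destination record, then copy the source record.
    return [("delete", second), ("copy", first)]


first = {"Object_ID": "013314", "Service_Name": "Blood Test", "Description": "new"}
second = {"Object_ID": "110072", "Service_Name": "Blood Test", "Description": "old"}
```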
In accordance with an embodiment of the present invention, the instantiation module 206 simultaneously instantiates multiple source instances and multiple destination instances corresponding to the multiple source instances for multiple data entities. Subsequently, each pair of a source instance and a corresponding destination instance is simultaneously processed, such that the time required for ensuring data consistency is significantly reduced. Further, the instantiation module 206 indicates a set of redundant nodes across different pairs of source and destination instances such that the consistency analysis module 208 and the transaction management module 210 process the redundant nodes only once.
FIGS 3A and 3B illustrate an exemplary representation of a database schema 300. The database schema 300 includes a plurality of data tables such as data table 302, data table 304, data table 306, and so on, and a plurality of relationships 328. Each data table includes a plurality of data attributes. For example, data table 312 includes data attributes such as "Object_ID", "Creation_Time", "Is_Deleted", "Service_Type", "Service_Name", "Common_Definition", and so on. Each relationship 328 is a directed arrow pointing from a child data table to a parent data table. Each child data table includes one or more attributes that have a reference to the parent data table. For example, data table 312 is a child data table with respect to data table 306 as the data attribute "Service_Type" has a reference to data table 306, that is, the data value stored in data attribute "Service_Type" in data table 312 is one of the data values stored in the data attribute "Object_ID" in data table 306.
As evident from FIGS 3A and 3B, each data table is directly related to one or more other data tables in the database schema 300. A cluster of such related data tables with respect to a given data table is referred to as a catalogue. For example, data table 312, which represents the data entity "Service", is related to data table 304, data table 314, data table 316, data table 318, and data table 306. The data tables 312, 304, 314, 316, 318, and 306 may be logically clustered into a catalogue named the "Service" catalogue.
Various technical features of the present invention will be explained with reference to the database schema 300. It should be noted that the database schema 300 shown in FIGS 3A and 3B is a simplified database schema. In practical applications, the database schema includes hundreds of data tables and each data table includes several data attributes.
FIGS 4A and 4B illustrate exemplary representations of a declarative specification template 402 and a declarative specification 404 for a data entity respectively. The declarative specification template 402 is based on a rule language for defining the declarative specification for a data entity. As shown in FIG 4A, the declarative specification template 402 includes variable fields such as entity, catalogue, data table, primary key(s), foreign key(s), system key(s), natural key(s), and consistency key(s). The declarative specification template 402 indicates that for variable fields such as entity, catalogue, and data table, a desired data value may be provided, whereas for other variable fields such as primary key(s), foreign key(s), system key(s), natural key(s), and consistency key(s), a combination of data table and data attribute is specified in accordance with the syntax shown in FIG 4A. It is noteworthy that for variable fields such as primary key(s), foreign key(s), system key(s), natural key(s), and consistency key(s), multiple inputs can be provided. For example, the primary key may be a combination of two or more data attributes; in such cases, multiple inputs may be provided for the primary key(s) variable field. The declarative specification 404 corresponds to data entity "Service" illustrated in database schema 300 in FIG 3.
As evident from FIG 4B, the entity is "Service", the catalogue is "Service", the data table is "Service", the primary key is "Object_ID", and the two foreign keys are "Service: Common_Definition, Common_Definition: Object_ID" and "Service: Service_Type, Service_Type: Object_ID". The system key is "Service: Object_ID" and the procedure to generate this system key is "Get_Service_Object_ID". The natural key is "Service: Service_Name". The consistency key is "Service: Common_Definition, Common_Definition: Object_ID, Common_Definition: Common_Definition_Name".
It is important to emphasize the significance of the natural key, which facilitates identifying a second data record in the destination instance corresponding to a first data record in the source instance. As explained earlier, although the source and the destination databases share the database schema, the databases are completely independent. Hence, a primary key, which is used to uniquely identify a data record in a data table, may not be the same for the corresponding data records in the source and the destination instances. Let us consider, for example, data table 312, which is related to the data entity "Service". The data table 312 may have data records corresponding to different types of services such as blood test, CT scan, X-ray, and so on. Each service is stored in the data table 312 as a separate data record. Considering, for example, blood test, when the data record is created, the data attribute "Service_Name" is assigned the data value "Blood Test". This is expected to be the same across different databases. However, the data attribute "Object_ID", which is the primary key for data table 312, is created automatically through a computer program and is normally a numerical reference, such as 013314 or 110072 and so on. This numerical reference may vary across different databases and hence cannot be used to establish correspondence between data records in the source instance and the destination instance. Hence, there is a need to specify a data attribute which is expected to have the same data value in different databases.
The data attribute used as natural key may be a data value directly stored in the data table. Alternatively, the natural key may have a reference to a foreign table as illustrated in FIG 4A.
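The role of the natural key may be illustrated with a short Python sketch. The class DeclarativeSpec is a reduced, hypothetical stand-in for the declarative specification (it omits the foreign, system, and consistency keys), and the sample records merely reuse the numerical references mentioned above; neither is the rule language of FIGS 4A and 4B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeclarativeSpec:
    """Reduced, hypothetical view of a declarative specification."""
    entity: str
    data_table: str
    primary_key: tuple
    natural_key: tuple


SERVICE_SPEC = DeclarativeSpec(
    entity="Service",
    data_table="Service",
    primary_key=("Object_ID",),
    natural_key=("Service_Name",),
)


def natural_key_of(record, spec):
    """Extract the natural key values of a record as a tuple."""
    return tuple(record[attr] for attr in spec.natural_key)


def match_by_natural_key(source_records, destination_records, spec):
    """Pair each source record with the destination record that shares its
    natural key, or with None when no such destination record exists."""
    by_key = {natural_key_of(rec, spec): rec for rec in destination_records}
    return [(rec, by_key.get(natural_key_of(rec, spec)))
            for rec in source_records]


# The primary keys differ between the databases; the natural keys do not.
source = [{"Object_ID": "013314", "Service_Name": "Blood Test"},
          {"Object_ID": "013315", "Service_Name": "CT Scan"}]
destination = [{"Object_ID": "110072", "Service_Name": "Blood Test"}]
pairs = match_by_natural_key(source, destination, SERVICE_SPEC)
```

In this sketch the "Blood Test" records are paired even though their "Object_ID" values differ, while the "CT Scan" record has no counterpart in the destination.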
FIG 5 illustrates a schematic representation of a dependency graph 500. The dependency graph 500 includes a plurality of nodes such as 502a, 502b, 502c, and so on, hereinafter collectively referred to as nodes 502 and individually referred to as node 502. The dependency graph 500 includes a plurality of edges 504. As explained earlier in conjunction with FIG 2, the dependency graph 500 includes a set of nodes, in which each node represents a data table included in the database schema. Each node is connected to at least one other node in the set of nodes through an edge based on a parent-child relationship between the corresponding data tables.
Thus, the dependency graph may be defined as a set {N, E}, where N is a set of nodes 502 and E is a set of edges 504. Each node 502 is defined as a tuple including the following fields: "Table_Name", "Entity_Name", "Is_Catalogue_Parent", and "Records_Set". The value of "Table_Name" specifies the data table corresponding to the node 502, the value of "Entity_Name" specifies the data entity corresponding to the node 502, the value of "Is_Catalogue_Parent" specifies whether the node is a least dependent node with regard to a cluster of nodes grouped in the corresponding catalogue, and the "Records_Set" is a collection of tuples that satisfy a database filter.
In addition, the tuple corresponding to node 502 may include a field "Is_Nullable" that specifies whether it is possible to have zero data records in the node 502 in a specific instance of the dependency graph 500. However, this field is optional, as this information may directly be inferred from a cardinality value associated with the edges 504 as explained below.
Similarly, each edge 504 is a tuple including the following fields: "Parent_Node", "Parent_Cardinality", "Child_Node", and "Child_Cardinality". The value of "Parent_Node" specifies the parent node and "Child_Node" specifies the child node, which are interconnected by the edge 504. The "Parent_Cardinality" and "Child_Cardinality" specify the cardinality relationship between the parent and the child node, such as one-to-many (1:n), many-to-one (n:1), many-to-many (n:n), and so on. As shown in FIG 5, node 502a is a parent node for nodes 502b and 502c, which are referred to as child nodes.
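The node and edge tuples described above may be sketched in Python as named tuples. The field names follow the description; the table names, the cardinality values, and the helper parents_of are assumptions chosen for illustration based on the foreign keys of the "Service" example, not a reproduction of any figure.

```python
from typing import NamedTuple


class Node(NamedTuple):
    Table_Name: str
    Entity_Name: str
    Is_Catalogue_Parent: bool
    Records_Set: tuple = ()   # filled in when the graph is instantiated


class Edge(NamedTuple):
    Parent_Node: str
    Parent_Cardinality: str
    Child_Node: str
    Child_Cardinality: str


# Hypothetical graph {N, E}: "Service" refers to two parent tables.
N = {
    "Service_Type": Node("Service_Type", "Service", True),
    "Common_Definition": Node("Common_Definition", "Service", True),
    "Service": Node("Service", "Service", False),
}
E = [
    Edge("Common_Definition", "1", "Service", "n"),  # assumed one-to-many
    Edge("Service_Type", "1", "Service", "n"),       # assumed one-to-many
]


def parents_of(table_name, edges):
    """List the parent tables of a node, in edge order."""
    return [e.Parent_Node for e in edges if e.Child_Node == table_name]
```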
The dependency graph 500 may have one or more least dependent nodes, which may be defined as nodes that have the fewest parent nodes in the dependency graph 500. As shown in FIG 5, node 502a is a least dependent node in the dependency graph 500. Although the dependency graph 500, as shown in FIG 5, depicts only one least dependent node, it is to be noted that it is possible to have multiple least dependent nodes in a dependency graph. As mentioned before, a node may be defined as a least dependent node in the context of a catalogue as well. In case multiple least dependent nodes exist, a suitable priority order may be assigned to each least dependent node to perform the method of the present invention.
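One possible way to obtain the least dependent nodes and a dependency order is sketched below in Python. The use of a topological sort (via the standard-library module graphlib, available from Python 3.9) is an assumption of this sketch; the description only requires that processing starts from a least dependent node. Node labels reuse the reference numerals of FIG 5, and edges are given as hypothetical (parent, child) pairs.

```python
from graphlib import TopologicalSorter


def least_dependent_nodes(nodes, edges):
    """Return the nodes having the fewest parent nodes."""
    parent_count = {node: 0 for node in nodes}
    for _parent, child in edges:
        parent_count[child] += 1
    fewest = min(parent_count.values())
    return [node for node in nodes if parent_count[node] == fewest]


def dependency_order(nodes, edges):
    """Return an order in which every parent precedes its children."""
    sorter = TopologicalSorter({node: set() for node in nodes})
    for parent, child in edges:
        sorter.add(child, parent)   # child depends on parent
    return list(sorter.static_order())


nodes = ["502a", "502b", "502c"]
edges = [("502a", "502b"), ("502a", "502c")]
order = dependency_order(nodes, edges)
```

Processing data records in this order means that a parent record is already consistent in the destination database before any child record that refers to it is handled.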
FIG 6 illustrates an exemplary representation of a dependency graph 600 for the data entity "Service". The dependency graph 600 includes node 602 corresponding to data table 304, node 604 corresponding to data table 306, node 606 corresponding to data table 312, and node 608 corresponding to data table 318.
When a user selects the data entity "Service" for consistency analysis, the dependency graph module 204 retrieves the declarative specification of data entity "Service" from the declarative specification module 202 and generates the dependency graph 600. The dependency graph 600 takes into account various constraints such as foreign keys and other dependencies associated with data entity "Service".
The dependency graph 600 is subsequently instantiated to generate a source instance and a destination instance to perform subsequent steps in accordance with various embodiments of the present invention as described herein.
FIGS 7A and 7B illustrate an exemplary set of database query templates such as a load query template 702, a source data filter template 704, a destination data filter template 706, a consistency criterion template 708, and a database transaction query template 710. These templates correspond to data table 312 illustrated in FIG 3 and the corresponding dependency graph 600 illustrated in FIG 6. It should be noted that these templates are exemplary in nature and the specific templates will vary according to the relevant database schema.
The load query template 702 provides a template to instantiate the source and the destination instances using the source data filter template 704 and the destination data filter template 706 respectively.
As mentioned earlier, during execution, the value for the source data filter for an entry node may be specified directly by a user of the database system 100. Alternatively, it may be provided by an automated script or similar means. The value assigned to the destination data filter, in contrast, is derived from the source data filter based on the natural keys defined in the declarative specification corresponding to the data entity "Service".
The consistency criterion template 708 provides the template to match the data values stored in data attributes "Common_Definition" and "Service_Type" in the corresponding data records included in the source instance and the destination instance.
The database transaction query template 710 provides the template to perform a database transaction to ensure that a data record in the destination database is consistent with a corresponding data record in the source database.
FIG 8 illustrates a flow diagram corresponding to a method for maintaining data consistency between a source database and a destination database.
At step 802, a declarative specification for a data entity is provided. As explained in conjunction with FIG 2, the declarative specification includes information related to at least one data table corresponding to the data entity.
At step 804, a dependency graph for the data entity is generated. The dependency graph includes information related to a dependency relationship between one or more data tables in the database schema based on the declarative specification for the data entity.
At step 806, a source instance of the dependency graph is instantiated. The source instance includes one or more data records retrieved from the source database based on a source data filter.
At step 808, a destination instance of the dependency graph is instantiated. The destination instance includes one or more data records retrieved from the destination database based on a destination data filter. The destination data filter is derived from the source data filter based on the natural key included in the declarative specification of the data entity.
At step 810, a consistency analysis is performed for determining consistency between a first data record in the source instance and a second data record in the destination instance based on a consistency criterion. The second record in the destination database corresponding to the first record in the source database is identified based on a natural key, while the consistency criterion is derived from a consistency key included in the declarative specification.
At step 812, a database transaction is managed for modifying the second data record in the destination database based on the consistency analysis.
Therefore, the method, the system, and the computer program product for maintaining data consistency between multiple databases in accordance with the present invention help eliminate manual intervention and reduce the time required for ensuring data consistency. Further, the present invention provides for high accuracy and high reliability in maintaining database consistency between two databases.
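Steps 806 to 812 may be sketched end to end in Python using the standard-library sqlite3 module. The sketch is deliberately reduced to a single hypothetical "Service" table, so no dependency ordering is needed; the column "Description", the choice of "Service_Type" as the only consistency key attribute, and the copying of records that are missing in the destination are assumptions of this sketch rather than features taken from the drawings.

```python
import sqlite3

SCHEMA = ("CREATE TABLE Service (Object_ID INTEGER PRIMARY KEY, "
          "Service_Name TEXT UNIQUE, Service_Type TEXT, Description TEXT)")


def make_db(rows):
    """Create an in-memory database holding the hypothetical Service table."""
    db = sqlite3.connect(":memory:")
    db.row_factory = sqlite3.Row
    db.execute(SCHEMA)
    db.executemany("INSERT INTO Service VALUES (?, ?, ?, ?)", rows)
    db.commit()
    return db


def copy_record(db, rec):
    # The destination assigns its own Object_ID (system key).
    db.execute("INSERT INTO Service (Service_Name, Service_Type, Description) "
               "VALUES (?, ?, ?)",
               (rec["Service_Name"], rec["Service_Type"], rec["Description"]))


def synchronize(source_db, destination_db, source_filter="1 = 1", params=()):
    # Step 806: source instance (source_filter is assumed to be trusted SQL).
    source_instance = source_db.execute(
        f"SELECT * FROM Service WHERE {source_filter}", params).fetchall()
    with destination_db:  # commits on success, rolls back on an exception
        for first in source_instance:
            # Step 808: destination filter derived from the natural key.
            second = destination_db.execute(
                "SELECT * FROM Service WHERE Service_Name = ?",
                (first["Service_Name"],)).fetchone()
            if second is None:
                copy_record(destination_db, first)  # assumption: copy if absent
            elif second["Service_Type"] == first["Service_Type"]:
                # Steps 810/812, consistent: modify the attribute values.
                destination_db.execute(
                    "UPDATE Service SET Description = ? WHERE Object_ID = ?",
                    (first["Description"], second["Object_ID"]))
            else:
                # Steps 810/812, inconsistent: delete, then copy from source.
                destination_db.execute(
                    "DELETE FROM Service WHERE Object_ID = ?",
                    (second["Object_ID"],))
                copy_record(destination_db, first)


source_db = make_db([(13314, "Blood Test", "Lab", "Venous sample"),
                     (13315, "CT Scan", "Imaging", "With contrast"),
                     (13316, "X-ray", "Imaging", "Chest")])
destination_db = make_db([(110072, "Blood Test", "Lab", "old text"),
                          (110073, "CT Scan", "Lab", "wrong type")])
synchronize(source_db, destination_db)
result = [tuple(row) for row in destination_db.execute(
    "SELECT Service_Name, Service_Type, Description FROM Service "
    "ORDER BY Service_Name")]
```

Performing all modifications inside one transaction is a design choice of the sketch: a failure partway through leaves the destination database unchanged.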
The present invention is particularly advantageous for facilitating large-scale data migration from one database to another and data merging between two databases.
While the present invention has been described in detail with reference to certain embodiments, it should be appreciated that the present invention is not limited to those embodiments. In view of the present disclosure, many modifications and variations would present themselves to those of skill in the art without departing from the scope and spirit of this invention. The scope of the present invention is, therefore, indicated by the following claims rather than by the foregoing description. All changes, modifications, and variations coming within the meaning and range of equivalency of the claims are to be considered within their scope.

Claims

1. A method for maintaining data consistency between a source database and a destination database, the method comprising:
- providing a declarative specification for a data entity, wherein the declarative specification comprises dependency information related to at least one data table corresponding to the data entity,
- generating a dependency graph for the data entity, wherein the dependency graph comprises information related to a dependency relationship between one or more data tables based on the declarative specification for the data entity,
- instantiating a source instance of the dependency graph based on a source data filter, wherein the source instance comprises one or more data records retrieved from the source database,
- instantiating a destination instance of the dependency graph based on a destination data filter, wherein the destination instance comprises one or more data records retrieved from the destination database,
- performing a consistency analysis for determining consistency between a first data record in the source instance and a second data record in the destination instance based on a consistency criterion, and
- managing a database transaction for modifying the second data record in the destination database based on the consistency analysis.
2. The method according to claim 1, wherein the declarative specification for each data entity further comprises information related to a natural key for identifying the second data record in the destination database corresponding to the first data record in the source database, and a consistency key for performing the consistency analysis, wherein each of the natural key and the consistency key is a set of one or more data attributes in one or more data tables.
3. The method according to any of claims 1 or 2, wherein the dependency graph comprises a set of nodes, wherein each node represents a data table comprised in the database, and wherein each node is connected to at least one other node in the set of nodes through an edge based on a parent-child relationship between the corresponding data tables such that the dependency graph provides information related to a dependency relationship between one or more data tables.
4. The method according to claim 3, wherein the steps of performing the consistency analysis and managing the database transaction are successively performed corresponding to each data record in accordance with a dependency order starting from a least dependent node in the source instance of the dependency graph.
5. The method according to any of claims 1 to 4, wherein a plurality of source instances and a corresponding plurality of destination instances are instantiated for a plurality of data entities, and wherein each pair of a source instance and a corresponding destination instance is simultaneously processed.
6. A system for maintaining data consistency between a source database and a destination database, the system comprising:
- a declarative specification module maintaining a declarative specification for a data entity, wherein the declarative specification comprises dependency information related to a data table corresponding to the data entity,
- a dependency graph module for generating a dependency graph for the data entity, wherein the dependency graph comprises information related to a dependency relationship between one or more data tables derived from the declarative specification for the data entity,
- an instantiation module for instantiating a source instance of the dependency graph based on a source data filter, wherein the source instance comprises one or more data records retrieved from the source database, and further instantiating a destination instance of the dependency graph based on a destination data filter, wherein the destination instance comprises one or more data records retrieved from the destination database,
- a consistency analysis module for performing consistency analysis for determining consistency between a first data record in the source instance and a second data record in the destination instance based on a consistency criterion, and
- a transaction management module for managing a database transaction for modifying the second data record in the destination database based on the consistency analysis.
7. The system according to claim 6, wherein the declarative specification for the data entity further comprises information related to a natural key for identifying the second data record in the destination database corresponding to the first data record in the source database, and a consistency key for performing the consistency analysis, wherein each of the natural key and the consistency key is a set of one or more data attributes in one or more data tables.
8. The system according to any of claims 6 or 7, wherein the dependency graph comprises a set of nodes, wherein each node represents a data table comprised in the database, and wherein each node is connected to at least one other node in the set of nodes through an edge based on a parent-child relationship between the corresponding data tables such that the dependency graph provides information related to a dependency relationship between one or more data tables.
9. The system according to claim 8, wherein the consistency analysis module successively performs consistency analysis for each data record in accordance with a dependency order starting from a least dependent node in the source instance of the dependency graph, and further wherein the transaction management module successively performs database transactions based on the consistency analysis.
10. The system according to any of claims 6 to 9, wherein a plurality of source instances and a corresponding plurality of destination instances are instantiated, and wherein each pair of a source instance and a corresponding destination instance is simultaneously processed.
11. A computer program product embodied on a computer readable medium, the computer-readable medium comprising computer-executable instructions for maintaining data consistency between a source database and a destination database, the computer-executable instructions comprising:
- providing a declarative specification for a data entity, wherein the declarative specification comprises dependency information related to at least one data table corresponding to the data entity,
- generating a dependency graph for the data entity, wherein the dependency graph comprises information related to a dependency relationship between one or more data tables based on the declarative specification for the data entity,
- instantiating a source instance of the dependency graph based on a source data filter, wherein the source instance comprises one or more data records retrieved from the source database,
- instantiating a destination instance of the dependency graph based on a destination data filter, wherein the destination instance comprises one or more data records retrieved from the destination database,
- performing a consistency analysis for determining consistency between a first data record in the source instance and a second data record in the destination instance based on a consistency criterion, and
- managing a database transaction for modifying the second data record in the destination database based on the consistency analysis.
12. The computer program product according to claim 11 further comprising computer-executable instructions for performing the method according to any of claims 2 to 5.
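
The steps recited in claims 1 to 4 can be illustrated with a minimal Python sketch. It is not part of the claims and is not the applicant's implementation. The names `TableSpec`, `dependency_order`, `plan_changes`, and `synchronize`, the in-memory row dictionaries, and the treatment of a missing destination record as an insert are all assumptions made for illustration. The sketch only builds a list of planned changes; executing them inside a database transaction is left out.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class TableSpec:
    """Hypothetical declarative specification for one data table."""
    name: str
    parents: tuple          # tables this table depends on (parent-child edges)
    natural_key: tuple      # attributes matching a source row to a destination row
    consistency_key: tuple  # attributes compared to decide whether rows agree


def dependency_order(specs):
    """Return table names ordered from least dependent to most dependent."""
    graph = {s.name: set(s.parents) for s in specs}
    # static_order() yields every table after all of its parents.
    return list(TopologicalSorter(graph).static_order())


def plan_changes(spec, source_rows, dest_rows):
    """Compare the rows of one table and return (action, row) pairs."""
    def key(row, attrs):
        return tuple(row[a] for a in attrs)

    dest_by_key = {key(r, spec.natural_key): r for r in dest_rows}
    plan = []
    for src in source_rows:
        dst = dest_by_key.get(key(src, spec.natural_key))
        if dst is None:
            # Assumption: a record missing in the destination is inserted.
            plan.append(("insert", src))
        elif key(src, spec.consistency_key) != key(dst, spec.consistency_key):
            plan.append(("update", src))
    return plan


def synchronize(specs, source, dest):
    """Walk tables in dependency order and collect the planned changes."""
    by_name = {s.name: s for s in specs}
    plan = []
    for table in dependency_order(specs):
        rows = plan_changes(by_name[table], source.get(table, []), dest.get(table, []))
        plan.extend((table, action, row) for action, row in rows)
    return plan
```

With three hypothetical tables `customer`, `order`, and `order_item`, where each depends on the previous one, `dependency_order` returns them in that order, so parent records are always examined before the child records that reference them.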
PCT/EP2012/050721 2011-04-01 2012-01-18 Method, system, and computer program product for maintaining data consistency between two databases Ceased WO2012130489A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN470KO2011 2011-04-01
IN470/KOL/2011 2011-04-01

Publications (1)

Publication Number Publication Date
WO2012130489A1 (en) 2012-10-04

Family

ID=45531870

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2012/050721 Ceased WO2012130489A1 (en) 2011-04-01 2012-01-18 Method, system, and computer program product for maintaining data consistency between two databases

Country Status (1)

Country Link
WO (1) WO2012130489A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010027458A1 (en) * 2000-03-29 2001-10-04 Toshihiro Wakayama System for managing networked information contents
US20060224637A1 (en) * 2005-04-01 2006-10-05 Schlumberger Technology Corporation Chasing engine for data transfer
US20060224638A1 (en) * 2005-04-01 2006-10-05 Schlumberger Technology Corporation Method and system for dynamic data merge in databases
US20100082646A1 (en) * 2008-09-26 2010-04-01 Microsoft Corporation Tracking constraints and dependencies across mapping layers
US20100223231A1 (en) * 2009-03-02 2010-09-02 Thales-Raytheon Systems Company Llc Merging Records From Different Databases

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MCELROY P ET AL: "Oracle Database 10g: Oracle Streams Replication", 1 May 2005 (2005-05-01), pages 1 - 19, XP002582165, Retrieved from the Internet <URL:http://www.oracle.com/technology/products/dataint/pdf/twp_streams_replication_10gr2.pdf> [retrieved on 20100512] *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9514164B1 (en) * 2013-12-27 2016-12-06 Accenture Global Services Limited Selectively migrating data between databases based on dependencies of database entities
CN107807972A (en) * 2017-10-19 2018-03-16 北京科技大学 A kind of test data consistency detecting method
CN107807972B (en) * 2017-10-19 2020-12-22 北京科技大学 A test data consistency detection method
WO2019241656A1 (en) * 2018-06-16 2019-12-19 Hexagon Technology Center Gmbh System and method for comparing and selectively merging database records
US12393592B2 (en) 2018-06-16 2025-08-19 Hexagon Technology Center Gmbh System and method for comparing and selectively merging database records
KR102836318B1 (en) * 2018-06-16 2025-07-21 헥사곤 테크놀로지 센터 게엠베하 System and method for comparing and optionally merging database records
CN112654976A (en) * 2018-06-16 2021-04-13 赫克斯冈技术中心 System and method for comparing and selectively merging database records
KR20210041554A (en) * 2018-06-16 2021-04-15 헥사곤 테크놀로지 센터 게엠베하 System and method for comparing and selectively merging database records
US11120025B2 (en) 2018-06-16 2021-09-14 Hexagon Technology Center Gmbh System and method for comparing and selectively merging database records
CN111259027B (en) * 2020-01-15 2023-01-17 中国科学院软件研究所 Data consistency detection method
CN111259027A (en) * 2020-01-15 2020-06-09 中国科学院软件研究所 A data consistency detection method
CN111382083B (en) * 2020-04-30 2024-02-23 中国银行股份有限公司 Test data generation method and device
CN111382083A (en) * 2020-04-30 2020-07-07 中国银行股份有限公司 Test data generation method and device
CN111563076B (en) * 2020-05-09 2023-06-30 咪咕文化科技有限公司 Data audit method, device, network equipment and storage medium
CN111563076A (en) * 2020-05-09 2020-08-21 咪咕文化科技有限公司 Data auditing method, device, network equipment and storage medium
CN111881288A (en) * 2020-05-19 2020-11-03 杭州中奥科技有限公司 Method and device for judging authenticity of record information, storage medium and electronic equipment
CN111881288B (en) * 2020-05-19 2024-04-09 杭州中奥科技有限公司 Method, device, storage medium and electronic device for determining authenticity of transcript information
US20230367761A1 (en) * 2022-05-12 2023-11-16 Sap Se Managing Master Data For Distributed Environments
US12229113B2 (en) * 2022-05-12 2025-02-18 Sap Se Managing master data for distributed environments
CN117251460A (en) * 2023-08-10 2023-12-19 上海栈略数据技术有限公司 Data consistency check system for graph database and relational database
CN117251460B (en) * 2023-08-10 2024-04-05 上海栈略数据技术有限公司 Data consistency check system for graph database and relational database

Similar Documents

Publication Publication Date Title
WO2012130489A1 (en) Method, system, and computer program product for maintaining data consistency between two databases
US12386802B2 (en) Column lineage for resource dependency system and graphical user interface
CN114357088B (en) Nuclear power industry data warehouse system
CN110168515B (en) System for analyzing data relationships to support query execution
CN105512042B (en) A kind of automatic generation method of the test data of database, device and test system
US9280568B2 (en) Zero downtime schema evolution
US11615076B2 (en) Monolith database to distributed database transformation
US10191932B2 (en) Dependency-aware transaction batching for data replication
JP6434960B2 (en) Support for a combination of flow-based ETL and entity relationship-based ETL
US7076778B2 (en) Method and apparatus for upgrading a software application in the presence of user modifications
US7941398B2 (en) Autopropagation of business intelligence metadata
US9507820B1 (en) Data modeling system for runtime schema extensibility
CN109997125A (en) System for importing data to data storage bank
US20170308602A1 (en) Apparatus And Methods Of Data Synchronization
CN110300963A (en) Data management system in large-scale data repository
CN111651431A (en) Database service oriented management flow standardization method
US10296542B2 (en) Integration database framework
US11314489B1 (en) Automated authoring of software solutions by first analyzing and resolving anomalies in a data model
US20120272225A1 (en) Incremental upgrade of entity-relationship systems
US11609890B1 (en) Schema management for journal-based storage systems
US20110153562A1 (en) Error prevention for data replication
US11693834B2 (en) Model generation service for data retrieval
US8818974B2 (en) System and method for synchronously updating a hierarchy bridge table
CN105550342B (en) A kind of data processing method of the distributed data base of all-transparent
US11599520B1 (en) Consistency management using query restrictions in journal-based storage systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12701248

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12701248

Country of ref document: EP

Kind code of ref document: A1