CN113434482A - Data migration method and device, computer equipment and storage medium - Google Patents
Data migration method and device, computer equipment and storage medium
- Publication number
- CN113434482A (application number CN202110720293.0A)
- Authority
- CN
- China
- Prior art keywords
- data table
- target
- database
- field
- matching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/214—Database migration support
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data migration method, a data migration apparatus, a computer device and a storage medium, and relates to the field of information technology. Its main aims are to create data table tasks in batches, improve data migration efficiency and reduce the workload of data warehouse engineers. The method comprises: acquiring metadata information respectively corresponding to a source database and a target database; matching the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database; creating data table tasks in batches based on the matching information; and executing the data table tasks to migrate the data in the source database to the target database. The invention is suitable for data migration.
Description
Technical Field
The present invention relates to the field of information technologies, and in particular, to a data migration method and apparatus, a computer device, and a storage medium.
Background
In data warehousing and related fields, data in an external database generally needs to be imported into a data center to build a data warehouse, after which the data in the data warehouse is analyzed.
Currently, when importing data into a data warehouse, a data warehouse engineer usually creates a task manually for each data table and completes the data migration according to the created tasks. However, if the external database holds a large amount of data, for example thousands of tables, the engineer must manually create a task for each of those tables. This greatly increases the engineer's workload, makes task creation inefficient, and therefore seriously reduces data migration efficiency.
Disclosure of Invention
The invention provides a data migration method, a data migration apparatus, a computer device and a storage medium, which mainly aim to create data table tasks in batches, improve data migration efficiency and reduce the workload of data warehouse engineers.
According to a first aspect of the present invention, there is provided a data migration method comprising:
acquiring metadata information respectively corresponding to a source database and a target database;
matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database;
based on the matching information, creating data table tasks in batches;
and executing the data table task, and migrating the data in the source database to the target database.
According to a second aspect of the present invention, there is provided a data migration apparatus comprising:
an acquisition unit, configured to acquire metadata information respectively corresponding to a source database and a target database;
a matching unit, configured to match the metadata information corresponding to the source database with the metadata information corresponding to the target database, and generate matching information between the source database and the target database;
a creating unit, configured to create data table tasks in batches based on the matching information;
and a migration unit, configured to execute the data table tasks and migrate the data in the source database to the target database.
According to a third aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:
acquiring metadata information respectively corresponding to a source database and a target database;
matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database;
based on the matching information, creating data table tasks in batches;
and executing the data table task, and migrating the data in the source database to the target database.
According to a fourth aspect of the present invention, there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the program:
acquiring metadata information respectively corresponding to a source database and a target database;
matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database;
based on the matching information, creating data table tasks in batches;
and executing the data table task, and migrating the data in the source database to the target database.
Compared with the current approach, in which a data warehouse engineer manually creates a task for each data table, the data migration method, apparatus, computer device and storage medium provided by the invention acquire metadata information respectively corresponding to a source database and a target database; match the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database; create data table tasks in batches based on the matching information; and finally execute the data table tasks to migrate the data in the source database to the target database. Because the data table tasks are created in batches from the matching information, the workload of the data warehouse engineer is reduced and data migration efficiency is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow chart of a data migration method according to an embodiment of the present invention;
FIG. 2 is a flow chart of another data migration method provided by the embodiment of the invention;
FIG. 3 is a schematic structural diagram of a data migration apparatus according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of another data migration apparatus provided in the embodiment of the present invention;
FIG. 5 is a schematic diagram of the physical structure of a computer device according to an embodiment of the present invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
Currently, when importing data into a data warehouse, a data warehouse engineer usually creates a task manually for each data table and completes the data migration according to the created tasks. However, if the external database holds a large amount of data, for example thousands of tables, the engineer must manually create a task for each of those tables. This greatly increases the engineer's workload, makes task creation inefficient, and therefore seriously reduces data migration efficiency.
In order to solve the above problem, an embodiment of the present invention provides a data migration method. As shown in FIG. 1, the method includes:
101. Acquire metadata information respectively corresponding to the source database and the target database.
The target database is used for receiving and storing the data in the source database. The metadata information corresponding to the source database includes data table information and field information corresponding to each first data table in the source database, and the metadata information corresponding to the target database includes data table information and field information corresponding to each second data table in the target database. The data table information may specifically be a data table name, and the field information may specifically be a field type, a field name, and the like. To overcome the heavy workload of data warehouse engineers and the low data migration efficiency in the prior art, the embodiment of the invention matches the metadata information corresponding to the source database with the metadata information corresponding to the target database and creates data table tasks in batches based on the resulting matching information, thereby reducing the engineers' workload and improving data migration efficiency. The embodiment of the invention is mainly applied to scenarios in which data is imported from an external database into a data center. The execution subject of the embodiment is a device or apparatus capable of data migration, which may specifically be deployed on the server side.
In the embodiment of the present invention, after the source database and the target database have been configured and the user triggers a data migration instruction, the server collects the metadata information corresponding to the source database and creates the corresponding data tables and fields in the target database according to the collected metadata information, so that the metadata information corresponding to the target database can then be acquired.
102. Match the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database.
The matching information between the source database and the target database comprises table matching information and field matching information. The table matching information describes the match between a first data table in the source database and a second data table in the target database, and specifically marks the pair of data tables as matched or unmatched. The field matching information describes the match between a field in the first data table and a field in the matched second data table, and specifically marks the pair of fields as matched or unmatched.
In the embodiment of the present invention, creating data table tasks in batches requires matching information between the source database and the target database. Specifically, the data table information corresponding to each first data table in the source database is matched with the data table information corresponding to each second data table in the target database. If the data table information of a first data table differs from that of every second data table, it is determined that no second data table matches the first data table; in this case, the second data table closest to the first data table is found through similarity calculation, a mapping relationship between the two data tables is established, and the pair is marked as unmatched. The similarity calculation is described in step 203. If the data table information of a first data table is the same as that of a second data table, it is determined that the two data tables match; a mapping relationship between them is established and the pair is marked as matched. Table matching information is generated in this way.
Further, after it is determined that a first data table in the source database matches a second data table in the target database, the field information corresponding to each field in the first data table is matched with the field information corresponding to each field in the second data table. If the field information of a field in the first data table differs from that of every field in the second data table, it is determined that the second data table contains no field matching that field; in this case, the closest field in the second data table is found through similarity calculation, a mapping relationship between the two fields is established, and the pair is marked as unmatched. If the field information of a field in the first data table is the same as that of a field in the second data table, it is determined that the two fields match; a mapping relationship between them is established and the pair is marked as matched. Field matching information is generated in this way, and the matching information between the source database and the target database is then determined from the table matching information and the field matching information.
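The table and field matching of step 102 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the dictionary layout of the result, and the sample table names are all hypothetical, field information is reduced to field names, and `difflib.SequenceMatcher` is used only as a stand-in for the similarity calculation that the text defers to step 203.

```python
from difflib import SequenceMatcher


def closest(name, candidates):
    # Stand-in similarity measure (an assumption of this sketch); the text
    # itself describes edit distance or word vectors in step 203.
    return max(candidates, key=lambda c: SequenceMatcher(None, name, c).ratio())


def match_names(source_names, target_names):
    """Map each source name to a target name and mark the pair matched or unmatched."""
    result = {}
    for name in source_names:
        if name in target_names:
            # Identical information: establish the mapping and mark it as matched.
            result[name] = {"target": name, "matched": True}
        elif target_names:
            # No identical name: map to the closest candidate, marked as unmatched.
            result[name] = {"target": closest(name, target_names), "matched": False}
    return result


def match_databases(source, target):
    """source and target map a table name to its list of field names."""
    table_matches = match_names(source, target)
    field_matches = {}
    for table, info in table_matches.items():
        if info["matched"]:  # fields are compared only for matched tables
            field_matches[table] = match_names(source[table], target[info["target"]])
    return {"tables": table_matches, "fields": field_matches}


source = {"orders": ["id", "amount"], "customer": ["id"]}
target = {"orders": ["id", "amount_total"], "customers": ["id"]}
info = match_databases(source, target)
```

Here `orders` is marked as matched, while `customer` is mapped to its nearest candidate `customers` and marked as unmatched, so a user could later review that pair.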
103. Create data table tasks in batches based on the matching information.
In the embodiment of the invention, the matched data tables and fields in the source database and the target database are determined according to the matching information, and data table tasks are created in batches for the matched data tables and fields. For example, according to the matching information, it is determined that data table A and data table B in the source database match data table a and data table b in the target database, respectively, that field 1 and field 2 in data table A match field 3 and field 4 in data table a, respectively, and that field 5 and field 6 in data table B match field 7 and field 8 in data table b, respectively. A data migration task between data table A and data table a and a data migration task between data table B and data table b are then created according to the matching information.
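As a rough sketch of this batch creation step, the code below turns matching information into one task per matched table pair. The `TableTask` structure and the shape of the `matching` dictionary are assumptions made for illustration; the sample data follows the two-table example in the text, with a third, unmatched table added to show that it is skipped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableTask:
    """Hypothetical task record: one migration task per matched table pair."""
    source_table: str
    target_table: str
    field_map: tuple  # pairs of (source_field, target_field)


def create_tasks(matching):
    """Create one task for every table pair that is marked as matched."""
    tasks = []
    for src, info in matching["tables"].items():
        if not info["matched"]:
            continue  # unmatched tables are left for the user to review
        pairs = tuple(
            (field, m["target"])
            for field, m in matching["fields"].get(src, {}).items()
            if m["matched"]
        )
        tasks.append(TableTask(src, info["target"], pairs))
    return tasks


matching = {
    "tables": {
        "A": {"target": "a", "matched": True},
        "B": {"target": "b", "matched": True},
        "C": {"target": "b", "matched": False},
    },
    "fields": {
        "A": {"field1": {"target": "field3", "matched": True},
              "field2": {"target": "field4", "matched": True}},
        "B": {"field5": {"target": "field7", "matched": True},
              "field6": {"target": "field8", "matched": True}},
    },
}
tasks = create_tasks(matching)
```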
104. Execute the data table tasks, and migrate the data in the source database to the target database.
In the embodiment of the invention, when the user clicks the one-click import button, the ETL tool executes the created data table tasks in batches. Specifically, the created tasks are placed in a queue and executed by starting threads: if the background server has sufficient resources, all tasks in the queue can be executed together; if the resources are insufficient, the tasks in the queue are executed in batches. Executing the tasks writes the data in the source database into the target database, which completes the data migration. The user can also view the execution results and, for a task whose migration failed, resume execution by setting an increment field.
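The queue-and-threads execution described here might look like the sketch below. It is an assumption-laden illustration: `run_tasks`, the `max_threads` parameter (standing in for "available server resources") and the toy `migrate` callback are hypothetical names, and the actual data copy performed by the ETL tool is not modeled.

```python
import queue
import threading


def run_tasks(tasks, migrate, max_threads):
    """Run migrate(task) for every queued task, at most max_threads at a time."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = {}
    lock = threading.Lock()

    def run_one(task):
        try:
            migrate(task)
            outcome = "ok"
        except Exception as exc:  # record the failure so the task can be retried
            outcome = f"failed: {exc}"
        with lock:
            results[task] = outcome

    # Drain the queue batch by batch; with enough threads there is one batch.
    while not pending.empty():
        batch = []
        while len(batch) < max_threads and not pending.empty():
            batch.append(pending.get())
        threads = [threading.Thread(target=run_one, args=(t,)) for t in batch]
        for th in threads:
            th.start()
        for th in threads:
            th.join()
    return results


def migrate(task):
    # Toy stand-in for the real copy step; one task fails on purpose.
    if task == "broken":
        raise RuntimeError("boom")


results = run_tasks(["orders", "customers", "broken"], migrate, max_threads=2)
```

The returned dictionary plays the role of the execution result the user can inspect; failed entries identify the tasks that would need to be resumed.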
Compared with the current approach, in which a data warehouse engineer manually creates a task for each data table, the data migration method provided by the embodiment of the invention acquires metadata information respectively corresponding to the source database and the target database; matches the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database; creates data table tasks in batches based on the matching information; and finally executes the data table tasks to migrate the data in the source database to the target database. Because the data table tasks are created in batches from the matching information, the workload of the data warehouse engineer is reduced and data migration efficiency is improved.
Further, to better describe the above process of creating data table tasks in batches, an embodiment of the present invention provides another data migration method as a refinement and extension of the above embodiment. As shown in FIG. 2, the method includes:
201. Configure database information respectively corresponding to the source database and the target database, and collect metadata information corresponding to the source database based on the database information corresponding to the source database.
In the embodiment of the present invention, before data table tasks are created in batches for the data migration process, the source database and the target database need to be configured. The configuration specifically includes the IP address, port, database name, user name, password, schema and similar information for the source database, and the same kinds of information for the target database. The metadata of the source database is then collected to obtain the metadata information corresponding to the source database, which mainly includes data table information, field information, view information, and the like.
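To make the metadata collection concrete, the following sketch reads table names and field information from a database. It assumes an in-memory SQLite database in place of the configured source database (so the IP address, port, user name and password are not needed), and the `FieldInfo`, `TableInfo` and `collect_metadata` names are hypothetical. A real source database would be queried through its own catalog views instead.

```python
from __future__ import annotations

import sqlite3
from dataclasses import dataclass, field


@dataclass
class FieldInfo:
    name: str
    type: str
    primary_key: bool = False


@dataclass
class TableInfo:
    name: str
    fields: list[FieldInfo] = field(default_factory=list)


def collect_metadata(conn: sqlite3.Connection) -> dict[str, TableInfo]:
    """Collect data table information and field information from one database."""
    tables = {}
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table_name,) in rows:
        info = TableInfo(table_name)
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, col, col_type, _notnull, _default, pk in conn.execute(
            f'PRAGMA table_info("{table_name}")'
        ):
            info.fields.append(FieldInfo(col, col_type, bool(pk)))
        tables[table_name] = info
    return tables


# SQLite stands in for the configured source database (assumption of this sketch).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer VARCHAR(64))")
source_metadata = collect_metadata(source)
```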
202. Generate metadata information corresponding to the target database based on the metadata information corresponding to the source database.
In the embodiment of the present invention, in order to generate the metadata information corresponding to the target database, step 202 specifically includes: if the database type of the source database is the same as the database type of the target database, generating and executing an SQL table-creation statement according to the metadata information corresponding to the source database to obtain the metadata information corresponding to the target database; if the database type of the source database differs from the database type of the target database, performing field type conversion on the metadata information corresponding to the source database according to a preset field type mapping relationship to obtain converted metadata information, and generating and executing an SQL table-creation statement based on the converted metadata information to obtain the metadata information corresponding to the target database.
Specifically, after the metadata information corresponding to the source database is collected, the corresponding data tables and fields are created in the target database according to that metadata information. If the source database and the target database are of the same type, a table-creation SQL statement is generated directly from the metadata information corresponding to the source database, and the corresponding data tables and fields are created in the target database by executing that statement. If the source database and the target database are heterogeneous, the field types of the data tables in the source database are converted into field types supported by the target database based on the preset field type mapping relationship. For example, if the source database is an Oracle database, which supports the field type VARCHAR2(n), and the target database is a MySQL database, which supports the field type VARCHAR(n), then VARCHAR2(n) fields in the source database need to be converted to VARCHAR(n) when the tables are created in the target database. A table-creation SQL statement is then generated according to the data table information corresponding to the source database and the converted field information, and the corresponding data tables and fields are created in the target database by executing that statement.
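The field type conversion and table-creation statement can be illustrated with the sketch below. Only the VARCHAR2 to VARCHAR rule comes from the text; the mapping table name, the helper functions and the sample table are hypothetical, and a real preset mapping would cover many more types. Types that are not in the mapping pass through unchanged in this sketch.

```python
import re

# Deliberately tiny mapping; only VARCHAR2 -> VARCHAR is taken from the text.
ORACLE_TO_MYSQL = {"VARCHAR2": "VARCHAR"}


def convert_type(field_type, type_map):
    """Convert one field type, keeping any length argument such as (64)."""
    m = re.fullmatch(r"\s*(\w+)\s*(\(.*\))?\s*", field_type)
    if m is None:
        return field_type  # leave unrecognized declarations untouched
    base, args = m.group(1).upper(), m.group(2) or ""
    return type_map.get(base, base) + args


def build_create_table(table, fields, source_type, target_type, type_map):
    """fields is a list of (field_name, field_type) pairs from the source metadata."""
    same_type = source_type == target_type
    columns = ", ".join(
        f"{name} {ftype if same_type else convert_type(ftype, type_map)}"
        for name, ftype in fields
    )
    return f"CREATE TABLE {table} ({columns})"


fields = [("name", "VARCHAR2(64)"), ("note", "VARCHAR2(200)")]
ddl = build_create_table("customer", fields, "oracle", "mysql", ORACLE_TO_MYSQL)
same = build_create_table("customer", fields, "oracle", "oracle", ORACLE_TO_MYSQL)
```

For the heterogeneous case, `ddl` holds `CREATE TABLE customer (name VARCHAR(64), note VARCHAR(200))`; for the same-type case the field types are reused as collected.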
Further, after the corresponding data tables are created in the target database, when a data migration instruction triggered by the user is received, the metadata information corresponding to the source database and to the target database selected by the user is obtained. That is, the data table information and field information corresponding to the source database and the data table information and field information corresponding to the target database are read automatically, so that the two sets of metadata information can be matched and data table tasks can be created and executed in batches according to the matching result, which improves data migration efficiency.
203. Match the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database.
The metadata information corresponding to the source database includes data table information and field information corresponding to each first data table in the source database, and the metadata information corresponding to the target database includes data table information and field information corresponding to each second data table in the target database.
In the embodiment of the present invention, in order to obtain the matching information between the source database and the target database, step 203 specifically includes: matching the data table information corresponding to a target first data table among the first data tables with the data table information corresponding to each second data table; if a target second data table matching the target first data table exists among the second data tables, establishing a table mapping relationship between the target first data table and the target second data table, and generating table matching information between the target first data table and the target second data table based on the table mapping relationship; matching the field information corresponding to each field in the target first data table with the field information corresponding to each field in the target second data table, and generating field matching information according to the matching result; and determining the matching information between the source database and the target database according to the table matching information and the field matching information. The target first data table refers to any one of the first data tables in the source database.
Specifically, the data table information corresponding to any first data table in the source database (the target first data table) is matched with the data table information corresponding to each second data table in the target database, and it is judged whether a target second data table matching the target first data table exists among the second data tables. If the data table information corresponding to a second data table is the same as the data table information corresponding to the target first data table, that second data table is determined to be the target second data table, that is, the target first data table matches the target second data table; a mapping relationship between them is established and marked as matched, and table matching information between the target first data table and the target second data table is generated. Further, the field information corresponding to each field in the target first data table is matched with the field information corresponding to each field in the target second data table. For example, the field information corresponding to a field a in the target first data table is matched with the field information corresponding to each field in the target second data table; if the target second data table contains a target field whose field information is identical to that of field a, it is determined that the target field matches field a, a mapping relationship between the two fields is established and marked as matched, and field matching information is generated.
Further, as an optional implementation, the specific process of generating the field matching information includes: if the field type of a target field among the fields of the target first data table is a primary key field, determining the target field to be a target increment field, matching the field information corresponding to the target increment field with the field information corresponding to the increment field in the target second data table, and generating primary key field matching information according to the matching result; and if the field type of the target field is a non-primary key field, matching the field information corresponding to the target field with the field information corresponding to the non-increment fields in the target second data table, and generating non-primary key field matching information according to the matching result. The target field is any one of the fields of the target first data table.
For example, if it is determined that a first data table A in the source database matches a second data table a in the target database, the matching fields of first data table A and second data table a need to be determined next. In the field matching process, the field type is determined first. If the target field in first data table A is an ordinary field, that is, a non-primary key field, the field information (field name) is used directly for matching; if the target field in first data table A is a primary key field, it is treated by default as an increment field and matched with the increment field in second data table a.
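The split between primary key fields and ordinary fields can be sketched as below. This is only an illustration under stated assumptions: fields are represented as dictionaries with hypothetical `name` and `primary_key` keys, and the increment field of the target table is assumed to be identifiable as its primary key, which the text does not state explicitly.

```python
def match_fields(source_fields, target_fields):
    """Match primary key fields to the target increment field, others by name."""
    # Assumption: the target table's increment field is its primary key field.
    target_increment = [f["name"] for f in target_fields if f["primary_key"]]
    target_ordinary = {f["name"] for f in target_fields if not f["primary_key"]}

    primary, ordinary = [], []
    for f in source_fields:
        if f["primary_key"]:
            # Primary key field: treated as the increment field by default.
            tgt = target_increment[0] if target_increment else None
            primary.append((f["name"], tgt, tgt is not None))
        else:
            # Ordinary field: matched directly on the field name.
            hit = f["name"] in target_ordinary
            ordinary.append((f["name"], f["name"] if hit else None, hit))
    return {"primary_key": primary, "non_primary_key": ordinary}


source_fields = [
    {"name": "id", "primary_key": True},
    {"name": "amount", "primary_key": False},
    {"name": "memo", "primary_key": False},
]
target_fields = [
    {"name": "order_id", "primary_key": True},
    {"name": "amount", "primary_key": False},
]
field_info = match_fields(source_fields, target_fields)
```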
In a specific application scenario, to make it easier for the user to adjust the matching relationship between data tables, when no second data table matching the target first data table exists in the target database, the similarity between the target first data table and each second data table may be calculated, and the second data table closest to the target first data table is selected according to the similarity. On this basis, after the data table information corresponding to the target first data table is matched with the data table information corresponding to each second data table, the method further includes: if no target second data table matching the target first data table exists among the second data tables, respectively calculating the similarity between the data table information corresponding to each second data table and the data table information corresponding to the target first data table, and determining, based on the similarity, the second data table closest to the target first data table; and establishing a table mapping relationship between the target first data table and the closest second data table, and generating table matching information between the target first data table and the closest second data table based on the table mapping relationship.
For example, if the second data table closest to the target first data table is determined to be a second data table a, a table mapping relationship between the target first data table and second data table a is established, the pair is marked as unmatched, and table matching information between the target first data table and second data table a is generated. After the matching information between all first data tables and second data tables has been generated, it is displayed to the user as a web page, and the user can adjust the marks between data tables, for example by changing the mark between the target first data table and second data table a to matched. Alternatively, the user can modify the data table information corresponding to second data table a so that the two data tables match, and the table matching information between the target first data table and second data table a is then regenerated.
Further, in the process of calculating the similarity between each of the second data tables and the target first data table, if the data table information (data table name) corresponding to the first data table and the second data table is a character string, respectively calculating the edit distance between the character string corresponding to each second data table and the character string corresponding to the target first data table, selecting the first data table which is most similar to the target first data table from each second data table based on the edit distance, wherein the larger the edit distance, the smaller the similarity between the explanatory data tables, and conversely, the smaller the edit distance, the larger the similarity between the explanatory data tables, therefore, the second data table corresponding to the minimum edit distance can be selected from the calculated edit distances, and determining the second data table corresponding to the minimum editing distance as the second data table closest to the target first data table.
For example, suppose character string A corresponding to a second data table is "adfgc" and character string B corresponding to the target first data table is "aefbc". To calculate the edit distance between character string A and character string B, the characters of A are compared with the characters at the same positions in B, and the number of characters in A that differ from B is counted. The second character "d" in A differs from the second character "e" in B, and the fourth character "g" in A differs from the fourth character "b" in B, so the edit distance between character string A and character string B is 2; that is, the edit distance between this second data table and the target first data table is 2. The edit distance between each second data table in the target database and the target first data table can be calculated in the same way, after which the minimum edit distance is selected and the second data table corresponding to it is determined as the second data table closest to the target first data table.
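The selection by minimum edit distance can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the standard Levenshtein definition of edit distance (insertions, deletions and substitutions), which also handles names of different lengths, whereas the worked example above only counts position-wise differences. For the pair in the example both approaches give 2. The function names `edit_distance` and `closest_table` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def closest_table(target_name: str, candidate_names: list[str]) -> str:
    """Return the candidate table name with the smallest edit distance."""
    return min(candidate_names, key=lambda name: edit_distance(name, target_name))


print(edit_distance("adfgc", "aefbc"))  # prints 2
```

Ties are resolved in favor of the first candidate in the list, which is a property of `min` rather than anything the source specifies.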
In a specific application scenario, if the data table information (data table names) corresponding to the first data table and the second data table are fields, the existing word2vec word-vector method may be used to determine the vectors corresponding to each second data table and to the target first data table, and the Euclidean distance between the vector of each second data table and the vector of the target first data table is then calculated. A larger Euclidean distance indicates lower similarity between the data tables, and a smaller Euclidean distance indicates higher similarity. Therefore, the minimum Euclidean distance is selected from the calculated Euclidean distances, and the second data table corresponding to the minimum Euclidean distance is determined as the second data table closest to the target first data table.
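The vector-based selection can be sketched in the same way. The sketch below assumes the vectors have already been produced by some word-vector model; the three-dimensional values and the table names `orders` and `users` are made up for illustration and are not word2vec output. Only the distance computation and the minimum selection are shown.

```python
import math


def closest_by_vector(target_vec, candidates):
    """candidates maps a table name to its vector; return the name whose
    vector has the smallest Euclidean distance to target_vec."""
    return min(candidates, key=lambda name: math.dist(candidates[name], target_vec))


# Hypothetical vectors standing in for the output of a word-vector model.
target_vec = [1.0, 0.0, 0.0]
candidates = {
    "orders": [0.9, 0.1, 0.0],
    "users": [0.0, 1.0, 0.0],
}
print(closest_by_vector(target_vec, candidates))
```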
204. Create data table tasks in batches based on the matching information.
For the embodiment of the present invention, to create the data table tasks in batches, step 204 specifically includes: determining the first data table and the second data table marked as matching according to the table matching information; determining, according to the field matching information, the matching fields corresponding to the first data table and the second data table marked as matching; generating a data reading function and a data writing function, respectively, based on the first data table and the second data table marked as matching and the matching fields; creating execution files in batches based on the data reading function and the data writing function; and determining the data table task based on the execution files.
Specifically, creating data table tasks in batches requires generating execution files in batches. Each execution file contains a data reading function and a data writing function: the data reading function reads data from a first data table in the source database, and the data writing function writes that data into the corresponding second data table in the target database. The data reading function takes a data reading parameter, which identifies the first data table and the field from which the data is read; the data writing function takes a data writing parameter, which identifies the data table and the field into which the data is written. In the prior art, a warehousing engineer must fill in the data reading parameters and data writing parameters manually, which makes data migration inefficient. In the embodiment of the present invention, each group of data tables marked as matching, together with its matching fields, can be read directly from the generated matching information, and execution files are then created in batches from these groups, with each execution file corresponding to one group of data tables marked as matching. Data migration is then completed by executing the created execution files.
For example, suppose the matching information shows that data table 1 in the source database and data table 2 in the target database are marked as matching, and that field a in data table 1 and field b in data table 2 are marked as matching. When the execution file for data table 1 and data table 2 is created, data table 1 and field a are used as data reading parameters to generate a data reading function for data table 1; similarly, data table 2 and field b are used as data writing parameters to generate a data writing function for data table 2. The execution file for data table 1 and data table 2 is created from the generated data reading function and data writing function, and a data migration task between data table 1 and data table 2 can then be created.
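One way to picture batch task creation is the sketch below. It assumes, purely for illustration, that the data reading and data writing functions are expressed as SQL statements and that the matching information is held in a small record type; the source does not state either. `TableMatch`, `build_read_statement`, `build_write_statement` and `create_tasks` are hypothetical names, and table and field names are interpolated without quoting on the assumption that they come from trusted metadata.

```python
from dataclasses import dataclass, field


@dataclass
class TableMatch:
    source_table: str
    target_table: str
    matched: bool  # the "marked as matching" flag
    field_pairs: list = field(default_factory=list)  # [(source_field, target_field), ...]


def build_read_statement(m: TableMatch) -> str:
    # Data reading parameters: the source table and its matched fields.
    cols = ", ".join(src for src, _ in m.field_pairs)
    return f"SELECT {cols} FROM {m.source_table}"


def build_write_statement(m: TableMatch) -> str:
    # Data writing parameters: the target table and its matched fields.
    cols = ", ".join(dst for _, dst in m.field_pairs)
    placeholders = ", ".join("?" for _ in m.field_pairs)
    return f"INSERT INTO {m.target_table} ({cols}) VALUES ({placeholders})"


def create_tasks(matches):
    """Create one task per table pair that is marked as matching."""
    return [
        {"read": build_read_statement(m), "write": build_write_statement(m)}
        for m in matches
        if m.matched and m.field_pairs
    ]


tasks = create_tasks([
    TableMatch("table_1", "table_2", True, [("a", "b")]),
    TableMatch("table_x", "table_y", False, [("c", "d")]),  # unmatched: skipped
])
print(tasks)
```

Pairs still marked as unmatched produce no task, which mirrors the requirement that only tables marked as matching are turned into execution files.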
205. Execute the data table task and migrate the data in the source database to the target database.
For the embodiment of the present invention, after the data table tasks are created in batches, the data in the source database is migrated to the target database. The data migration process is exactly the same as that in step 104 and is not described again here.
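Executing a single data table task can be illustrated with two in-memory SQLite databases. This is only a sketch under stated assumptions: the task is assumed to consist of one read statement and one write statement, `run_task` is a hypothetical helper, and SQLite stands in for whatever source and target databases are actually used.

```python
import sqlite3


def run_task(source, target, read_sql, write_sql):
    """Read rows from the source connection and write them to the target."""
    rows = source.execute(read_sql).fetchall()
    with target:  # commits on success, rolls back on error
        target.executemany(write_sql, rows)
    return len(rows)


# Illustrative source and target databases with one matched table pair.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE table_1 (a TEXT)")
source.executemany("INSERT INTO table_1 (a) VALUES (?)", [("x",), ("y",)])
source.commit()
target.execute("CREATE TABLE table_2 (b TEXT)")

migrated = run_task(
    source, target,
    "SELECT a FROM table_1",
    "INSERT INTO table_2 (b) VALUES (?)",
)
print(migrated)
```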
Compared with the current approach, in which a warehousing engineer manually creates a task for each data table, the data migration method provided by the embodiment of the present invention acquires the metadata information corresponding to the source database and the target database, respectively; matches the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database; creates data table tasks in batches based on the matching information; and finally executes the data table tasks to migrate the data in the source database to the target database. Because the data table tasks are created in batches from the matching information obtained by matching the two sets of metadata information, the workload of the warehousing engineer is reduced and data migration efficiency is improved.
Further, as a specific implementation of fig. 1, an embodiment of the present invention provides a data migration apparatus, as shown in fig. 3, the apparatus includes: an acquisition unit 31, a matching unit 32, a creation unit 33, and a migration unit 34.
The acquisition unit 31 may be configured to acquire metadata information corresponding to the source database and the target database, respectively.
The matching unit 32 may be configured to match metadata information corresponding to the source database with metadata information corresponding to the target database, so as to generate matching information between the source database and the target database.
The creating unit 33 may be configured to create data table tasks in batches based on the matching information.
The migration unit 34 may be configured to execute the data table task and migrate the data in the source database to the target database.
In a specific application scenario, the metadata information corresponding to the source database includes data table information and field information corresponding to each first data table in the source database, and the metadata information corresponding to the target database includes data table information and field information corresponding to each second data table in the target database, as shown in fig. 4, the matching unit 32 includes: a matching module 321, a generating module 322 and a determining module 323.
The matching module 321 may be configured to match data table information corresponding to a target first data table among the first data tables with data table information corresponding to each second data table.
The generating module 322 may be configured to, if a target second data table that matches the target first data table exists among the second data tables, establish a table mapping relationship between the target first data table and the target second data table, and generate table matching information between the target first data table and the target second data table based on the table mapping relationship.
The generating module 322 may be further configured to match field information corresponding to each field in the target first data table with field information corresponding to each field in the target second data table, and generate field matching information according to a matching result.
The determining module 323 may be configured to determine matching information between the source database and the target database according to the table matching information and the field matching information.
Further, the matching unit 32 further includes: a calculation module 324.
The calculating module 324 may be configured to, if there is no target second data table matching the target first data table in the second data tables, respectively calculate similarity between data table information corresponding to the second data tables and data table information corresponding to the target first data table, and determine, based on the similarity, a second data table that is closest to the target first data table in the second data tables.
The generating module 322 may be further configured to establish a table mapping relationship between the target first data table and the closest second data table, and generate table matching information between the target first data table and the closest second data table based on the table mapping relationship.
Further, the generating module 322 may be specifically configured to: if the field type corresponding to a target field among the fields of the target first data table is a primary key field, determine the target field as a target increment field, match field information corresponding to the target increment field with field information corresponding to an increment field in the target second data table, and generate primary key field matching information according to the matching result; and if the field type corresponding to the target field is a non-primary key field, match the field information corresponding to the target field with the field information corresponding to the non-increment fields in the target second data table, and generate non-primary key field matching information according to the matching result.
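The split between primary key and non-primary key fields can be sketched as below. The sketch assumes that field information is compared by exact field name, which this passage does not specify, and that each field is described by a small dictionary; the keys `is_primary_key` and `is_increment` and the function `match_fields` are hypothetical.

```python
def match_fields(source_fields, target_fields):
    """Match primary key fields against increment fields of the target table,
    and all other fields against the target's non-increment fields."""
    target_increment = {f["name"] for f in target_fields if f["is_increment"]}
    target_regular = {f["name"] for f in target_fields if not f["is_increment"]}

    primary, non_primary = [], []
    for f in source_fields:
        if f["is_primary_key"]:
            pool, out = target_increment, primary
        else:
            pool, out = target_regular, non_primary
        found = f["name"] in pool  # assumption: match by exact name
        out.append({
            "source": f["name"],
            "target": f["name"] if found else None,
            "matched": found,
        })
    return {"primary_key": primary, "non_primary_key": non_primary}


source_fields = [
    {"name": "id", "is_primary_key": True},
    {"name": "amount", "is_primary_key": False},
    {"name": "note", "is_primary_key": False},
]
target_fields = [
    {"name": "id", "is_increment": True},
    {"name": "amount", "is_increment": False},
]
result = match_fields(source_fields, target_fields)
print(result)
```

Fields without a counterpart, such as `note` here, are kept in the result with `matched` set to false so that they can still be reviewed.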
In a specific application scenario, the creating unit 33 includes: a determination module 331, a generation module 332, and a creation module 333.
The determining module 331 may be configured to determine the first data table and the second data table marked as matching according to the table matching information.
The determining module 331 may be further configured to determine, according to the field matching information, a matching field corresponding to the first data table and the second data table that are marked as matching.
The generating module 332 may be configured to generate a data reading function and a data writing function based on the first data table and the second data table marked as matching and the matching field, respectively.
The creating module 333 may be configured to create the execution file in batch based on the data reading function and the data writing function.
The determining module 331 may be further configured to determine the data table task based on the execution file.
Further, the acquisition unit 31 includes: a configuration module 311, a collection module 312, and a generating module 313.
The configuration module 311 may be configured to configure database information corresponding to the source database and the target database, respectively.
The collection module 312 may be configured to collect metadata information corresponding to the source database based on the database information corresponding to the source database.
The generating module 313 may be configured to generate metadata information corresponding to the target database based on the metadata information corresponding to the source database.
Further, the generating module 313 may be specifically configured to: if the database type corresponding to the source database is the same as the database type corresponding to the target database, generate and execute an SQL table creation statement according to the metadata information corresponding to the source database to obtain the metadata information corresponding to the target database; and if the database type corresponding to the source database is different from the database type corresponding to the target database, perform field type conversion on the metadata information corresponding to the source database according to a preset field type mapping relationship to obtain converted metadata information corresponding to the source database, and generate and execute an SQL table creation statement based on the converted metadata information to obtain the metadata information corresponding to the target database.
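The two branches can be sketched as a single statement generator. The mapping table below is an illustrative assumption, not the preset mapping of the source: it covers only three MySQL types that have no identically named PostgreSQL type. `FIELD_TYPE_MAP` and `build_create_table` are hypothetical names, and the sketch only generates the statement; executing it against the target database is left out.

```python
# Hypothetical field type mapping keyed by (source type, target type) of database.
FIELD_TYPE_MAP = {
    ("mysql", "postgresql"): {
        "TINYINT": "SMALLINT",
        "DATETIME": "TIMESTAMP",
        "DOUBLE": "DOUBLE PRECISION",
    },
}


def build_create_table(table, fields, source_db, target_db):
    """fields is a list of (field_name, field_type) taken from source metadata."""
    if source_db == target_db:
        mapping = {}  # same database type: field types are reused unchanged
    else:
        mapping = FIELD_TYPE_MAP[(source_db, target_db)]
    columns = ", ".join(
        f"{name} {mapping.get(col_type, col_type)}" for name, col_type in fields
    )
    return f"CREATE TABLE {table} ({columns})"


fields = [("id", "INT"), ("created", "DATETIME")]
print(build_create_table("orders", fields, "mysql", "mysql"))
print(build_create_table("orders", fields, "mysql", "postgresql"))
```

Types missing from the mapping are passed through unchanged here; a stricter variant could raise an error instead so that unmapped types are not silently carried over.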
It should be noted that other corresponding descriptions of the functional modules related to the data migration apparatus provided in the embodiment of the present invention may refer to the corresponding description of the method shown in fig. 1, and are not described herein again.
Based on the method shown in fig. 1, correspondingly, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the following steps: acquiring metadata information respectively corresponding to a source database and a target database; matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database; based on the matching information, creating data table tasks in batches; and executing the data table task, and migrating the data in the source database to the target database.
Based on the above embodiments of the method shown in fig. 1 and the apparatus shown in fig. 3, an embodiment of the present invention further provides a physical structure diagram of a computer device. As shown in fig. 5, the computer device includes: a processor 41, a memory 42, and a computer program stored on the memory 42 and executable on the processor 41, wherein the memory 42 and the processor 41 are both arranged on a bus 43, and when the processor 41 executes the program, the following steps are performed: acquiring metadata information respectively corresponding to a source database and a target database; matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database; based on the matching information, creating data table tasks in batches; and executing the data table task, and migrating the data in the source database to the target database.
With this technical solution, the metadata information corresponding to the source database and the target database, respectively, can be acquired; the metadata information corresponding to the source database is matched with the metadata information corresponding to the target database to generate matching information between the source database and the target database; data table tasks are created in batches based on the matching information; and finally the data table tasks are executed to migrate the data in the source database to the target database. Because the data table tasks are created in batches from the matching information obtained by matching the two sets of metadata information, the workload of the warehousing engineer is reduced and data migration efficiency is improved.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A method of data migration, comprising:
acquiring metadata information respectively corresponding to a source database and a target database;
matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database;
based on the matching information, creating data table tasks in batches;
and executing the data table task, and migrating the data in the source database to the target database.
2. The method according to claim 1, wherein the metadata information corresponding to the source database includes data table information and field information corresponding to each first data table in the source database, the metadata information corresponding to the target database includes data table information and field information corresponding to each second data table in the target database, and the matching the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database includes:
matching data table information corresponding to a target first data table in each first data table with data table information corresponding to each second data table;
if a target second data table matched with the target first data table exists in each second data table, establishing a table mapping relation between the target first data table and the target second data table, and generating table matching information between the target first data table and the target second data table based on the table mapping relation;
matching field information corresponding to each field in the target first data table with field information corresponding to each field in the target second data table, and generating field matching information according to a matching result;
and determining matching information between the source database and the target database according to the table matching information and the field matching information.
3. The method according to claim 2, wherein after the matching the data table information corresponding to the target first data table in each first data table with the data table information corresponding to each second data table, the method further comprises:
if the target second data table matched with the target first data table does not exist in each second data table, respectively calculating the similarity between the data table information corresponding to each second data table and the data table information corresponding to the target first data table, and determining the second data table which is closest to the target first data table in each second data table based on the similarity;
and establishing a table mapping relation between the target first data table and the closest second data table, and generating table matching information between the target first data table and the closest second data table based on the table mapping relation.
4. The method according to claim 2, wherein the matching the field information corresponding to each field in the target first data table with the field information corresponding to each field in the target second data table, and generating the field matching information according to the matching result includes:
if the field type corresponding to the target field in each field of the target first data table is a primary key field, determining the target field as a target increment field, matching field information corresponding to the target increment field with field information corresponding to an increment field in the target second data table, and generating primary key field matching information according to a matching result;
and if the field type corresponding to the target field is a non-primary key field, matching the field information corresponding to the target field with the field information corresponding to the non-increment field in the target second data table, and generating non-primary key field matching information according to a matching result.
5. The method of claim 2, wherein the creating data table tasks in batches based on the matching information comprises:
determining a first data table and a second data table marked as matching according to the table matching information;
according to the field matching information, determining matching fields corresponding to the first data table and the second data table which are marked as matching;
respectively generating a data reading function and a data writing function based on the first data table and the second data table marked as matching and the matching field;
creating execution files in batches based on the data reading function and the data writing function;
determining the data table task based on the execution file.
6. The method according to claim 1, wherein the obtaining metadata information corresponding to the source database and the target database respectively comprises:
respectively configuring database information corresponding to the source database and the target database;
collecting metadata information corresponding to the source database based on the database information corresponding to the source database;
and generating metadata information corresponding to the target database based on the metadata information corresponding to the source database.
7. The method of claim 6, wherein the generating metadata information corresponding to the target database based on the metadata information corresponding to the source database comprises:
if the database type corresponding to the source database is the same as the database type corresponding to the target database, generating and executing an SQL table building statement according to the metadata information corresponding to the source database to obtain the metadata information corresponding to the target database;
if the database type corresponding to the source database is different from the database type corresponding to the target database, performing field type conversion on metadata information corresponding to the source database according to a preset field type mapping relation to obtain converted metadata information corresponding to the source database, and generating and executing an SQL table building statement based on the converted metadata information to obtain the metadata information corresponding to the target database.
8. A data migration apparatus, comprising:
the acquisition unit is used for acquiring metadata information respectively corresponding to the source database and the target database;
a matching unit, configured to match metadata information corresponding to the source database with metadata information corresponding to the target database, and generate matching information between the source database and the target database;
the creating unit is used for creating data table tasks in batches based on the matching information;
and the migration unit is used for executing the data table task and migrating the data in the source database to the target database.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 7 when executed by the processor.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110720293.0A CN113434482A (en) | 2021-06-28 | 2021-06-28 | Data migration method and device, computer equipment and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN113434482A (en) | 2021-09-24 |
Family
ID=77754993
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110720293.0A Pending CN113434482A (en) | 2021-06-28 | 2021-06-28 | Data migration method and device, computer equipment and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN113434482A (en) |
Cited By (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113901033A (en) * | 2021-11-10 | 2022-01-07 | 建信金融科技有限责任公司 | Data migration method, apparatus, device and medium |
| CN113961625A (en) * | 2021-10-27 | 2022-01-21 | 北京科杰科技有限公司 | Task migration method for heterogeneous big data management platform |
| CN114357052A (en) * | 2022-01-12 | 2022-04-15 | 中国农业银行股份有限公司 | Heterogeneous database object conversion method and device, electronic equipment and storage medium |
| CN114385623A (en) * | 2021-11-30 | 2022-04-22 | 北京达佳互联信息技术有限公司 | Data sheet acquisition method, apparatus, device, storage medium and program product |
| CN114610782A (en) * | 2022-03-22 | 2022-06-10 | 深圳壹账通智能科技有限公司 | Data source switching method, device, electronic device and storage medium |
| CN114780252A (en) * | 2022-06-15 | 2022-07-22 | 阿里云计算有限公司 | Resource management method and device of data warehouse system |
| CN114816578A (en) * | 2022-05-11 | 2022-07-29 | 上海柯林布瑞信息技术有限公司 | Method, device and equipment for generating program configuration file based on configuration table |
| CN115757456A (en) * | 2022-11-29 | 2023-03-07 | 中国农业银行股份有限公司 | Data processing method and device, electronic equipment and readable storage medium |
| CN116186141A (en) * | 2023-02-24 | 2023-05-30 | 杭州太美星程医药科技有限公司 | Data processing method, device, medium and equipment |
| CN116401247A (en) * | 2023-03-21 | 2023-07-07 | 中航金网(北京)电子商务有限公司 | A data processing method, device and computer equipment |
| CN117009320A (en) * | 2023-08-08 | 2023-11-07 | 平安银行股份有限公司 | Methods and devices, equipment and storage media for financial data migration |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20140279890A1 (en) * | 2013-03-14 | 2014-09-18 | Microsoft Corporation | Data migration framework |
| CN104281704A (en) * | 2014-10-22 | 2015-01-14 | 新华瑞德(北京)网络科技有限公司 | Database data copying method and device |
| CN111241071A (en) * | 2020-02-14 | 2020-06-05 | 苏州浪潮智能科技有限公司 | A data migration method, system, device and computer-readable storage medium |
Cited By (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113961625A (en) * | 2021-10-27 | 2022-01-21 | 北京科杰科技有限公司 | Task migration method for heterogeneous big data management platform |
| CN113961625B (en) * | 2021-10-27 | 2022-06-07 | 北京科杰科技有限公司 | Task migration method for heterogeneous big data management platform |
| CN113901033A (en) * | 2021-11-10 | 2022-01-07 | 建信金融科技有限责任公司 | Data migration method, apparatus, device and medium |
| CN114385623A (en) * | 2021-11-30 | 2022-04-22 | 北京达佳互联信息技术有限公司 | Data sheet acquisition method, apparatus, device, storage medium and program product |
| CN114385623B (en) * | 2021-11-30 | 2025-04-08 | 北京达佳互联信息技术有限公司 | Data table acquisition method, device, apparatus, storage medium and program product |
| CN114357052A (en) * | 2022-01-12 | 2022-04-15 | 中国农业银行股份有限公司 | Heterogeneous database object conversion method and device, electronic equipment and storage medium |
| CN114610782A (en) * | 2022-03-22 | 2022-06-10 | 深圳壹账通智能科技有限公司 | Data source switching method, device, electronic device and storage medium |
| CN114816578B (en) * | 2022-05-11 | 2024-05-17 | 上海柯林布瑞信息技术有限公司 | Program configuration file generation method, device and equipment based on configuration table |
| CN114816578A (en) * | 2022-05-11 | 2022-07-29 | 上海柯林布瑞信息技术有限公司 | Method, device and equipment for generating program configuration file based on configuration table |
| CN114780252A (en) * | 2022-06-15 | 2022-07-22 | 阿里云计算有限公司 | Resource management method and device of data warehouse system |
| CN114780252B (en) * | 2022-06-15 | 2022-11-18 | 阿里云计算有限公司 | Resource management method and device of data warehouse system |
| CN115757456A (en) * | 2022-11-29 | 2023-03-07 | 中国农业银行股份有限公司 | Data processing method and device, electronic equipment and readable storage medium |
| CN116186141A (en) * | 2023-02-24 | 2023-05-30 | 杭州太美星程医药科技有限公司 | Data processing method, device, medium and equipment |
| CN116401247A (en) * | 2023-03-21 | 2023-07-07 | 中航金网(北京)电子商务有限公司 | A data processing method, device and computer equipment |
| CN117009320A (en) * | 2023-08-08 | 2023-11-07 | 平安银行股份有限公司 | Methods and devices, equipment and storage media for financial data migration |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20210924 |