CN113434482A

CN113434482A - Data migration method and device, computer equipment and storage medium

Info

Publication number: CN113434482A
Application number: CN202110720293.0A
Authority: CN
Inventors: 徐方来
Original assignee: Ping An International Smart City Technology Co Ltd
Current assignee: Ping An International Smart City Technology Co Ltd
Priority date: 2021-06-28
Filing date: 2021-06-28
Publication date: 2021-09-24

Abstract

The invention discloses a data migration method and device, computer equipment, and a storage medium. It relates to the field of information technology and mainly aims to create data table tasks in batches, improve data migration efficiency, and reduce the workload of data warehouse engineers. The method comprises the following steps: acquiring metadata information respectively corresponding to a source database and a target database; matching the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database; creating data table tasks in batches based on the matching information; and executing the data table tasks to migrate the data in the source database to the target database. The invention is suitable for data migration.

Description

Data migration method and device, computer equipment and storage medium

Technical Field

The present invention relates to the field of information technologies, and in particular, to a data migration method and apparatus, a computer device, and a storage medium.

Background

In technical fields such as data warehousing, data in an external database generally needs to be imported into a data center to build a data warehouse, after which the data in the data warehouse is analyzed.

Currently, when data is imported into a data warehouse, a data warehouse engineer usually creates a task manually for each data table and completes the data migration according to the created tasks. However, if the external database holds a large amount of data, for example thousands of tables, the data warehouse engineer has to create the tasks for all of those tables by hand. This greatly increases the engineer's workload, makes task creation inefficient, and seriously reduces data migration efficiency.

Disclosure of Invention

The invention provides a data migration method and device, computer equipment, and a storage medium, which mainly aim to create data table tasks in batches, improve data migration efficiency, and reduce the workload of data warehouse engineers.

According to a first aspect of the present invention, there is provided a data migration method comprising:

acquiring metadata information respectively corresponding to a source database and a target database;

matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database;

based on the matching information, creating data table tasks in batches;

and executing the data table task, and migrating the data in the source database to the target database.

According to a second aspect of the present invention, there is provided a data migration apparatus comprising:

the acquisition unit is used for acquiring metadata information respectively corresponding to the source database and the target database;

a matching unit, configured to match metadata information corresponding to the source database with metadata information corresponding to the target database, and generate matching information between the source database and the target database;

the creating unit is used for creating data table tasks in batches based on the matching information;

and the migration unit is used for executing the data table task and migrating the data in the source database to the target database.

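According to a third aspect of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of:

acquiring metadata information respectively corresponding to a source database and a target database;

matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database;

based on the matching information, creating data table tasks in batches;

and executing the data table task, and migrating the data in the source database to the target database.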

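According to a fourth aspect of the present invention, there is provided a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the program:

acquiring metadata information respectively corresponding to a source database and a target database;

matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database;

based on the matching information, creating data table tasks in batches;

and executing the data table task, and migrating the data in the source database to the target database.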

Compared with the current approach, in which a data warehouse engineer manually creates a task for each data table, the data migration method and device, computer equipment, and storage medium provided by the invention acquire metadata information respectively corresponding to a source database and a target database; match the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database; create data table tasks in batches based on the matching information; and finally execute the data table tasks to migrate the data in the source database to the target database. Because the data table tasks are created in batches from the matching information, the workload of the data warehouse engineer is reduced and data migration efficiency is improved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:

FIG. 1 is a flow chart of a data migration method according to an embodiment of the present invention;

FIG. 2 is a flow chart of another data migration method provided by the embodiment of the invention;

FIG. 3 is a schematic structural diagram of a data migration apparatus according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of another data migration apparatus provided in the embodiment of the present invention;

FIG. 5 is a schematic diagram of the physical structure of a computer device according to an embodiment of the present invention.

Detailed Description

The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.

In order to solve the above problem, an embodiment of the present invention provides a data migration method. As shown in FIG. 1, the method includes:

101. Acquiring metadata information respectively corresponding to the source database and the target database.

The target database is used for receiving and storing the data in the source database. The metadata information corresponding to the source database includes data table information and field information corresponding to each first data table in the source database, and the metadata information corresponding to the target database includes data table information and field information corresponding to each second data table in the target database. The data table information may specifically be a data table name, and the field information may specifically be a field type, a field name, and the like. In order to overcome the heavy workload of data warehouse engineers and the low data migration efficiency of the prior art, the embodiment of the invention matches the metadata information corresponding to the source database with the metadata information corresponding to the target database and creates data table tasks in batches based on the resulting matching information, thereby reducing the workload of the data warehouse engineer and improving data migration efficiency. The embodiment of the invention is mainly applied to scenarios in which data is imported from an external database into a data center. The execution subject of the embodiment of the present invention is a device or apparatus capable of data migration, and may specifically be deployed on the server side.

For the embodiment of the present invention, after the configuration of the source database and the target database is completed, when the user triggers the data migration instruction, the server collects metadata information corresponding to the source database, and establishes a corresponding data table and field in the target database according to the collected metadata information corresponding to the source database, thereby being capable of acquiring the metadata information corresponding to the target database.

102. And matching the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database.

The matching information between the source database and the target database comprises table matching information and field matching information between the source database and the target database, wherein the table matching information is table matching information between a first data table in the source database and a second data table in the target database, and specifically comprises marking the first data table and the second data table as matching or not matching; the field matching information is field matching information between two fields in the first data table and the second data table which are matched, and specifically comprises marking the fields in the first data table and the fields in the second data table as being matched or not matched.

For the embodiment of the present invention, in order to create the data table tasks in batches, matching information between the source database and the target database needs to be acquired. Specifically, the data table information corresponding to each first data table in the source database is matched with the data table information corresponding to each second data table in the target database. If the data table information corresponding to a certain first data table in the source database differs from the data table information corresponding to every second data table in the target database, it is determined that no second data table matches that first data table. In this case, the second data table closest to the first data table is found through similarity calculation, a mapping relationship between the two data tables is established, and the two data tables are marked as unmatched; the specific similarity calculation is described in step 203. If the data table information corresponding to a certain first data table in the source database is the same as the data table information corresponding to a certain second data table in the target database, it is determined that the first data table matches the second data table, a mapping relationship between the two is established, and they are marked as matching, thereby generating the table matching information.

Further, after it is determined that a first data table in the source database matches a second data table in the target database, the field information corresponding to each field in the first data table is matched with the field information corresponding to each field in the second data table. If the field information corresponding to a certain field in the first data table differs from the field information corresponding to every field in the second data table, it is determined that no field in the second data table matches that field. In this case, the field closest to it is found in the second data table through similarity calculation, a mapping relationship between the two fields is established, and the two fields are marked as unmatched. If the field information corresponding to a certain field in the first data table is the same as the field information corresponding to a certain field in the second data table, it is determined that the two fields match, a mapping relationship between them is established, and they are marked as matching, thereby generating the field matching information. The matching information between the source database and the target database is then determined according to the table matching information and the field matching information.
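The table and field matching of step 102 can be sketched roughly as follows. This is only an illustration under stated assumptions: the metadata is assumed to be held as a dict mapping each table name to a dict of field name to field type, the name `match_metadata` and the record layout are hypothetical, and the similarity-based fallback for unmatched tables and fields is left out (unmatched entries simply get no target).

```python
def match_metadata(source_meta, target_meta):
    """Return table and field matching records for two metadata dicts.

    Each metadata dict maps a table name to {field name: field type}.
    This layout is an assumption made for the sketch.
    """
    table_matches = []
    field_matches = []
    for table, src_fields in source_meta.items():
        matched = table in target_meta
        table_matches.append(
            {"source": table, "target": table if matched else None, "matched": matched}
        )
        if not matched:
            continue  # fields are only compared for matched tables
        tgt_fields = target_meta[table]
        for field in src_fields:
            field_matched = field in tgt_fields
            field_matches.append({
                "table": table,
                "source": field,
                "target": field if field_matched else None,
                "matched": field_matched,
            })
    return {"tables": table_matches, "fields": field_matches}


source_meta = {"orders": {"id": "INT", "amount": "DECIMAL"}, "users": {"id": "INT"}}
target_meta = {"orders": {"id": "INT", "total": "DECIMAL"}}
info = match_metadata(source_meta, target_meta)
```

Here `orders` is marked as matching while `users` is not, and within `orders` only the `id` field is marked as matching.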

103. And creating data table tasks in batches based on the matching information.

For the embodiment of the invention, the matched data tables and fields in the source database and the target database are determined according to the matching information, and data table tasks are created in batches for the matched data tables and fields. For example, suppose the matching information shows that data table A and data table B in the source database match data table A and data table B in the target database, respectively, that field 1 and field 2 in the source data table A match field 3 and field 4 in the target data table A, respectively, and that field 5 and field 6 in the source data table B match field 7 and field 8 in the target data table B, respectively. Then, according to the matching information, a data migration task between the two data tables A and a data migration task between the two data tables B are created.

104. And executing the data table task, and migrating the data in the source database to the target database.

For the embodiment of the invention, when the user clicks the one-click warehousing button, the ETL tool executes the created data table tasks in batches. Specifically, the created data table tasks can be put into a queue and executed by starting threads: if the background server has sufficient resources, the tasks in the queue can be executed together; if the resources are insufficient, the tasks in the queue can be executed in batches. Executing the tasks writes the data in the source database into the target database, which completes the data migration. Further, the user can view the execution results, and for a task whose migration failed, the user can resume execution by setting an incremental field.

Compared with the current approach, in which a data warehouse engineer manually creates a task for each data table, the data migration method provided by the embodiment of the invention acquires the metadata information respectively corresponding to the source database and the target database; matches the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database; creates data table tasks in batches based on the matching information; and finally executes the data table tasks to migrate the data in the source database to the target database. Because the data table tasks are created in batches from the matching information, the workload of the data warehouse engineer is reduced and data migration efficiency is improved.
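The queue-and-thread execution described above might look roughly like the following sketch. The function name `run_tasks`, the `(name, callable)` task shape, and the use of a `max_workers` parameter to stand in for "available server resources" are all assumptions made for illustration; the description does not specify how resources are measured.

```python
import queue
import threading


def run_tasks(tasks, max_workers):
    """Run (name, callable) tasks from a queue using at most max_workers threads."""
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                name, func = pending.get_nowait()
            except queue.Empty:
                return  # queue drained, this thread is done
            try:
                outcome = ("ok", func())
            except Exception as exc:  # record the failure so the task can be retried
                outcome = ("failed", str(exc))
            with lock:
                results[name] = outcome

    threads = [threading.Thread(target=worker) for _ in range(max_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def failing_task():
    raise RuntimeError("connection lost")


results = run_tasks([("t1", lambda: 10), ("t2", failing_task)], max_workers=2)
```

With a large `max_workers` all queued tasks run together; with a small value the same queue is worked through a few tasks at a time, and failed tasks stay visible in `results` for a later retry.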

Further, in order to better describe the above process of creating data table tasks in batches, as a refinement and extension of the above embodiment, an embodiment of the present invention provides another data migration method. As shown in FIG. 2, the method includes:

201. Respectively configuring database information corresponding to the source database and the target database, and collecting metadata information corresponding to the source database based on the database information corresponding to the source database.

For the embodiment of the present invention, before the data table tasks of the data migration process are created in batches, the source database and the target database need to be configured. The configuration specifically includes information such as the IP address, port, database name, user name, password, and schema corresponding to the source database, and the same information corresponding to the target database. The metadata of the source database is then collected to obtain the metadata information corresponding to the source database, which mainly includes data table information, field information, view information, and the like.
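As a rough illustration of metadata collection, the sketch below reads table names and field information from a database catalog. It uses an in-memory SQLite database from the Python standard library purely as a stand-in, so no IP address, port, user name, or password is involved; a server database would instead be queried through its own driver and catalog views. The function name `collect_metadata` and the returned layout are hypothetical.

```python
import sqlite3


def collect_metadata(conn):
    """Return {table name: [(field name, field type), ...]} for a SQLite connection."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    metadata = {}
    for table in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        metadata[table] = [(col[1], col[2]) for col in columns]
    return metadata


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
source_metadata = collect_metadata(conn)
```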

202. And generating metadata information corresponding to the target database based on the metadata information corresponding to the source database.

For the embodiment of the present invention, in order to generate the metadata information corresponding to the target database, step 202 specifically includes: if the database type corresponding to the source database is the same as the database type corresponding to the target database, generating and executing a table-creation SQL statement according to the metadata information corresponding to the source database to obtain the metadata information corresponding to the target database; if the database type corresponding to the source database is different from the database type corresponding to the target database, performing field type conversion on the metadata information corresponding to the source database according to a preset field type mapping relationship to obtain converted metadata information corresponding to the source database, and generating and executing a table-creation SQL statement based on the converted metadata information to obtain the metadata information corresponding to the target database.

Specifically, after the metadata information corresponding to the source database is collected, the corresponding data tables and fields are created in the target database according to that metadata information. If the source database and the target database are databases of the same type, a table-creation SQL statement is generated directly from the metadata information corresponding to the source database, and the corresponding data tables and fields are created in the target database by executing that statement. If the source database and the target database are heterogeneous databases, the field types of the data tables in the source database are converted into field types supported by the target database based on a preset field type mapping relationship. For example, if the source database is an Oracle database, which supports the field type VARCHAR2(n), and the target database is a MySQL database, which supports the field type VARCHAR(n), then when the tables are created in the target database, the VARCHAR2(n) field type from the source database needs to be converted into VARCHAR(n). A table-creation SQL statement is then generated according to the data table information corresponding to the source database and the converted field information, and the corresponding data tables and fields are created in the target database by executing that statement.
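The field type conversion and table-creation statement generation can be sketched as below. The mapping table contains only the VARCHAR2 to VARCHAR pair mentioned in the text; `TYPE_MAP`, `convert_type`, and `build_create_table` are hypothetical names, and a real mapping between two database products would need many more entries and special cases.

```python
import re

# Hypothetical, deliberately tiny Oracle-to-MySQL field type mapping.
TYPE_MAP = {"VARCHAR2": "VARCHAR"}


def convert_type(field_type, type_map=TYPE_MAP):
    """Map the base type name and keep any length argument such as (50)."""
    match = re.fullmatch(r"\s*(\w+)\s*(\(.*\))?\s*", field_type)
    if match is None:
        return field_type  # leave unrecognised type strings untouched
    base, args = match.group(1).upper(), match.group(2) or ""
    return type_map.get(base, base) + args


def build_create_table(table, fields, same_type):
    """Build a table-creation statement from (field name, field type) pairs."""
    columns = [
        f"{name} {ftype if same_type else convert_type(ftype)}"
        for name, ftype in fields
    ]
    return f"CREATE TABLE {table} ({', '.join(columns)})"


sql = build_create_table(
    "customer", [("id", "INTEGER"), ("name", "VARCHAR2(50)")], same_type=False
)
```

When `same_type` is true the source field types are copied unchanged, which corresponds to the same-database-type branch of step 202.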

Further, after a corresponding data table is created in the target database, when a data migration instruction triggered by a user is received, metadata information corresponding to a source database and metadata information corresponding to the target database selected by the user are obtained, that is, the data table information and the field information corresponding to the source database and the data table information and the field information corresponding to the target database are automatically read, so that the metadata information corresponding to the source database and the metadata information corresponding to the target database are matched, and according to a matching result, data table tasks are created and executed in batches, so that the data migration efficiency is improved.

203. And matching the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database.

The metadata information corresponding to the source database includes data table information and field information corresponding to each first data table in the source database, and the metadata information corresponding to the target database includes data table information and field information corresponding to each second data table in the target database.

For the embodiment of the present invention, in order to obtain the matching information between the source database and the target database, step 203 specifically includes: matching the data table information corresponding to a target first data table among the first data tables with the data table information corresponding to each second data table; if a target second data table matching the target first data table exists among the second data tables, establishing a table mapping relationship between the target first data table and the target second data table, and generating table matching information between the target first data table and the target second data table based on the table mapping relationship; matching the field information corresponding to each field in the target first data table with the field information corresponding to each field in the target second data table, and generating field matching information according to the matching result; and determining the matching information between the source database and the target database according to the table matching information and the field matching information. The target first data table refers to any one of the first data tables in the source database.

Specifically, the data table information corresponding to any first data table in the source database (the target first data table) is matched with the data table information corresponding to each second data table in the target database, and it is judged whether a target second data table matching the target first data table exists among the second data tables. If the data table information corresponding to a certain second data table is the same as the data table information corresponding to the target first data table, that second data table is determined to be the target second data table, that is, the target first data table matches the target second data table. A mapping relationship between the two is established and marked as matching, and the table matching information between the target first data table and the target second data table is generated. Further, the field information corresponding to each field in the target first data table is matched with the field information corresponding to each field in the target second data table. For example, the field information corresponding to field a in the target first data table is matched with the field information corresponding to each field in the target second data table; if the target second data table contains a target field whose field information is identical to that of field a, it is determined that the target field matches field a, a mapping relationship between the two fields is established, the two fields are marked as matching, and the field matching information is generated.

Further, as an optional implementation manner, for a specific process of generating the field matching information, the method includes: if the field type corresponding to the target field in each field of the target first data table is a primary key field, determining the target field as a target increment field, matching field information corresponding to the target increment field with field information corresponding to an increment field in the target second data table, and generating primary key field matching information according to a matching result; and if the field type corresponding to the target field is a non-primary key field, matching the field information corresponding to the target field with the field information corresponding to the non-increment field in the target second data table, and generating non-primary key field matching information according to a matching result. The target field is any one of fields of the target first data table.

For example, if it is determined that first data table a in the source database matches second data table a in the target database, the matching fields of the two tables need to be determined next. In the field matching process, the field type is determined first. If the target field in first data table a is an ordinary field, that is, a non-primary key field, its field information (field name) is used directly for matching; if the target field in first data table a is a primary key field, it is treated by default as the incremental field and matched with the incremental field in second data table a.
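The distinction between primary key fields and ordinary fields can be sketched as follows. The field layout (dicts with `name` and `primary_key` keys) and the function name `match_fields` are hypothetical, and the sketch additionally assumes that the incremental field of the second data table is its primary key field, which the description does not state explicitly.

```python
def match_fields(source_fields, target_fields):
    """Match source fields to target fields.

    A primary key field is treated as the incremental field and paired with
    the target table's incremental field (assumed here to be its primary key);
    every other field is paired by identical field name.
    """
    target_increment = next(
        (f["name"] for f in target_fields if f["primary_key"]), None
    )
    target_plain = {f["name"] for f in target_fields if not f["primary_key"]}
    result = []
    for field in source_fields:
        if field["primary_key"]:
            target, kind = target_increment, "primary_key"
        else:
            target = field["name"] if field["name"] in target_plain else None
            kind = "non_primary_key"
        result.append({
            "source": field["name"],
            "target": target,
            "kind": kind,
            "matched": target is not None,
        })
    return result


source_fields = [
    {"name": "id", "primary_key": True},
    {"name": "name", "primary_key": False},
    {"name": "age", "primary_key": False},
]
target_fields = [
    {"name": "uid", "primary_key": True},
    {"name": "name", "primary_key": False},
]
field_info = match_fields(source_fields, target_fields)
```

In this example `id` is paired with `uid` as the incremental field, `name` is paired by name, and `age` is left unmatched.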

In a specific application scenario, in order to make it easier for the user to adjust the matching relationship between data tables, when no second data table in the target database matches the target first data table, the similarity between the target first data table and each second data table may be calculated, and the second data table closest to the target first data table is selected according to the similarity. On this basis, after matching the data table information corresponding to the target first data table with the data table information corresponding to each second data table, the method further includes: if no target second data table matching the target first data table exists among the second data tables, respectively calculating the similarity between the data table information corresponding to each second data table and the data table information corresponding to the target first data table, and determining, based on the similarity, the second data table closest to the target first data table; and establishing a table mapping relationship between the target first data table and the closest second data table, and generating table matching information between the target first data table and the closest second data table based on the table mapping relationship.

For example, if the second data table closest to the target first data table is determined to be second data table a, a table mapping relationship between the target first data table and second data table a is established, the two tables are marked as unmatched, and the table matching information between the target first data table and second data table a is generated. After the matching information between all the first data tables and second data tables has been generated, it is displayed to the user on a web page, and the user can adjust the marks between the data tables. For example, the user can change the mark between the target first data table and second data table a to matching, or modify the data table information corresponding to second data table a so that the two tables match and then regenerate the table matching information between them.

Further, when calculating the similarity between each second data table and the target first data table, if the data table information (data table names) corresponding to the first data table and the second data table are character strings, the edit distance between the character string corresponding to each second data table and the character string corresponding to the target first data table is calculated, and the second data table most similar to the target first data table is selected based on the edit distance. The larger the edit distance, the lower the similarity between the data tables; conversely, the smaller the edit distance, the higher the similarity. The minimum edit distance can therefore be selected from the calculated edit distances, and the second data table corresponding to it is determined to be the second data table closest to the target first data table.

For example, suppose character string A, corresponding to a second data table, is adfgc, and character string B, corresponding to the target first data table, is aefbc. To calculate the edit distance between string A and string B, each character in string A is compared with the character at the same position in string B, and the number of differing characters is counted. Since the second character "d" in string A differs from the second character "e" in string B, and the fourth character "g" in string A differs from the fourth character "b" in string B, the edit distance between string A and string B is 2, that is, the edit distance between this second data table and the target first data table is 2. The edit distance between each second data table in the target database and the target first data table can be calculated in the same way, after which the minimum edit distance is selected and the second data table corresponding to it is determined to be the second data table closest to the target first data table.

In a specific application scenario, if the data table information (data table names) corresponding to the first data table and the second data table are fields, the existing word2vec word vector method may be used to determine the vectors corresponding to each second data table and to the target first data table, and the Euclidean distance between the vector corresponding to each second data table and the vector corresponding to the target first data table is then calculated. The greater the Euclidean distance, the lower the similarity between the data tables; conversely, the smaller the Euclidean distance, the higher the similarity. The minimum Euclidean distance can therefore be selected from the calculated Euclidean distances, and the second data table corresponding to it is determined to be the second data table closest to the target first data table.
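The nearest-table selection by edit distance can be sketched as below. Note the assumption: the worked example counts differing positions in two equal-length strings, whereas this sketch uses the standard Levenshtein edit distance, which gives the same value of 2 for that example and also handles names of different lengths. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling row of the DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def closest_table(target_first, second_tables):
    """Return the second data table name with the smallest edit distance."""
    return min(second_tables, key=lambda name: edit_distance(name, target_first))


example_distance = edit_distance("adfgc", "aefbc")
nearest = closest_table("aefbc", ["adfgc", "aefbd", "zzzzz"])
```

If several candidates share the minimum distance, `min` returns the first one in the list; the description does not say how such ties should be resolved.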
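Selecting the nearest table by Euclidean distance might be sketched as follows. The sketch does not train or load a word2vec model; the three-dimensional vectors are made-up placeholders standing in for whatever vectors such a model would produce, and `closest_by_vector` is a hypothetical name.

```python
import math


def closest_by_vector(target_vector, candidate_vectors):
    """Return the candidate name whose vector has the smallest Euclidean distance."""
    return min(
        candidate_vectors,
        key=lambda name: math.dist(candidate_vectors[name], target_vector),
    )


# Placeholder vectors; real ones would come from a trained word vector model.
vectors = {"customer": [0.9, 0.1, 0.0], "invoice": [0.0, 0.8, 0.6]}
nearest_by_vector = closest_by_vector([1.0, 0.0, 0.0], vectors)
```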

204. And creating data table tasks in batches based on the matching information.

For the embodiment of the present invention, the step 204 specifically includes, for creating the data table task in batch: determining a first data table and a second data table marked as matching according to the table matching information; according to the field matching information, determining matching fields corresponding to the first data table and the second data table which are marked as matching; respectively generating a data reading function and a data writing function based on the first data table and the second data table marked as matching and the matching field; creating execution files in batches based on the data reading function and the data writing function; determining the spreadsheet task based on the execution file.

Specifically, in the process of creating data table tasks in batches, execution files need to be generated in batches. Each execution file includes a data reading function and a data writing function: the data reading function is used to read data from a first data table in the source database, and the data writing function is used to write that data into the corresponding second data table in the target database. The data reading function corresponds to a data reading parameter, which represents the first data table and the field where the data to be read is located; the data writing function corresponds to a data writing parameter, which represents the second data table and the field into which the data is written. In the prior art, the data reading parameters and the data writing parameters need to be filled in manually by a warehousing engineer, so the efficiency of data migration is low. In the embodiment of the present invention, each group of data tables marked as matching and their corresponding matching fields can be read directly from the generated matching information, and execution files are then created in batches according to each group of data tables marked as matching and their corresponding matching fields, where each execution file corresponds to one group of data tables marked as matching. The data migration can then be completed by executing the created execution files.

For example, suppose it is determined from the matching information that data table 1 in the source database and data table 2 in the target database are marked as matching, and that field a in data table 1 and field b in data table 2 are marked as matching. In the process of creating the execution file for data table 1 and data table 2, data table 1 and field a are used as data reading parameters to generate a data reading function for data table 1; similarly, data table 2 and field b are used as data writing parameters to generate a data writing function for data table 2. The execution file for data table 1 and data table 2 is created based on the generated data reading function and data writing function, and a data migration task between data table 1 and data table 2 can then be created.
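
The batch creation described above might be sketched as follows. This is an assumption-laden illustration: the `TableMatch` structure, the function names, and the choice of representing the data reading and data writing functions as SQL strings held in a dictionary are all hypothetical, and the embodiment does not specify the format of an execution file.

```python
from dataclasses import dataclass


@dataclass
class TableMatch:
    """One group of tables marked as matching (hypothetical structure)."""
    source_table: str
    target_table: str
    field_pairs: list[tuple[str, str]]  # (source field, target field)


def build_read_statement(match: TableMatch) -> str:
    """Data reading side: source table and fields act as reading parameters."""
    columns = ", ".join(src for src, _ in match.field_pairs)
    return f"SELECT {columns} FROM {match.source_table}"


def build_write_statement(match: TableMatch) -> str:
    """Data writing side: target table and fields act as writing parameters."""
    columns = ", ".join(dst for _, dst in match.field_pairs)
    placeholders = ", ".join("?" for _ in match.field_pairs)
    return f"INSERT INTO {match.target_table} ({columns}) VALUES ({placeholders})"


def create_tasks(matches: list[TableMatch]) -> list[dict]:
    """Create one task description per matched table pair, in a single pass."""
    return [
        {
            "name": f"{m.source_table}_to_{m.target_table}",
            "read": build_read_statement(m),
            "write": build_write_statement(m),
        }
        for m in matches
    ]


tasks = create_tasks([TableMatch("table_1", "table_2", [("a", "b")])])
```

Because every parameter is taken from the matching information, no table or field name has to be typed in by hand, which is the point of the batch creation step. Each dictionary could be serialized to a file if a file-based execution format is required.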

205. Executing the data table task, and migrating the data in the source database to the target database.

For the embodiment of the present invention, after the data table tasks are created in batches, the data table tasks are executed and the data in the source database is migrated to the target database. The data migration process is the same as that in step 104 and is not described herein again.

Compared with the current approach in which a warehousing engineer manually creates a task for each data table, the data migration method provided by the embodiment of the present invention can acquire metadata information respectively corresponding to a source database and a target database; match the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database; create data table tasks in batches based on the matching information; and finally execute the data table tasks to migrate the data in the source database to the target database. Because the data table tasks are created in batches from the matching information obtained by matching the two sets of metadata information, the workload of the warehousing engineer is reduced and the data migration efficiency is improved.

Further, as a specific implementation of fig. 1, an embodiment of the present invention provides a data migration apparatus. As shown in fig. 3, the apparatus includes: an acquisition unit 31, a matching unit 32, a creating unit 33, and a migration unit 34.

The acquisition unit 31 may be configured to obtain metadata information corresponding to the source database and the target database, respectively.

The matching unit 32 may be configured to match metadata information corresponding to the source database with metadata information corresponding to the target database, so as to generate matching information between the source database and the target database.

The creating unit 33 may be configured to create data table tasks in batches based on the matching information.

The migration unit 34 may be configured to execute the data table task and migrate the data in the source database to the target database.

In a specific application scenario, the metadata information corresponding to the source database includes data table information and field information corresponding to each first data table in the source database, and the metadata information corresponding to the target database includes data table information and field information corresponding to each second data table in the target database, as shown in fig. 4, the matching unit 32 includes: a matching module 321, a generating module 322 and a determining module 323.

The matching module 321 may be configured to match data table information corresponding to a target first data table in each first data table with data table information corresponding to each second data table.

The generating module 322 may be configured to, if a target second data table that matches the target first data table exists in each second data table, establish a table mapping relationship between the target first data table and the target second data table, and generate table matching information between the target first data table and the target second data table based on the table mapping relationship.

The generating module 322 may be further configured to match field information corresponding to each field in the target first data table with field information corresponding to each field in the target second data table, and generate field matching information according to a matching result.

The determining module 323 may be configured to determine matching information between the source database and the target database according to the table matching information and the field matching information.

Further, the matching unit 32 further includes: a calculating module 324.

The calculating module 324 may be configured to, if there is no target second data table matching the target first data table in the second data tables, respectively calculate similarity between data table information corresponding to the second data tables and data table information corresponding to the target first data table, and determine, based on the similarity, a second data table that is closest to the target first data table in the second data tables.

The generating module 322 may be further configured to establish a table mapping relationship between the target first data table and the closest second data table, and generate table matching information between the target first data table and the closest second data table based on the table mapping relationship.

Further, the generating module 322 may be specifically configured to: if the field type corresponding to a target field among the fields of the target first data table is a primary key field, determine the target field as a target increment field, match field information corresponding to the target increment field with field information corresponding to an increment field in the target second data table, and generate primary key field matching information according to the matching result; and if the field type corresponding to the target field is a non-primary key field, match the field information corresponding to the target field with the field information corresponding to a non-increment field in the target second data table, and generate non-primary key field matching information according to the matching result.
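
The branching on primary key and non-primary key fields can be illustrated with the sketch below. The dictionary keys `name`, `is_primary_key` and `is_increment`, the function name `match_fields`, and the rule that two fields match when their names are identical are all assumptions made for illustration; the embodiment does not fix how field information is compared.

```python
def match_fields(source_fields, target_fields):
    """Split field matches into primary key and non-primary key groups.

    Source fields carry 'name' and 'is_primary_key'; target fields carry
    'name' and 'is_increment' (hypothetical keys). Matching by identical
    name is an assumption of this sketch.
    """
    increment_names = {f["name"] for f in target_fields if f["is_increment"]}
    other_names = {f["name"] for f in target_fields if not f["is_increment"]}

    primary_matches, non_primary_matches = [], []
    for field in source_fields:
        name = field["name"]
        if field["is_primary_key"]:
            # A primary key field is treated as the target increment field
            # and compared only with increment fields of the target table.
            if name in increment_names:
                primary_matches.append((name, name))
        elif name in other_names:
            # Other fields are compared only with non-increment fields.
            non_primary_matches.append((name, name))
    return primary_matches, non_primary_matches


source = [
    {"name": "id", "is_primary_key": True},
    {"name": "city", "is_primary_key": False},
    {"name": "legacy_code", "is_primary_key": False},
]
target = [
    {"name": "id", "is_increment": True},
    {"name": "city", "is_increment": False},
]
primary, non_primary = match_fields(source, target)
```

In this example `legacy_code` has no counterpart in the target table, so it appears in neither result list.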

In a specific application scenario, the creating unit 33 includes: a determining module 331, a generating module 332, and a creating module 333.

The determining module 331 may be configured to determine the first data table and the second data table marked as matching according to the table matching information.

The determining module 331 may be further configured to determine, according to the field matching information, a matching field corresponding to the first data table and the second data table that are marked as matching.

The generating module 332 may be configured to generate a data reading function and a data writing function based on the first data table and the second data table marked as matching and the matching field, respectively.

The creating module 333 may be configured to create the execution file in batch based on the data reading function and the data writing function.

The determining module 331 may be further configured to determine the data table task based on the execution file.

Further, the acquisition unit 31 includes: a configuration module 311, a collection module 312, and a generating module 313.

The configuration module 311 may be configured to configure database information corresponding to the source database and the target database, respectively.

The collection module 312 may be configured to collect metadata information corresponding to the source database based on the database information corresponding to the source database.

The generating module 313 may be configured to generate metadata information corresponding to the target database based on the metadata information corresponding to the source database.

Further, the generating module 313 may be specifically configured to generate and execute an SQL table building statement according to the metadata information corresponding to the source database to obtain the metadata information corresponding to the target database if the database type corresponding to the source database is the same as the database type corresponding to the target database; if the database type corresponding to the source database is different from the database type corresponding to the target database, performing field type conversion on metadata information corresponding to the source database according to a preset field type mapping relation to obtain converted metadata information corresponding to the source database, and generating and executing an SQL table building statement based on the converted metadata information to obtain the metadata information corresponding to the target database.
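
The two branches above (same database type versus different database types) might look like the following sketch. The mapping table `FIELD_TYPE_MAP`, its entries, and the function name `build_create_table` are hypothetical; a real preset field type mapping relation would depend on the concrete source and target databases. Leaving unmapped types unchanged is also an assumption of this sketch.

```python
# Hypothetical preset mapping from source field types to target field types.
FIELD_TYPE_MAP = {
    "VARCHAR": "STRING",
    "DATETIME": "TIMESTAMP",
    "INT": "INT",
}


def build_create_table(table, fields, same_db_type, type_map=FIELD_TYPE_MAP):
    """Build a CREATE TABLE statement from (field name, field type) metadata.

    When the database types differ, each field type is converted through
    type_map first; types missing from the map are left unchanged.
    """
    columns = []
    for name, field_type in fields:
        if not same_db_type:
            field_type = type_map.get(field_type.upper(), field_type)
        columns.append(f"{name} {field_type}")
    return f"CREATE TABLE {table} ({', '.join(columns)})"


statement = build_create_table(
    "customer", [("id", "INT"), ("name", "VARCHAR")], same_db_type=False
)
```

Executing the resulting statement against the target database would create the table whose metadata then serves as the metadata information corresponding to the target database.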

It should be noted that other corresponding descriptions of the functional modules related to the data migration apparatus provided in the embodiment of the present invention may refer to the corresponding description of the method shown in fig. 1, and are not described herein again.

Based on the method shown in fig. 1, correspondingly, an embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the following steps: acquiring metadata information respectively corresponding to a source database and a target database; matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database; based on the matching information, creating data table tasks in batches; and executing the data table task, and migrating the data in the source database to the target database.

Based on the above embodiments of the method shown in fig. 1 and the apparatus shown in fig. 3, an embodiment of the present invention further provides a physical structure diagram of a computer device. As shown in fig. 5, the computer device includes: a processor 41, a memory 42, and a computer program stored on the memory 42 and executable on the processor 41, wherein the memory 42 and the processor 41 are both arranged on a bus 43, and when the processor 41 executes the program, the following steps are performed: acquiring metadata information respectively corresponding to a source database and a target database; matching metadata information corresponding to the source database with metadata information corresponding to the target database to generate matching information between the source database and the target database; based on the matching information, creating data table tasks in batches; and executing the data table task, and migrating the data in the source database to the target database.

By the above technical scheme, the metadata information respectively corresponding to the source database and the target database can be acquired; the metadata information corresponding to the source database is matched with the metadata information corresponding to the target database to generate matching information between the source database and the target database; data table tasks are created in batches based on the matching information; and finally the data table tasks are executed to migrate the data in the source database to the target database. Because the data table tasks are created in batches from the matching information obtained by matching the two sets of metadata information, the workload of the warehousing engineer is reduced and the data migration efficiency is improved.

It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method of data migration, comprising:

based on the matching information, creating data table tasks in batches;

2. The method according to claim 1, wherein the metadata information corresponding to the source database includes data table information and field information corresponding to each first data table in the source database, the metadata information corresponding to the target database includes data table information and field information corresponding to each second data table in the target database, and the matching the metadata information corresponding to the source database with the metadata information corresponding to the target database to generate matching information between the source database and the target database includes:

matching data table information corresponding to a target first data table in each first data table with data table information corresponding to each second data table;

if a target second data table matched with the target first data table exists in each second data table, establishing a table mapping relation between the target first data table and the target second data table, and generating table matching information between the target first data table and the target second data table based on the table mapping relation;

matching field information corresponding to each field in the target first data table with field information corresponding to each field in the target second data table, and generating field matching information according to a matching result;

and determining matching information between the source database and the target database according to the table matching information and the field matching information.

3. The method according to claim 2, wherein after the matching the data table information corresponding to the target first data table in each first data table with the data table information corresponding to each second data table, the method further comprises:

if the target second data table matched with the target first data table does not exist in each second data table, respectively calculating the similarity between the data table information corresponding to each second data table and the data table information corresponding to the target first data table, and determining the second data table which is closest to the target first data table in each second data table based on the similarity;

and establishing a table mapping relation between the target first data table and the closest second data table, and generating table matching information between the target first data table and the closest second data table based on the table mapping relation.

4. The method according to claim 2, wherein the matching the field information corresponding to each field in the target first data table with the field information corresponding to each field in the target second data table, and generating the field matching information according to the matching result includes:

if the field type corresponding to the target field in each field of the target first data table is a primary key field, determining the target field as a target increment field, matching field information corresponding to the target increment field with field information corresponding to an increment field in the target second data table, and generating primary key field matching information according to a matching result;

and if the field type corresponding to the target field is a non-primary key field, matching the field information corresponding to the target field with the field information corresponding to the non-increment field in the target second data table, and generating non-primary key field matching information according to a matching result.

5. The method of claim 2, wherein the creating data table tasks in batches based on the matching information comprises:

determining a first data table and a second data table marked as matching according to the table matching information;

according to the field matching information, determining matching fields corresponding to the first data table and the second data table which are marked as matching;

respectively generating a data reading function and a data writing function based on the first data table and the second data table marked as matching and the matching field;

creating execution files in batches based on the data reading function and the data writing function;

determining the data table task based on the execution file.

6. The method according to claim 1, wherein the obtaining metadata information corresponding to the source database and the target database respectively comprises:

respectively configuring database information corresponding to the source database and the target database;

collecting metadata information corresponding to the source database based on the database information corresponding to the source database;

and generating metadata information corresponding to the target database based on the metadata information corresponding to the source database.

7. The method of claim 6, wherein the generating metadata information corresponding to the target database based on the metadata information corresponding to the source database comprises:

if the database type corresponding to the source database is the same as the database type corresponding to the target database, generating and executing an SQL table building statement according to the metadata information corresponding to the source database to obtain the metadata information corresponding to the target database;

if the database type corresponding to the source database is different from the database type corresponding to the target database, performing field type conversion on metadata information corresponding to the source database according to a preset field type mapping relation to obtain converted metadata information corresponding to the source database, and generating and executing an SQL table building statement based on the converted metadata information to obtain the metadata information corresponding to the target database.

8. A data migration apparatus, comprising:

9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 7 when executed by the processor.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.