CN102110121B - A kind of data processing method and system thereof - Google Patents

A kind of data processing method and system thereof

Info

Publication number
CN102110121B
Authority
CN
China
Prior art keywords
data
data record
data records
database
records
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN200910260173.6A
Other languages
Chinese (zh)
Other versions
CN102110121A (en
Inventor
覃健祥
常国斌
张宋景
李翀
朱明君
全鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN200910260173.6A priority Critical patent/CN102110121B/en
Publication of CN102110121A publication Critical patent/CN102110121A/en
Priority to HK11108936.7A priority patent/HK1155236B/en
Application granted granted Critical
Publication of CN102110121B publication Critical patent/CN102110121B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a data processing method and system. The method comprises: after a request to update a data record in a database is obtained, inserting into the database, according to the data record requested to be updated, a new data record with the same object identifier, and adding version information to the newly inserted data record, the version information indicating the chronological order of the update operations on data records with the same object identifier; and, when the database is replicated to another database, performing the database replication operation in parallel using multiple threads. With this data processing method and system, the efficiency of database replication can be improved while data consistency is ensured to a certain extent.

Description

Data processing method and system
Technical Field
The present application relates to data processing technologies in the field of communications, and in particular, to a data processing method and system.
Background
Mainstream relational databases (e.g., MySQL, PostgreSQL, Oracle) provide master-slave replication (Replication). Assume that A is the master server and B is the slave server, and that a database-based service system has deployed replication from database server A to database server B. When insert (Insert), delete (Delete), and update (Update) operations occur on database server A, these operations are synchronized to database server B; this process is called replication (Replication).
In general, when people deal with database-based services, four operations of adding (Insert), deleting (Delete), changing (Update) and searching (Select) data records in a database are often used. After the database replication is deployed, the operations (including addition, deletion, and modification) related to the data record writing operation are synchronized by the master database server to the slave database server in sequence.
However, such synchronization must preserve the order of the operations. Assume a business system that records user signatures, with a master database server DB1 and a slave database server DB2, in which the following series of operations occurs on DB1:
Scenario 1: the user with ID 12065 sets his signature to "dog in leisure";
the corresponding SQL statement 1 is: UPDATE user_sign SET nickname = "dog in leisure" WHERE userid = 12065;
the corresponding database operation is: update the nickname of the user with ID 12065 to "dog in leisure";
Scenario 2: the user with ID 12065 sets his signature to "fruit in midsummer";
the corresponding SQL statement 2 is: UPDATE user_sign SET nickname = "fruit in midsummer" WHERE userid = 12065;
the corresponding database operation is: update the nickname of the user with ID 12065 to "fruit in midsummer";
Scenario 3: the user with ID 12065 sets his signature to "about in winter";
the corresponding SQL statement 3 is: UPDATE user_sign SET nickname = "about in winter" WHERE userid = 12065;
the corresponding database operation is: update the nickname of the user with ID 12065 to "about in winter".
When DB1 replicates to DB2, SQL statement 1, SQL statement 2, and SQL statement 3 must be executed in that sequence so that the data on DB1 and DB2 remain consistent. If the execution order on DB2 is SQL statement 1, SQL statement 3, and then SQL statement 2, the nickname of user 12065 is "about in winter" on DB1 but "fruit in midsummer" on DB2, and the data in the master and slave database servers are inconsistent.
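The ordering problem can be reproduced with a small sketch. It is only an illustration, assuming an in-memory SQLite database stands in for DB1 and DB2; the helper name apply_updates is hypothetical and not part of the application.

```python
import sqlite3

# The three nicknames set by SQL statements 1, 2 and 3 in the text.
UPDATES = ["dog in leisure", "fruit in midsummer", "about in winter"]

def apply_updates(order):
    """Apply the UPDATE statements in the given order; return the final nickname."""
    db = sqlite3.connect(":memory:")  # assumption: SQLite stands in for a database server
    db.execute("CREATE TABLE user_sign (userid INTEGER PRIMARY KEY, nickname TEXT)")
    db.execute("INSERT INTO user_sign (userid, nickname) VALUES (12065, '')")
    for index in order:
        db.execute(
            "UPDATE user_sign SET nickname = ? WHERE userid = 12065",
            (UPDATES[index],),
        )
    (nickname,) = db.execute(
        "SELECT nickname FROM user_sign WHERE userid = 12065"
    ).fetchone()
    db.close()
    return nickname

master = apply_updates([0, 1, 2])  # DB1: statements 1, 2, 3
slave = apply_updates([0, 2, 1])   # DB2: statements 1, 3, 2
print(master, "|", slave)
```

Because each UPDATE overwrites the same row in place, whichever statement runs last determines the final value, so the two servers end up with different nicknames.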
Because the operations that update data records must be executed in sequence during replication to keep the data in the master and slave database servers consistent, replication between the master and slave database servers is usually performed by a single thread, and its execution efficiency is low.
Summary of the Application
The embodiment of the application provides a database operation method and a system thereof, which are used for solving the problem of low database copy operation efficiency in the existing data processing technology.
The technical scheme provided by the embodiment of the application comprises the following steps:
a method of data processing comprising the steps of:
after a request for updating data records in a database is acquired, newly inserting data records with the same object identifier into the database according to the data records requested to be updated, and adding version information into the newly inserted data records, wherein the version information is used for indicating the time sequence of the insertion operation of the data records with the same object identifier;
performing a database copy operation in parallel with a plurality of threads when copying the database to another database; wherein,
newly inserting a data record with the same object identifier into the database according to the data record requested to be updated comprises the following steps:
inquiring all data records with the same object identification as the data record requested to be updated in the database, and determining the data record inserted last in the data records according to the version information in the data records;
and merging and generating a data record with the same object identifier according to the updated data content carried in the request and the data content in the last inserted data record, and inserting the generated data record into the database.
A data processing system comprising:
the acquisition module is used for acquiring a request for updating the data records in the database;
the updating processing module is used for newly inserting a data record with the same object identifier into the database according to the data record required to be updated after acquiring a request for updating the data record, and adding version information into the newly inserted data record, wherein the version information is used for indicating the time sequence of the inserting operation of the data record with the same object identifier;
the copy processing module is used for executing the database copy operation in parallel by utilizing a plurality of threads when the database is copied to another database; wherein,
when newly inserting a data record with the same object identifier, the update processing module queries the database for all data records having the same object identifier as the data record requested to be updated, and determines the last inserted data record among them according to the version information in the data records; it then merges the updated data content carried in the request with the other data content of the last inserted data record to generate a data record, and inserts the generated data record into the database.
In the embodiment of the application, when a request for updating a data record in the database is acquired, a data record with the same object identifier is newly inserted into the database and version information is added to the newly inserted data record, so that the data update operation is replaced by a data insert operation; when the database is replicated, multiple threads are used to execute the replication operation in parallel. On the one hand, executing the replication operation in parallel with multiple threads improves the efficiency of database replication; on the other hand, because the operation of updating a data record is replaced by the operation of inserting a data record, the data consistency of the master and slave databases is not compromised even if, as a result of multithreaded replication, the data records are not replicated strictly in the chronological order of their insert operations.
Drawings
FIG. 1 is a block diagram of a data processing system according to an embodiment of the present application;
FIG. 2 is a second schematic diagram of a data processing system according to an embodiment of the present application;
FIG. 3 is a third schematic structural diagram of a data processing system according to an embodiment of the present application.
Detailed Description
Aiming at the problems in the prior art, the embodiment of the application replaces the data updating operation with the data inserting operation and performs the database copying operation in parallel by using multiple threads, so that the database copying efficiency is improved on the premise of ensuring the data consistency of the master database and the slave database.
The data processing system of the embodiment of the application can operate on the data records in the database, including insert (Insert), delete (Delete), and query (Select) operations on data records, replication (Replication) of data from the master database to the slave database, and so on. The database tables in the master database and the slave database each include a field that records the version information of a data record. A data record in a database table typically includes an identification field describing the unique identity of an object, such as a user ID, while the other fields describe attributes of the object, such as a user name or age, and may also describe attributes of the data record itself, such as its version information.
The data processing system of the embodiment of the application can acquire various data processing requests input by a user through a human-computer interaction interface provided by the data processing system, such as: a request to create, update, delete or query a data record, or a request to make a database copy;
when a system acquires a request for creating a data record, generating a statement for inserting the data record, inserting the data record into a database by executing the statement, and adding version information into the newly inserted data record;
when the system acquires a request for updating a data record, it generates a statement for inserting a data record and inserts the data record into the database by executing the statement, wherein the newly inserted data record carries version information and has the same object identifier as the data record requested to be updated;
when the system acquires a request for querying a data record, it generates a query statement according to the identifier of the queried object and, by executing the statement, returns the content of the most recently inserted data record among all data records with that object identifier;
when the system performs database replication according to a deployed replication policy (such as a replication cycle) or an acquired replication request, the replication operation may be executed in parallel by multiple threads and may further be distributed across multiple servers.
The data processing system of the embodiment of the application may further include a data record deletion function. When the system acquires a request for deleting a data record, it generates a statement for inserting a data record and, by executing the statement, inserts into the database a data record having the same object identifier as the data record requested to be deleted, adds to it version information indicating the chronological order of its insertion, and marks it as deleted. Correspondingly, when the system acquires a data record query request, it first finds the most recently inserted data record according to the identifier of the queried object and then checks whether that data record is marked as deleted; if so, a "not found" query result is returned, and otherwise the content of the most recently inserted data record is returned as the query result.
The version information described above may be generated by the data processing system when a data record is inserted. The version information is used for indicating the chronological order of the insertion operations of the data records with the same object identification. Because the data processing system of the embodiment of the application converts the updating operation of the data record into the inserting operation of the data record, for the data record of the same object identifier, if a subsequent corresponding user requests for updating the content of the data record for many times, a corresponding number of new data records can be inserted into the database table. In order to provide accurate query results to a user (i.e. a plurality of data records identified for the same object only return the latest inserted data record therein as a query result) when querying a database, version information of each data record needs to be identified to indicate the time sequence of data record insertion. The version information may be a sequence number which is sequentially increased, or may be a timestamp, or other character combinations which can indicate the chronological order.
Because the data processing system of the embodiment of the application replaces the update operation and the deletion of the data records with the insertion operation of the data records, and each data record contains the version information, when the database is copied, the database is not required to be copied according to the time sequence executed by each data operation strictly, and therefore the database copying operation can be performed in a multithreading concurrent mode and a multi-server distributed mode, the data consistency is guaranteed to a certain extent, and meanwhile the efficiency of the database copying operation is improved.
Embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to FIG. 1, a schematic structural diagram of a data processing system according to an embodiment of the present application is shown; the system may be implemented by software programming. The data processing system includes the obtaining module 101, the update processing module 103, and the copy processing module 106, and may further include an insertion processing module 102, a deletion processing module 104, and a query processing module 105.
The main functions of each functional module in the data processing system comprise:
the acquisition module 101: various data processing requests input by a user through a man-machine interaction interface provided by the data processing system can be acquired, such as: a request for creating, updating, deleting or inquiring data records or a request for copying a database is made, and a corresponding data processing functional module can be triggered to perform data processing operation according to the received data processing request;
the insertion processing module 102: a statement for inserting a data record can be generated according to a request for creating a data record acquired by the acquisition module 101, and a data record is inserted into the database by executing the statement, wherein version information is added to the newly inserted data record;
the update processing module 103: a statement for inserting a data record may be generated according to a request for updating the data record acquired by the acquisition module 101, and a data record is inserted into the database by executing the statement, where version information is added to the newly inserted data record and the newly inserted data record has the same object identifier as the data record requested to be updated;
the deletion processing module 104: a statement for inserting a data record can be generated according to a request for deleting the data record acquired by the acquisition module 101, a new data record is inserted by executing the statement, and the data record is marked as deleted;
the query processing module 105: a statement for querying the data record can be generated according to the request for querying the data record obtained by the obtaining module 101, and a query result is returned by executing the statement;
the copy processing module 106: the database replication operation may be performed according to a request for replicating the database acquired by the acquisition module 101 or according to a deployed database replication policy (e.g., a replication cycle), and in the replication process, the database replication operation may be performed in a multi-server distributed manner and in a multi-thread parallel manner.
Various data processing flows of the embodiments of the present application will be described in detail below by taking as an example a database table structure of a data processing system (which may also be referred to as a business system) for recording a user signature shown in table 1. The business system for recording the user signature has the acquisition module 101, the insertion processing module 102, the update processing module 103, the query processing module 105 and the copy processing module 106 as described above, and may further include the deletion processing module 104.
Table 1: structure of the database table user_sign for recording user signatures
userid | signature | revision
In Table 1, userid, signature, and revision are the field names of the database table, and the meaning of each field is as follows:
userid: the user identification (ID);
signature: the signature content (also called the nickname);
revision: version information in the form of a serial number (hereinafter referred to as the version number).
When a user requests registration, registration information such as a user name, a password, and a user signature (for example, the user signature "dog in leisure") may be entered through the human-computer interface provided by the system, and a registration command submitted. The obtaining module 101 receives the registration information and the registration command and submits them to the insertion processing module 102; the insertion processing module 102 assigns a user ID (for example, 12065) to the user and then generates a data record insertion statement:
INSERT INTO user_sign
(userid, signature, revision) VALUES (12065, "dog in leisure", 1)
After the statement is executed, a new data record is inserted into the user_sign database table. The database table after the insertion is shown in Table 2; because the data record is newly created, the system assigns it version number 1.
Table 2: database table user_sign containing the initial user signature of userid = 12065
userid | signature | revision
12065 | dog in leisure | 1
When the user wants to update his user signature, he can log in to the system, enter the updated user signature "fruit in midsummer" through the human-computer interface provided by the system, and then select and submit the update command. The obtaining module 101 of the system receives the update command and the updated user signature and submits them to the update processing module 103; the update processing module 103 obtains the user ID of the logged-in user and then generates a data record insertion statement from the obtained user ID and the updated user signature:
INSERT INTO user_sign
(userid, signature, revision) VALUES (12065, "fruit in midsummer", 2)
After the statement is executed, a new data record is inserted into the user_sign database table; the database table after the insertion is shown in Table 3. Before generating the data record insertion statement, the update processing module 103 may query all data records with user ID 12065, find the maximum version number among them, and increment it to obtain the version number "2" of the data record to be inserted. Of course, if the version information is represented by a timestamp, it is sufficient to write the current time directly into the insertion statement as the version information.
Table 3: database table user_sign after the user with userid = 12065 updates the user signature for the first time
userid | signature | revision
12065 | dog in leisure | 1
12065 | fruit in midsummer | 2
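The two ways of producing version information described above can be sketched as follows. This is a minimal illustration, assuming SQLite as the database; the helper names next_revision and timestamp_revision are hypothetical, and the read-then-increment step ignores concurrent updaters for simplicity.

```python
import sqlite3
import time

def next_revision(db, userid):
    """Serial-number form: largest existing revision for this userid, plus one."""
    row = db.execute(
        "SELECT MAX(revision) FROM user_sign WHERE userid = ?", (userid,)
    ).fetchone()
    return (row[0] or 0) + 1  # MAX() yields NULL when the user has no record yet

def timestamp_revision():
    """Timestamp form: the current time in nanoseconds is used directly."""
    return time.time_ns()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE user_sign (userid INTEGER, signature TEXT, revision INTEGER)")
db.execute("INSERT INTO user_sign VALUES (12065, 'dog in leisure', 1)")

# The update request becomes an INSERT carrying the next version number.
rev = next_revision(db, 12065)
db.execute("INSERT INTO user_sign VALUES (12065, 'fruit in midsummer', ?)", (rev,))
print(rev)
```

The serial-number form needs one extra query per update, whereas the timestamp form needs none but relies on the clock to order the records.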
When the user wants to update his user signature again, he can log in to the system, enter the updated user signature "about in winter" through the human-computer interface provided by the system, and then select and submit the update command. The obtaining module 101 of the system receives the update command and the updated user signature and submits them to the update processing module 103, and the update processing module 103 generates the following data record insertion statement:
INSERT INTO user_sign
(userid, signature, revision) VALUES (12065, "about in winter", 3)
After the statement is executed, a new data record is inserted into the user_sign database table; the database table after the insertion is shown in Table 4:
Table 4: database table user_sign after the user with userid = 12065 updates the user signature for the second time
userid | signature | revision
12065 | dog in leisure | 1
12065 | fruit in midsummer | 2
12065 | about in winter | 3
Querying a user signature is typically performed by a system administrator. When a system administrator enters the userid 12065 through the management interface provided by the system and selects and submits the query command, the obtaining module 101 of the system receives the command and the user ID and submits them to the query processing module 105, and the query processing module 105 generates the following data record query statement:
SELECT signature FROM user_sign
WHERE userid=12065
ORDER BY revision DESC LIMIT 1
When the statement is executed, the signatures of the user with userid = 12065 are looked up in the user_sign database table shown in Table 4 and sorted in descending order of version number, and only the first data record in that order is returned. Because "about in winter" has the largest version number among all data records with userid = 12065, the query result is "about in winter".
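The query above can be exercised end to end with a short sketch, assuming an in-memory SQLite table populated with the three records of Table 4; the function name latest_signature is hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE user_sign (userid INTEGER, signature TEXT, revision INTEGER)")
db.executemany(
    "INSERT INTO user_sign (userid, signature, revision) VALUES (?, ?, ?)",
    [
        (12065, "dog in leisure", 1),
        (12065, "fruit in midsummer", 2),
        (12065, "about in winter", 3),
    ],
)

def latest_signature(db, userid):
    """Return the signature with the largest revision, or None if the user has no record."""
    row = db.execute(
        "SELECT signature FROM user_sign WHERE userid = ? "
        "ORDER BY revision DESC LIMIT 1",
        (userid,),
    ).fetchone()
    return row[0] if row else None

print(latest_signature(db, 12065))
```

An index on (userid, revision) would likely help such a query on a large table, although the application does not discuss indexing.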
If the data processing system also provides a function for deleting data records, the delete operation must likewise be converted into an operation that inserts a data record, that is, a new record is inserted and marked as deleted. Deleting data records from the user_sign database table is typically performed by a system administrator. If the current user_sign table is as shown in Table 4, when a system administrator enters or selects the user signature "about in winter" through the management interface provided by the system and selects and submits the delete command, the obtaining module 101 of the system receives the command and the user ID and submits them to the deletion processing module 104; the deletion processing module 104 determines that the user ID corresponding to "about in winter" is 12065 and generates the following data record insertion statement:
INSERT INTO user_sign
(userid, signature, deleted, revision) VALUES (12065, "about in winter", 1, 4)
Where 4 is the version number assigned to the newly inserted data record, and deleted is a deletion flag field, whose value is 1, indicating that the inserted data record is an inserted record generated according to the deletion request, indicating that the user-signed data record whose user ID is 12065 has been deleted.
After the statement is executed, a new data record is inserted into the user_sign database table; the database table after the insertion is shown in Table 5:
Table 5: database table user_sign after the user signature record of the user with userid = 12065 is deleted
userid | signature | revision | deleted
12065 | dog in leisure | 1 |
12065 | fruit in midsummer | 2 |
12065 | about in winter | 3 |
12065 | about in winter | 4 | 1
In a data processing system with a delete function, a query first finds the most recently inserted data record that matches the query condition and then checks whether that data record is marked as deleted; if not, the query result is generated from that data record, and otherwise a result indicating that the requested data was not found is returned. For example, for the database table user_sign shown in Table 5, when a system administrator enters the user signature "about in winter" through the management interface provided by the system and selects and submits the query command, the obtaining module 101 of the system receives the command and the user ID and submits them to the query processing module 105. Through the generated query statement, the query processing module 105 finds that "about in winter" belongs to userid = 12065 and that the most recently inserted data record with userid = 12065 is the one with version number 4; because the deleted field of that record is 1, the result "not found" is returned.
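Deletion as insertion, together with the query that honours the deletion mark, can be sketched as follows. This is an illustrative sketch only: it assumes SQLite, assumes that the deleted field defaults to 0 for ordinary records, and uses the hypothetical helper names delete_signature and query_signature.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE user_sign (userid INTEGER, signature TEXT, "
    "revision INTEGER, deleted INTEGER DEFAULT 0)"  # assumption: 0 means not deleted
)
db.executemany(
    "INSERT INTO user_sign (userid, signature, revision) VALUES (?, ?, ?)",
    [
        (12065, "dog in leisure", 1),
        (12065, "fruit in midsummer", 2),
        (12065, "about in winter", 3),
    ],
)

LATEST = (
    "SELECT signature, revision, deleted FROM user_sign "
    "WHERE userid = ? ORDER BY revision DESC LIMIT 1"
)

def delete_signature(db, userid):
    """Convert a delete request into an INSERT of a record marked as deleted."""
    row = db.execute(LATEST, (userid,)).fetchone()
    if row is None:
        return  # nothing to delete
    signature, revision, _ = row
    db.execute(
        "INSERT INTO user_sign (userid, signature, deleted, revision) "
        "VALUES (?, ?, 1, ?)",
        (userid, signature, revision + 1),
    )

def query_signature(db, userid):
    """Return the latest signature, or None when it is missing or marked deleted."""
    row = db.execute(LATEST, (userid,)).fetchone()
    if row is None or row[2] == 1:
        return None
    return row[0]

before = query_signature(db, 12065)
delete_signature(db, 12065)
after = query_signature(db, 12065)
print(before, "|", after)
```

No row is physically removed by delete_signature: the table grows to four records, matching Table 5, and only the query logic hides the deleted signature.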
When a master-slave database replication policy is deployed in the system, the copy processing module 106 may perform database replication according to the deployed policy (e.g., the execution cycle of the replication operation). During replication, the data replication operations can be performed in parallel by multiple threads and distributed across multiple servers. In the above example, the two data records with version number 2 and version number 3 may be synchronized to the slave database server in any order, that is, the two corresponding data record insert operations may be replayed on the slave database server in either order, and the data in the master and slave database servers are still consistent in the end.
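The order independence of insert-only replication can be illustrated with a small multithreaded sketch. It is a simplification under stated assumptions: the slave is an in-memory SQLite database shared by all worker threads and guarded by a lock, whereas a real deployment would more likely use one connection per thread or per server; the names replicate and latest_signature are hypothetical.

```python
import random
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

# Records present on the master (Table 4).
MASTER_ROWS = [
    (12065, "dog in leisure", 1),
    (12065, "fruit in midsummer", 2),
    (12065, "about in winter", 3),
]

def replicate(rows, workers=3):
    """Copy rows to a fresh slave database in arbitrary order using several threads."""
    slave = sqlite3.connect(":memory:", check_same_thread=False)
    slave.execute(
        "CREATE TABLE user_sign (userid INTEGER, signature TEXT, revision INTEGER)"
    )
    lock = threading.Lock()  # serialises access to the single shared connection

    def copy_row(row):
        with lock:
            slave.execute(
                "INSERT INTO user_sign (userid, signature, revision) VALUES (?, ?, ?)",
                row,
            )

    shuffled = list(rows)
    random.shuffle(shuffled)  # deliberately ignore the original insertion order
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(copy_row, shuffled))
    slave.commit()
    return slave

def latest_signature(db, userid):
    row = db.execute(
        "SELECT signature FROM user_sign WHERE userid = ? "
        "ORDER BY revision DESC LIMIT 1",
        (userid,),
    ).fetchone()
    return row[0] if row else None

slave = replicate(MASTER_ROWS)
print(latest_signature(slave, 12065))
```

Whatever order the threads insert the rows in, the slave ends up with the same set of records, and the query selects the latest one by version number rather than by arrival order.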
The user_sign database table above has only one signature field in addition to the user ID and version information fields; in practical applications, a database table may contain more fields. Table 6 gives an example of a user_sign database table with more fields, in which an age field is added:
Table 6: user_sign table with more fields
userid | signature | age | revision
12065 | dog in leisure | 18 | 1
12065 | fruit in midsummer | 18 | 2
For the database table shown in table 6, the processes and principles of the data record insertion operation when the user registers, the operation of deleting the user record, and the database copy operation are the same as the corresponding flows described above. When a user requests to update the data record, the process is as follows:
When the user wants to update his user signature, he can log in to the system, enter the updated user signature "about in winter" through the human-computer interface provided by the system, and then select and submit the update command. The obtaining module 101 of the system receives the update command and the updated user signature and submits them to the update processing module 103. The update processing module 103 obtains the user ID of the logged-in user, 12065, queries the user_sign table for the latest data record with userid = 12065, and then uses the updated user signature "about in winter" together with the age field value "18" from the queried data record as the field values of the new data record, generating the following data record insertion statement:
INSERT INTO user_sign
(userid, signature, age, revision) VALUES (12065, "about in winter", 18, 3)
After the statement is executed, a new data record is inserted into the user_sign database table; the database table after the insertion is shown in Table 7:
Table 7: user_sign table after the user signature is updated
userid | signature | age | revision
12065 | dog in leisure | 18 | 1
12065 | fruit in midsummer | 18 | 2
12065 | about in winter | 18 | 3
It can be seen that, when the INSERT statement is generated, the value of each field is taken from the update request if the request carries a value for that field; otherwise, the corresponding field value of the latest version of the queried data record is used.
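The field-merging rule can be sketched as follows, assuming SQLite and the two attribute fields of Table 6; the function name update_record and the dictionary-based request format are hypothetical choices made for illustration.

```python
import sqlite3

FIELDS = ("signature", "age")  # attribute fields other than userid and revision

def update_record(db, userid, changes):
    """Insert a new revision: submitted fields win, the rest come from the latest record."""
    row = db.execute(
        "SELECT signature, age, revision FROM user_sign "
        "WHERE userid = ? ORDER BY revision DESC LIMIT 1",
        (userid,),
    ).fetchone()
    if row is None:
        raise LookupError(f"no record for userid {userid}")
    latest = dict(zip(FIELDS, row[:2]))
    merged = {name: changes.get(name, latest[name]) for name in FIELDS}
    db.execute(
        "INSERT INTO user_sign (userid, signature, age, revision) VALUES (?, ?, ?, ?)",
        (userid, merged["signature"], merged["age"], row[2] + 1),
    )
    return merged

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE user_sign (userid INTEGER, signature TEXT, age INTEGER, revision INTEGER)"
)
db.executemany(
    "INSERT INTO user_sign VALUES (?, ?, ?, ?)",
    [(12065, "dog in leisure", 18, 1), (12065, "fruit in midsummer", 18, 2)],
)

# The request only carries a new signature; age is copied from revision 2.
merged = update_record(db, 12065, {"signature": "about in winter"})
print(merged)
```

After the call, the table holds the three records of Table 7, with the age value 18 carried over unchanged into revision 3.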
Because the data processing system according to the embodiment of the present application replaces the operation of updating a data record with the operation of inserting a data record, redundant data records may accumulate in the database table for the same object identifier. For example, in the user_sign database table shown in Table 7, multiple versions of the user signature (signature) are stored for userid = 12065. To safeguard the capacity and performance of the database, in the embodiment of the present application only a certain number of data records may be kept for each object identifier, and the other, redundant data records may be deleted. There are various ways to delete redundant data records, such as:
Method 1: cleaning data on a schedule
A cleaning program may be written in advance and then executed periodically. The cleaning program traverses the database table and deletes all or part of the redundant data records in it. For example, for the above database table named user_sign, each time the cleaning program is executed it traverses the data records in the table and finds that several data records with userid 12065 exist; according to the preset setting, only the data record with the largest version number is retained and the data records with other version numbers are deleted. Alternatively, when the threshold for the number of retained data records is 2, the data records are sorted in descending order of version number, the 2 data records with the largest version numbers are retained, and the other data records are deleted.
Method 2: cleaning data each time a data record is inserted
Before or after each insertion of a data record, the database table may be traversed and the redundant data records in it deleted, in whole or in part. For example, for the above database table named user_sign, each time a data record is inserted because a user requests to update a user signature, the data records in the table are traversed and several data records with userid 12065 are found; according to the preset setting, only the data record with the largest version number may be retained and the data records with other version numbers deleted. Alternatively, when the threshold for the number of retained data records is 2, the data records are sorted in descending order of version number, the 2 data records with the largest version numbers are retained, and the other data records are deleted.
The cleaning operation can be implemented by a corresponding functional module, for example by adding a first cleaning processing module 107 to the data processing system of the embodiment of the present application, as shown in fig. 2, or by adding a second cleaning processing module 108, as shown in fig. 3. The first cleaning processing module 107 cleans data at scheduled times, and the second cleaning processing module 108 cleans data each time a data record is inserted.
Comparing the two approaches, cleaning each time a data record is inserted increases the coupling of the insertion operation and slows its response. In practical applications, therefore, cleaning data at scheduled times can be preferred in order to obtain better reliability and a shorter response time.
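The retention rule shared by both cleaning approaches (keep the records with the largest version numbers up to a threshold, delete the rest) can be sketched as follows. The clean_redundant function and the two-column layout of user_sign are assumptions made for illustration; how the function is triggered (a periodic scheduler for method one, a call around each insert for method two) is left outside the sketch.

```python
import sqlite3

def clean_redundant(conn, keep=1):
    """Keep the `keep` highest-versioned rows per userid and delete the rest."""
    deleted = 0
    userids = [r[0] for r in conn.execute("SELECT DISTINCT userid FROM user_sign")]
    for userid in userids:
        # Sort in descending order of version number, as the text describes.
        versions = [
            r[0]
            for r in conn.execute(
                "SELECT version FROM user_sign WHERE userid = ? ORDER BY version DESC",
                (userid,),
            )
        ]
        for version in versions[keep:]:  # everything beyond the threshold
            conn.execute(
                "DELETE FROM user_sign WHERE userid = ? AND version = ?",
                (userid, version),
            )
            deleted += 1
    return deleted

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_sign (userid INTEGER, version INTEGER)")
conn.executemany(
    "INSERT INTO user_sign VALUES (?, ?)",
    [(12065, 1), (12065, 2), (12065, 3), (7, 1)],
)

deleted = clean_redundant(conn, keep=2)  # threshold of 2 records per userid
remaining = conn.execute(
    "SELECT userid, version FROM user_sign ORDER BY userid, version"
).fetchall()
```

With a threshold of 2, only version 1 of userid 12065 is removed; calling the function with keep=1 would instead leave just the record with the largest version number for each userid.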
The embodiment of the present application is particularly suitable for internet content services that have relaxed requirements regarding data redundancy and high requirements on concurrency. In terms of database type, the embodiment of the present application is particularly suitable for databases that store data as key-value pairs, such as the database of the business system that records user signatures.
In addition to the functions above, the database operating system of the embodiment of the present application may provide the following: when the data on the master database server is damaged, it can be recovered from the data on the slave database server; when the master database server goes down, the slave database server can take over its work; and the slave database server and the master database server can share the load to achieve load balancing. The more efficient the replication operation between the master database server and the slave database server, the more up to date the data on the slave database server, the better the consistency of the data on the two servers, and the more effective these functions are.
It should be noted that, although the data processing process has been described on the basis of the data processing system shown in fig. 1, fig. 2 or fig. 3, those skilled in the art should understand that the division into functional modules shown in fig. 1, fig. 2 and fig. 3 is only one possible division. Any data processing system that can implement the data processing process described in the embodiment of the present invention and has the functions of the data processing system of that embodiment falls within the scope of the present invention, regardless of how its functional modules are divided.
In summary, the embodiment of the present application uses data insertion operations in place of data update operations, so the data operation types include no update operation and no data is replaced or overwritten. As long as all of the insert statements are eventually synchronized to the slave database server, the data on the master database server and the slave database server is guaranteed to be consistent, and the order in which the statements are executed has no effect on that consistency. Database replication can therefore be executed in parallel by multiple threads, or even in a distributed manner, which greatly improves execution efficiency and eliminates blocking.
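The order-independence argument can be illustrated with a small simulation. This sketch does not use a real database: the Replica class is a hypothetical in-memory stand-in for the slave database server, and the insert log is shuffled on purpose before being applied by several threads, to show that the latest version read afterwards does not depend on execution order.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor

class Replica:
    """In-memory stand-in for the slave database; rows are (userid, version, signature)."""

    def __init__(self):
        self.rows = set()
        self._lock = threading.Lock()

    def apply_insert(self, row):
        # Inserts never overwrite each other, so applying them in any order is safe.
        with self._lock:
            self.rows.add(row)

    def latest(self, userid):
        with self._lock:
            candidates = [r for r in self.rows if r[0] == userid]
        return max(candidates, key=lambda r: r[1]) if candidates else None

def replicate(log, replica, workers=4):
    shuffled = list(log)
    random.shuffle(shuffled)  # deliberately destroy the original statement order
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(replica.apply_insert, shuffled))

# Fifty successive "updates" to one signature, recorded on the master as inserts.
master_log = [(12065, v, f"signature v{v}") for v in range(1, 51)]
replica = Replica()
replicate(master_log, replica)
result = replica.latest(12065)
```

If the same log were a sequence of UPDATE statements, applying it out of order could leave an older signature in place; with versioned inserts the highest version always identifies the current value.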
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (13)

1. A data processing method, characterized by comprising the steps of:
after a request for updating data records in a database is acquired, newly inserting data records with the same object identifier into the database according to the data records requested to be updated, and adding version information into the newly inserted data records, wherein the version information is used for indicating the time sequence of the insertion operation of the data records with the same object identifier;
performing a database copy operation in parallel with a plurality of threads when copying the database to another database; wherein,
newly inserting a data record with the same object identifier in the database according to the data record requested to be updated comprises the following steps:
inquiring all data records with the same object identification as the data record requested to be updated in the database, and determining the data record inserted last in the data records according to the version information in the data records;
and merging and generating a data record with the same object identifier according to the updated data content carried in the request and the data content in the last inserted data record, and inserting the generated data record into the database.
2. The data processing method of claim 1, further comprising: when the preset time or period is reached, inquiring the data records in the database, if a plurality of data records corresponding to the same object identifier exist, retaining the last inserted data record in the inquired data records according to the version information, and deleting the rest data records; or,
when the preset time or period is reached, inquiring the data records in the database, if a plurality of data records corresponding to the same object identifier exist, sorting the inquired data records according to the version information, and when the number of the inquired data records exceeds a set threshold, keeping the data records of the number specified by the threshold from the last inserted data record in the sorted data records, and deleting the rest data records.
3. The data processing method of claim 1, further comprising, before or after inserting a new data record:
inquiring all data records with the same object identification as the new data record, retaining the last inserted data record in the inquired data records according to the version information, and deleting the rest data records; or,
inquiring all data records with the same object identification as the new data record, and sequencing the inquired data records according to the version information; if the number of the inquired data records exceeds the set threshold, keeping the number of the data records specified by the threshold from the last inserted data record in the sorted data records, and deleting the rest data records.
4. The data processing method of claim 1, further comprising:
after a request for inquiring the data records is obtained, inquiring all the data records with the object identification according to the object identification of the data records required to be inquired; and returning the data record which is inserted last in the inquired data record as an inquiry result according to the version information in the data record.
5. The method of claim 1, further comprising:
when a request for deleting the data record in the database is acquired, a data record with the same object identifier is newly inserted into the database according to the data record requested to be deleted, version information is added into the newly inserted data record, and the data record is marked as deleted.
6. The method of claim 5, further comprising:
after a request for inquiring the data records is obtained, inquiring all the data records with the object identification according to the object identification of the data records required to be inquired; and returning a query result indicating that the data record requested to be queried is not searched when judging that the data record inserted last in the queried data record is marked as deleted according to the version information in the data record.
7. The data processing method of any of claims 1-6, wherein the version information is a sequence number or a timestamp;
if the version information is a sequence number, adding the version information in the newly inserted data record, specifically:
inquiring all data records with the same object identification as the data record requested to be updated in the database, and determining the data record inserted last in the data records according to the sequence number of the version information field in the data records;
and increasing the sequence number of the version information field of the last inserted data record, and taking the increased sequence number as the sequence number of the version information field of the generated data record.
8. A data processing system, comprising:
the acquisition module is used for acquiring a request for updating the data records in the database;
the updating processing module is used for newly inserting a data record with the same object identifier into the database according to the data record required to be updated after acquiring a request for updating the data record, and adding version information into the newly inserted data record, wherein the version information is used for indicating the time sequence of the inserting operation of the data record with the same object identifier;
the copy processing module is used for executing the database copy operation in parallel by utilizing a plurality of threads when the database is copied to another database; wherein,
when the update processing module newly inserts a data record with the same object identifier, it inquires in the database all data records with the same object identifier as the data record requested to be updated, and determines the data record inserted last in the data records according to version information in the data records; it then combines the updated data content carried in the request with the other data content in the last inserted data record to generate a data record, and inserts the generated data record into the database.
9. The data processing system of claim 8, further comprising:
a first clearing processing module, configured to query, before or after the update processing module inserts a new data record, all data records with the same object identifier as the new data record, retain the last inserted data record in the queried data records according to the version information, and delete the rest data records; or, to sort the queried data records according to the version information and, if the number of the queried data records exceeds a set threshold, retain the number of data records specified by the threshold, starting from the last inserted data record in the sorted data records, and delete the rest data records.
10. The data processing system of claim 8, further comprising:
the second clearing processing module is used for inquiring the data records in the database when the preset time or period is reached, if a plurality of data records corresponding to the same object identifier exist, the data records inserted last in the inquired data records are reserved according to the version information, and the rest data records are deleted; or, inquiring the data records with the same object identification, and sequencing the inquired data records according to the version information; if the number of the inquired data records exceeds the set threshold, keeping the number of the data records specified by the threshold from the last inserted data record in the sorted data records, and deleting the rest data records.
11. The data processing system of claim 8, wherein the obtaining module is further configured to obtain a request to query a data record;
the data processing system further comprises:
the query processing module is used for querying all data records with the object identification according to the object identification in the data records requested to be queried after acquiring a request for querying the data records; and then, returning the data record which is inserted last in the inquired data record as an inquiry result according to the version information in the data record.
12. The system of claim 8, wherein the obtaining module is further configured to obtain a request to delete a data record;
the data processing system further comprises:
and the deletion processing module is used for newly inserting a data record with the same object identifier into the database according to the data record required to be deleted, adding version information into the newly inserted data record and marking the data record as deleted after acquiring a request for deleting the data record in the database.
13. The system of claim 12, wherein the obtaining module is further configured to obtain a request to query a data record;
the data processing system further comprises:
the query processing module is used for querying all data records with the object identification according to the object identification of the data record requested to be queried after acquiring a request for querying the data record; and returning a query result indicating that the data record requested to be queried is not searched when judging that the data record inserted last in the queried data record is marked as deleted according to the version information in the data record.
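The query and delete behaviour recited in claims 4 to 7 and 11 to 13 can be sketched as follows. This is an illustrative reading rather than the claimed implementation: the deleted flag column and the put, delete and get helper names are assumptions, and the version information is taken to be an incrementing sequence number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_sign ("
    "userid INTEGER, signature TEXT, version INTEGER, deleted INTEGER DEFAULT 0)"
)

def next_version(conn, userid):
    """Increment the sequence number of the last inserted record (1 if none exists)."""
    row = conn.execute(
        "SELECT MAX(version) FROM user_sign WHERE userid = ?", (userid,)
    ).fetchone()
    return (row[0] or 0) + 1

def put(conn, userid, signature):
    conn.execute(
        "INSERT INTO user_sign (userid, signature, version) VALUES (?, ?, ?)",
        (userid, signature, next_version(conn, userid)),
    )

def delete(conn, userid):
    # A delete request inserts a new versioned record marked as deleted.
    conn.execute(
        "INSERT INTO user_sign (userid, signature, version, deleted) "
        "VALUES (?, NULL, ?, 1)",
        (userid, next_version(conn, userid)),
    )

def get(conn, userid):
    """Return the last inserted record's value, or None if absent or marked deleted."""
    row = conn.execute(
        "SELECT signature, deleted FROM user_sign "
        "WHERE userid = ? ORDER BY version DESC LIMIT 1",
        (userid,),
    ).fetchone()
    if row is None or row[1]:
        return None
    return row[0]

put(conn, 12065, "first")
put(conn, 12065, "second")
before = get(conn, 12065)   # latest live version
delete(conn, 12065)
after = get(conn, 12065)    # latest record is the deletion marker
row_count = conn.execute("SELECT COUNT(*) FROM user_sign").fetchone()[0]
```

The deletion marker is itself an insert, so delete requests replicate in the same order-independent way as updates; the earlier rows remain in the table until a cleaning pass removes them.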
CN200910260173.6A 2009-12-24 2009-12-24 A kind of data processing method and system thereof Active CN102110121B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN200910260173.6A CN102110121B (en) 2009-12-24 2009-12-24 A kind of data processing method and system thereof
HK11108936.7A HK1155236B (en) 2011-08-24 A data processing method and system thereof

Publications (2)

Publication Number Publication Date
CN102110121A CN102110121A (en) 2011-06-29
CN102110121B true CN102110121B (en) 2015-09-23

Family

ID=44174283

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200910260173.6A Active CN102110121B (en) 2009-12-24 2009-12-24 A kind of data processing method and system thereof

Country Status (1)

Country Link
CN (1) CN102110121B (en)

Also Published As

Publication number Publication date
HK1155236A1 (en) 2012-05-11
CN102110121A (en) 2011-06-29

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 1155236; Country of ref document: HK)
C14 Grant of patent or utility model
GR01 Patent grant
REG Reference to a national code (Ref country code: HK; Ref legal event code: GR; Ref document number: 1155236; Country of ref document: HK)