CN102110121B - A data processing method and system thereof
- Publication number: CN102110121B
- Application number: CN200910260173.6A
- Authority: CN (China)
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
This application discloses a data processing method and system. The method comprises: after a request to update a data record in a database is obtained, inserting into the database a new data record with the same object identifier according to the data record requested to be updated, and adding version information to the newly inserted data record, the version information indicating the chronological order of the update operations on data records with the same object identifier; and, when replicating the database to another database, performing the database replication operation in parallel using multiple threads. With this data processing method and system, the efficiency of database replication can be improved while data consistency is ensured to a certain extent.
Description
Technical Field
The present application relates to data processing technologies in the field of communications, and in particular, to a data processing method and system.
Background
Mainstream relational databases (e.g., MySQL, PostgreSQL, Oracle) provide master-slave replication. Suppose A is the master server and B is the slave server, and a database-based service system has deployed replication from database server A to database server B. When insert (Insert), delete (Delete), and update (Update) operations occur on database server A, these operations are synchronized to database server B; this is called replication (Replication).
In general, database-based services rely on four operations on data records: insert (Insert), delete (Delete), update (Update), and query (Select). Once database replication is deployed, the write operations (insert, delete, and update) are synchronized in sequence from the master database server to the slave database server.
Such synchronization, however, must be ordered. Assume that, in a business system that records user signatures, DB1 is the master database server and DB2 is the slave database server, and the following series of operations occurs on DB1:
Scenario 1: the user with ID 12065 sets his signature to "dog in leisure";
the corresponding SQL statement 1 is: update user_sign set nickname = "dog in leisure" where userid = 12065;
the corresponding database operation is: update the nickname of the user with ID 12065 to "dog in leisure";
Scenario 2: the user with ID 12065 sets his signature to "fruit in midsummer";
the corresponding SQL statement 2 is: update user_sign set nickname = "fruit in midsummer" where userid = 12065;
the corresponding database operation is: update the nickname of the user with ID 12065 to "fruit in midsummer";
Scenario 3: the user with ID 12065 sets his signature to "about in winter";
the corresponding SQL statement 3 is: update user_sign set nickname = "about in winter" where userid = 12065;
the corresponding database operation is: update the nickname of the user with ID 12065 to "about in winter".
When DB1 replicates to DB2, SQL statement 1, SQL statement 2, and SQL statement 3 must be executed on DB2 in that order for the data on DB1 and DB2 to be consistent. If the execution order on DB2 is SQL statement 1, SQL statement 3, SQL statement 2, then the nickname of user 12065 is "about in winter" on DB1 but "fruit in midsummer" on DB2, and the data in the master and slave database servers are inconsistent.
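The ordering problem described above can be reproduced with a small sketch. It uses Python's standard sqlite3 module with in-memory databases standing in for DB1 and DB2; the `replay` helper and the two-column table layout are assumptions made for illustration and are not part of the application.

```python
import sqlite3

def replay(statements):
    """Apply UPDATE statements, in the given order, to a fresh copy of the table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE user_sign (userid INTEGER PRIMARY KEY, nickname TEXT)")
    db.execute("INSERT INTO user_sign VALUES (12065, '')")
    for sql in statements:
        db.execute(sql)
    (nickname,) = db.execute(
        "SELECT nickname FROM user_sign WHERE userid = 12065"
    ).fetchone()
    db.close()
    return nickname

stmt1 = "UPDATE user_sign SET nickname = 'dog in leisure' WHERE userid = 12065"
stmt2 = "UPDATE user_sign SET nickname = 'fruit in midsummer' WHERE userid = 12065"
stmt3 = "UPDATE user_sign SET nickname = 'about in winter' WHERE userid = 12065"

master = replay([stmt1, stmt2, stmt3])  # order executed on DB1
slave = replay([stmt1, stmt3, stmt2])   # reordered execution on DB2
print(master, "|", slave)
```

Because each UPDATE overwrites the previous value, the final state depends on which statement runs last, which is why update-based replication has to preserve order.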
Because update operations must be executed in sequence during replication to keep the data in the master and slave database servers consistent, replication between the master and slave database servers is usually performed in a single thread, and execution efficiency is low.
Summary of the Application
The embodiments of the application provide a data processing method and system to solve the problem of low database replication efficiency in existing data processing technology.
The technical solution provided by the embodiments of the application is as follows:
A data processing method, comprising the following steps:
after a request to update a data record in a database is obtained, inserting into the database a new data record with the same object identifier according to the data record requested to be updated, and adding version information to the newly inserted data record, wherein the version information indicates the chronological order of the insertion operations of data records with the same object identifier;
when replicating the database to another database, performing the database replication operation in parallel using a plurality of threads; wherein
inserting into the database a new data record with the same object identifier according to the data record requested to be updated comprises:
querying the database for all data records having the same object identifier as the data record requested to be updated, and determining the most recently inserted of these data records according to the version information in the data records;
merging the updated data content carried in the request with the data content of the most recently inserted data record to generate a data record with the same object identifier, and inserting the generated data record into the database.
A data processing system, comprising:
an acquisition module, configured to acquire a request to update a data record in a database;
an update processing module, configured to, after the request to update the data record is acquired, insert into the database a new data record with the same object identifier according to the data record requested to be updated, and add version information to the newly inserted data record, wherein the version information indicates the chronological order of the insertion operations of data records with the same object identifier;
a copy processing module, configured to perform the database replication operation in parallel using a plurality of threads when the database is replicated to another database; wherein
when inserting the new data record with the same object identifier, the update processing module queries the database for all data records having the same object identifier as the data record requested to be updated and determines the most recently inserted of these data records according to the version information in the data records; it then merges the updated data content carried in the request with the other data content of the most recently inserted data record to generate a data record, and inserts the generated data record into the database.
In the embodiments of the application, when a request to update a data record in the database is acquired, a new data record with the same object identifier is inserted into the database and version information is added to it, so that the update operation is replaced by an insertion operation; when the database is replicated, the replication operation is performed in parallel by a plurality of threads. On the one hand, executing the replication operation in parallel with multiple threads improves its efficiency. On the other hand, because updating a data record is replaced by inserting a data record, the data consistency of the master and slave databases is not broken even if, as a result of multithreaded execution, the data records are not replicated strictly in the chronological order of their insertion operations.
Drawings
FIG. 1 is a block diagram of a data processing system according to an embodiment of the present application;
FIG. 2 is a second schematic diagram of a data processing system according to an embodiment of the present application;
FIG. 3 is a third schematic structural diagram of a data processing system according to an embodiment of the present application.
Detailed Description
Aiming at the problems in the prior art, the embodiment of the application replaces the data updating operation with the data inserting operation and performs the database copying operation in parallel by using multiple threads, so that the database copying efficiency is improved on the premise of ensuring the data consistency of the master database and the slave database.
The data processing system of the embodiments of the application can operate on the data records in a database, including inserting (Insert), deleting (Delete), and querying (Select) data records, replicating (Replication) data from the master database to the slave database, and so on. The database tables in the master and slave databases include a field that records the version information of each data record. A data record in a database table typically includes an identification field that describes the unique identity of an object, such as a user ID; the other fields describe attributes of the object, such as a user name or age, and may also describe attributes of the data record itself, such as its version information.
The data processing system of the embodiments of the application can acquire the various data processing requests that a user inputs through the human-computer interaction interface provided by the system, such as a request to create, update, delete, or query a data record, or a request to replicate a database:
when the system acquires a request to create a data record, it generates a statement for inserting a data record, inserts the data record into the database by executing the statement, and adds version information to the newly inserted data record;
when the system acquires a request to update a data record, it generates a statement for inserting a data record and inserts a data record into the database by executing the statement, wherein the newly inserted data record carries version information and has the same object identifier as the data record requested to be updated;
when the system acquires a request to query a data record, it generates a query statement according to the identifier of the queried object and, by executing the statement, returns the content of the most recently inserted data record among all data records with that object identifier;
when the system performs database replication according to a deployed replication policy (such as a replication cycle) or an acquired replication request, it can execute the replication operation in parallel with multiple threads, and the parallel execution can further be distributed across multiple servers.
The data processing system of the embodiments of the application may further provide a data record deletion function. When the system acquires a request to delete a data record, it generates a statement for inserting a data record and, by executing the statement, inserts into the database a data record with the same object identifier as the data record requested to be deleted, adds to it version information indicating its position in the insertion order, and marks it as deleted. Correspondingly, when the system acquires a query request, it first finds the most recently inserted data record for the identifier of the queried object and then checks whether that data record is marked as deleted; if so, a "not found" result is returned, and otherwise the content of that data record is returned as the query result.
The version information described above may be generated by the data processing system when a data record is inserted. It indicates the chronological order of the insertion operations of data records with the same object identifier. Because the data processing system of the embodiments of the application converts each update of a data record into an insertion, if a user subsequently requests several updates to the data record of one object identifier, a corresponding number of new data records are inserted into the database table. To return an accurate result when the database is queried (that is, among the multiple data records of one object identifier, only the most recently inserted one is returned), each data record must carry version information indicating when it was inserted relative to the others. The version information may be a monotonically increasing sequence number, a timestamp, or any other character combination that can express chronological order.
Because the data processing system of the embodiments of the application replaces both the update and the deletion of data records with insertion operations, and each data record contains version information, the database does not have to be replicated strictly in the chronological order in which the data operations were executed. Replication can therefore be performed concurrently by multiple threads and distributed across multiple servers, which improves the efficiency of database replication while ensuring data consistency to a certain extent.
Embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to FIG. 1, which is a schematic structural diagram of a data processing system according to an embodiment of the present application, the system may be implemented in software. The data processing system includes an acquisition module 101, an update processing module 103, and a copy processing module 106, and may further include an insertion processing module 102, a deletion processing module 104, and a query processing module 105.
The main functions of each functional module in the data processing system comprise:
the acquisition module 101: can acquire the various data processing requests that a user inputs through the human-computer interaction interface provided by the data processing system, such as a request to create, update, delete, or query a data record, or a request to replicate a database, and can trigger the corresponding data processing functional module to perform the data processing operation according to the received request;
the insertion processing module 102: a statement for inserting a data record can be generated according to a request for creating a data record acquired by the acquisition module 101, and a data record is inserted into the database by executing the statement, wherein version information is added to the newly inserted data record;
the update processing module 103: a statement for inserting a data record may be generated according to a request for updating the data record acquired by the acquisition module 101, and a data record is inserted into the database by executing the statement, where version information is added to the newly inserted data record and the newly inserted data record has the same object identifier as the data record requested to be updated;
the deletion processing module 104: a statement for inserting a data record can be generated according to a request for deleting the data record acquired by the acquisition module 101, a new data record is inserted by executing the statement, and the data record is marked as deleted;
the query processing module 105: a statement for querying the data record can be generated according to the request for querying the data record acquired by the acquisition module 101, and a query result is returned by executing the statement;
the copy processing module 106: the database replication operation may be performed according to a request for replicating the database acquired by the acquisition module 101 or according to a deployed database replication policy (e.g., a replication cycle), and in the replication process, the database replication operation may be performed in a multi-server distributed manner and in a multi-thread parallel manner.
Various data processing flows of the embodiments of the present application will be described in detail below by taking as an example a database table structure of a data processing system (which may also be referred to as a business system) for recording a user signature shown in table 1. The business system for recording the user signature has the acquisition module 101, the insertion processing module 102, the update processing module 103, the query processing module 105 and the copy processing module 106 as described above, and may further include the deletion processing module 104.
Table 1: structure of the database table user_sign for recording user signatures
userid | signature | revision |
In Table 1, userid, signature, and revision are the field names of the database table, and the meaning of each field is as follows:
userid: the user identifier (ID);
signature: the signature content (also called the nickname);
revision: version information in the form of a sequence number (hereinafter referred to as the version number).
When a user requests registration, the user may input registration information such as a user name, a password, and a user signature (for example, "dog in leisure") through the human-computer interface provided by the system and submit a registration command. The acquisition module 101 receives the registration information and the registration command and passes them to the insertion processing module 102, which assigns a user ID (for example, 12065) to the user and then generates a data record insertion statement:
INSERT INTO user_sign
(userid, signature, revision) VALUES (12065, "dog in leisure", 1)
After the statement is executed, a new data record is inserted into the user_sign database table; the table after insertion is shown in Table 2. Because the data record is newly created, the system assigns it version number 1.
Table 2: database table user_sign containing the initial user signature of userid = 12065
userid | signature | revision |
12065 | dog in leisure | 1 |
When the user wants to update his user signature, he can log in to the system, input the updated user signature "fruit in midsummer" through the human-computer interface provided by the system, and then select and submit the update command. The acquisition module 101 receives the update command and the updated user signature and passes them to the update processing module 103, which obtains the user ID of the logged-in user and then generates a data record insertion statement from the user ID and the updated user signature:
INSERT INTO user_sign
(userid, signature, revision) VALUES (12065, "fruit in midsummer", 2)
After the statement is executed, a new data record is inserted into the user_sign database table; the table after insertion is shown in Table 3. Before generating the insertion statement, the update processing module 103 may query all data records with user ID 12065, find the maximum version number among them, and increment it to obtain the version number "2" of the data record to be inserted. If the version information is instead represented by a timestamp, it suffices to write the current time directly into the insertion statement as the version information.
Table 3: database table user_sign after the user with userid = 12065 updates the user signature for the first time
userid | signature | revision |
12065 | dog in leisure | 1 |
12065 | fruit in midsummer | 2 |
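The version-number step (query the maximum version number for the user, then increment it) might look like the following sketch. It uses Python's sqlite3 module; the helper name `insert_new_version` is hypothetical, and in a system with concurrent writers the read and the insert would presumably need to run inside one transaction, which the application does not discuss.

```python
import sqlite3

def insert_new_version(db, userid, signature):
    """Insert a new row for userid instead of updating; return its revision."""
    (max_rev,) = db.execute(
        "SELECT MAX(revision) FROM user_sign WHERE userid = ?", (userid,)
    ).fetchone()
    # MAX() yields NULL (None) when the user has no rows yet.
    revision = 1 if max_rev is None else max_rev + 1
    db.execute(
        "INSERT INTO user_sign (userid, signature, revision) VALUES (?, ?, ?)",
        (userid, signature, revision),
    )
    return revision

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE user_sign (userid INTEGER, signature TEXT, revision INTEGER)"
)
first = insert_new_version(db, 12065, "dog in leisure")
second = insert_new_version(db, 12065, "fruit in midsummer")
rows = db.execute(
    "SELECT userid, signature, revision FROM user_sign ORDER BY revision"
).fetchall()
```

After the two calls, `rows` holds the same two records as Table 3.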
When the user wants to update his user signature again, he can log in to the system, input the updated user signature "about in winter" through the human-computer interface provided by the system, and then select and submit the update command. The acquisition module 101 receives the update command and the updated user signature and passes them to the update processing module 103, which generates the following data record insertion statement:
INSERT INTO user_sign
(userid, signature, revision) VALUES (12065, "about in winter", 3)
After the statement is executed, a new data record is inserted into the user_sign database table; the table after insertion is shown in Table 4:
Table 4: database table user_sign after the user with userid = 12065 updates the user signature for the second time
userid | signature | revision |
12065 | dog in leisure | 1 |
12065 | fruit in midsummer | 2 |
12065 | about in winter | 3 |
The query of a user signature is typically performed by a system administrator. When a system administrator inputs the userid 12065 through the management interface provided by the system and submits a query command, the acquisition module 101 receives the command and the user ID and passes them to the query processing module 105, which generates the following query statement:
SELECT signature FROM user_sign
WHERE userid=12065
ORDER BY revision DESC LIMIT 1
When the statement is executed, the signatures of the user with userid = 12065 are looked up in the user_sign database table shown in Table 4 and sorted in descending order of version number, and only the first data record is returned. Since "about in winter" has the largest version number among all data records with userid = 12065, the query result is "about in winter".
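As a minimal sketch of this query, the same SELECT can be run against an in-memory SQLite table. The rows are deliberately inserted out of version order to show that the result depends only on the revision field; the function name `latest_signature` is an assumption for illustration.

```python
import sqlite3

def latest_signature(db, userid):
    """Return the signature with the highest revision, or None if no row exists."""
    row = db.execute(
        "SELECT signature FROM user_sign WHERE userid = ? "
        "ORDER BY revision DESC LIMIT 1",
        (userid,),
    ).fetchone()
    return None if row is None else row[0]

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE user_sign (userid INTEGER, signature TEXT, revision INTEGER)"
)
# Physical insertion order differs from version order on purpose.
db.executemany(
    "INSERT INTO user_sign VALUES (?, ?, ?)",
    [
        (12065, "about in winter", 3),
        (12065, "dog in leisure", 1),
        (12065, "fruit in midsummer", 2),
    ],
)
result = latest_signature(db, 12065)
missing = latest_signature(db, 99999)  # a userid with no records
```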
If the data processing system also provides a function for deleting data records, the delete operation must likewise be converted into an insertion: a new record is inserted and marked as deleted. Deleting data records from the user_sign database table is typically performed by a system administrator. If the current user_sign table is as shown in Table 4, then when a system administrator inputs or selects the user signature "about in winter" through the management interface provided by the system and submits a delete command, the acquisition module 101 receives the command and the user ID and passes them to the deletion processing module 104. The deletion processing module 104 determines that the user ID for "about in winter" is 12065 and generates the following data record insertion statement:
INSERT INTO user_sign
(userid, signature, deleted, revision) VALUES (12065, "about in winter", 1, 4)
Here 4 is the version number assigned to the newly inserted data record, and deleted is a deletion flag field. Its value of 1 indicates that the inserted data record was generated from a deletion request, that is, the user signature data record of user ID 12065 has been deleted.
After the statement is executed, a new data record is inserted into the user_sign database table; the table after insertion is shown in Table 5:
Table 5: database table user_sign after the user signature record of the user with userid = 12065 is deleted
userid | signature | revision | deleted |
12065 | dog in leisure | 1 | |
12065 | fruit in midsummer | 2 | |
12065 | about in winter | 3 | |
12065 | about in winter | 4 | 1 |
For a data processing system with a deletion function, a query first finds the most recently inserted data record that matches the query condition and then checks whether that data record is marked as deleted. If it is not, the query result is generated from that data record; otherwise, a result indicating that the requested data was not found is returned. For example, for the database table user_sign shown in Table 5, when a system administrator inputs the user signature "about in winter" through the management interface provided by the system and submits a query command, the acquisition module 101 receives the command and the user ID and passes them to the query processing module 105. Through the generated query statement, the query processing module 105 determines that "about in winter" belongs to userid = 12065 and that the most recently inserted data record with userid = 12065 is the one with version number 4; because the deleted field of that record is 1, it returns "not found".
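The delete-as-insert flow and the matching query can be sketched as follows. The helper names `mark_deleted` and `lookup` are hypothetical, and the sketch stores 0 in the deleted field for ordinary rows where Table 5 shows an empty cell; both are assumptions made so the example runs on SQLite.

```python
import sqlite3

LATEST = (
    "SELECT signature, revision, deleted FROM user_sign "
    "WHERE userid = ? ORDER BY revision DESC LIMIT 1"
)

def mark_deleted(db, userid):
    """Record a deletion by inserting a flagged row with the next revision."""
    row = db.execute(LATEST, (userid,)).fetchone()
    if row is None:
        return  # nothing to delete
    signature, revision, _ = row
    db.execute(
        "INSERT INTO user_sign (userid, signature, revision, deleted) "
        "VALUES (?, ?, ?, 1)",
        (userid, signature, revision + 1),
    )

def lookup(db, userid):
    """Return the latest signature, or None if absent or marked as deleted."""
    row = db.execute(LATEST, (userid,)).fetchone()
    if row is None or row[2] == 1:
        return None
    return row[0]

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE user_sign (userid INTEGER, signature TEXT, "
    "revision INTEGER, deleted INTEGER DEFAULT 0)"
)
db.executemany(
    "INSERT INTO user_sign (userid, signature, revision) VALUES (?, ?, ?)",
    [
        (12065, "dog in leisure", 1),
        (12065, "fruit in midsummer", 2),
        (12065, "about in winter", 3),
    ],
)
before = lookup(db, 12065)
mark_deleted(db, 12065)
after = lookup(db, 12065)
(row_count,) = db.execute("SELECT COUNT(*) FROM user_sign").fetchone()
```

No existing row is modified or removed: the table grows to four rows, as in Table 5, and only the interpretation of the latest row changes.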
When a master-slave database replication policy is deployed, the copy processing module 106 may perform database replication according to the deployed policy (for example, the execution cycle of the replication operation). During replication, the data replication operations can be performed in parallel by multiple threads and distributed across multiple servers. In the example above, the two data records with version numbers 2 and 3 may be synchronized to the slave database server in any order, and the two corresponding insertion operation records may likewise be synchronized to the slave database server in any order; in the end, the data in the master and slave database servers are consistent.
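A toy version of order-independent replication is sketched below. Two in-memory SQLite databases stand in for the master and slave servers, and worker threads replay the rows in shuffled order. Sharing one slave connection behind a lock is a simplification chosen for this example (a real deployment would presumably give each thread or server its own connection), and the names `replicate` and `latest` are hypothetical.

```python
import random
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

SCHEMA = "CREATE TABLE user_sign (userid INTEGER, signature TEXT, revision INTEGER)"

def latest(db, userid):
    """Return the signature with the highest revision for userid."""
    row = db.execute(
        "SELECT signature FROM user_sign WHERE userid = ? "
        "ORDER BY revision DESC LIMIT 1",
        (userid,),
    ).fetchone()
    return None if row is None else row[0]

rows = [
    (12065, "dog in leisure", 1),
    (12065, "fruit in midsummer", 2),
    (12065, "about in winter", 3),
]
master = sqlite3.connect(":memory:")
master.execute(SCHEMA)
master.executemany("INSERT INTO user_sign VALUES (?, ?, ?)", rows)

# One slave connection shared by all workers, so writes are guarded by a lock.
slave = sqlite3.connect(":memory:", check_same_thread=False)
slave.execute(SCHEMA)
slave_lock = threading.Lock()

def replicate(row):
    """Replay one insertion on the slave; the call order is not controlled."""
    with slave_lock:
        slave.execute("INSERT INTO user_sign VALUES (?, ?, ?)", row)

pending = master.execute(
    "SELECT userid, signature, revision FROM user_sign"
).fetchall()
random.shuffle(pending)  # simulate arbitrary replication order
with ThreadPoolExecutor(max_workers=3) as pool:
    list(pool.map(replicate, pending))
slave.commit()
```

Whatever order the workers run in, the slave ends up with the same set of rows, and a latest-version query returns the same signature on both sides.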
Apart from the user ID and version information fields, the user_sign database table above has only one field, signature; in practical applications a database table may contain more fields. Table 6 gives an example of a user_sign database table with an additional age field:
Table 6: user_sign table with more fields
userid | signature | age | revision |
12065 | dog in leisure | 18 | 1 |
12065 | fruit in midsummer | 18 | 2 |
For the database table shown in Table 6, the insertion of a data record when a user registers, the deletion of a user record, and database replication follow the same processes and principles as the corresponding flows described above. When a user requests to update the data record, the process is as follows:
When the user wants to update his user signature, he can log in to the system, input the updated user signature "about in winter" through the human-computer interface provided by the system, and then select and submit the update command. The acquisition module 101 receives the update command and the updated user signature and passes them to the update processing module 103. The update processing module 103 obtains the user ID of the logged-in user, 12065, and queries the user_sign table for the data records with userid = 12065. It then uses the updated user signature "about in winter", together with the age field value "18" taken from the latest queried data record, as the field values of the new data record, and generates the following data record insertion statement:
INSERT INTO user_sign
(userid, signature, age, revision) VALUES (12065, "about in winter", 18, 3)
After the statement is executed, a new data record is inserted into the user_sign database table; the table after insertion is shown in Table 7:
Table 7: user_sign table after the user signature is updated
userid | signature | age | revision |
12065 | dog in leisure | 18 | 1 |
12065 | fruit in midsummer | 18 | 2 |
12065 | about in winter | 18 | 3 |
It can be seen that, when the INSERT statement is generated, each field is filled as follows: if the update request supplies a value for the field, that value is used; otherwise, the value of the corresponding field in the latest version of the queried data record is used.
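The field-by-field rule can be illustrated with plain dictionaries standing in for data records. The function `merge_record` is a hypothetical helper written for this sketch; the application itself only describes the rule, not an implementation.

```python
def merge_record(latest, changes, revision):
    """Build the row to insert: requested values win, other fields carry over."""
    merged = dict(latest)    # start from the latest stored version
    merged.update(changes)   # fields present in the update request take priority
    merged["revision"] = revision
    return merged

latest = {"userid": 12065, "signature": "fruit in midsummer", "age": 18, "revision": 2}
new_row = merge_record(
    latest, {"signature": "about in winter"}, latest["revision"] + 1
)
print(new_row)
```

Here the request carries only a new signature, so the age value 18 is copied from the latest version, which matches the third row of Table 7.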
Because the data processing system of the embodiments of the application replaces the update operation with an insertion operation, redundant data records may accumulate in the database table for the same object identifier. For example, in the user_sign database table shown in Table 7, multiple versions of the user signature are stored for userid = 12065. To preserve database capacity and performance, the embodiments of the application may keep only a certain number of data records for each object identifier and delete the other redundant data records. Redundant data records can be deleted in various ways, for example:
Method 1: cleaning data at regular intervals
A cleaning program may be written in advance and then executed periodically. The cleaning program can traverse the database table and delete all or part of the redundant data records in it. For example, for the above database table named user_sign, each time the cleaning program is executed it traverses the data records in the table and finds that multiple data records with userid 12065 exist. According to the preset configuration, it may retain only the data record with the largest version number and delete the data records with other version numbers; alternatively, when the threshold for the number of data records is 2, it sorts the data records in descending order of version number, retains the 2 data records with the largest version numbers, and deletes the rest.
Method 2: cleaning data each time a data record is inserted
Before or after each insertion of a data record, the database table may be traversed and the redundant data records in it deleted, in whole or in part. For example, for the above database table named user_sign, each time a data record is inserted because a user requests to update a user signature, the data records in the table are traversed and multiple data records with userid 12065 are found. According to the preset configuration, only the data record with the largest version number may be retained and the data records with other version numbers deleted; alternatively, when the threshold for the number of data records is 2, the data records are sorted in descending order of version number, the 2 data records with the largest version numbers are retained, and the rest are deleted.
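Both methods apply the same retention rule and differ only in when it is triggered. The rule itself can be sketched as below; this is an illustrative sketch, again assuming SQLite through Python's sqlite3 module. The function name purge_old_versions, the keep parameter, and the sample row for userid 20001 are hypothetical and introduced only for the example.

```python
import sqlite3


def purge_old_versions(conn, keep=1):
    """Keep the `keep` largest revisions per userid and delete the rest.

    Returns the number of deleted rows. (Hypothetical helper; `keep`
    plays the role of the data record number threshold.)
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    deleted = 0
    userids = [r[0] for r in
               conn.execute("SELECT DISTINCT userid FROM user_sign")]
    for uid in userids:
        # Revisions sorted in descending order: newest first.
        revisions = [r[0] for r in conn.execute(
            "SELECT revision FROM user_sign WHERE userid = ? "
            "ORDER BY revision DESC", (uid,))]
        for rev in revisions[keep:]:
            conn.execute(
                "DELETE FROM user_sign WHERE userid = ? AND revision = ?",
                (uid, rev),
            )
            deleted += 1
    return deleted


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_sign "
    "(userid INTEGER, signature TEXT, age INTEGER, revision INTEGER)"
)
conn.executemany(
    "INSERT INTO user_sign VALUES (?, ?, ?, ?)",
    [(12065, "Dog in love of leisure", 18, 1),
     (12065, "Fruit of midsummer", 18, 2),
     (12065, "about winter", 18, 3),
     (20001, "made-up sample row", 30, 1)],  # hypothetical second user
)

# Threshold of 2: only revision 1 of userid 12065 is redundant.
deleted_count = purge_old_versions(conn, keep=2)
remaining = conn.execute(
    "SELECT userid, revision FROM user_sign ORDER BY userid, revision"
).fetchall()
```

Under Method 1 such a function would be called by a scheduler at a preset time or period; under Method 2 it would be called before or after each insertion, typically restricted to the object identifier just written.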
The cleaning operation can be implemented by corresponding functional modules, such as adding a first cleaning processing module 107 in the data processing system of the embodiment of the present application, as shown in fig. 2, or adding a second cleaning processing module 108, as shown in fig. 3. The first cleaning processing module 107 can implement a function of cleaning data at regular time, and the second cleaning processing module 108 can clean data each time a data record is inserted.
Comparing the two methods, cleaning on every insertion increases the coupling of the insertion operation and slows its response. In practical applications, therefore, the data may be cleaned at regular intervals instead, to obtain better reliability and a shorter response time.
The embodiment of the present application is particularly suitable for internet content services that are not sensitive to data redundancy but demand high concurrency. In terms of database type, the embodiment of the present application is particularly suitable for a database that stores data as key-value pairs, such as the database of the business system for recording user signatures.
In addition to the above functions, the database operating system of the embodiment of the present application may provide the following: when the data on the master database server is damaged, it can be recovered from the data on the slave database server; when the master database server goes down, the slave database server can take over its work; and the slave database server and the master database server can share the load to achieve load balancing. The more efficient the replication between the master database server and the slave database server, the more current the data on the slave database server, the more consistent the data on the two servers, and the better these functions work.
It should be noted that, although the data processing process is described on the basis of the data processing system shown in fig. 1, fig. 2 or fig. 3, those skilled in the art should understand that the division into functional modules shown in fig. 1, fig. 2 and fig. 3 is only one possible division. Any data processing system that can implement the data processing process described in the embodiment of the present invention and has the functions of the data processing system in the embodiment of the present invention, however its functional modules are divided, should be within the scope of the present invention.
In summary, the embodiment of the present application uses data insertion operations in place of data update operations, so the set of data operation types contains no update operation and no data is ever replaced or overwritten. As long as all of the insertion statements are eventually synchronized to the slave database server, the data in the master database server and the slave database server is guaranteed to be consistent, and the order in which the statements are executed has no influence on that consistency. Database replication can therefore be executed in parallel by multiple threads, or even in a distributed manner, which greatly improves execution efficiency and eliminates blocking.
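The order-independence argument can be checked with a small sketch. It assumes two in-memory SQLite databases (Python's sqlite3 module) stand in for the slave database server under two different replay orders; the names replay and latest_signature are hypothetical, and a reversed replay order is used in place of real multi-threaded replication to keep the example deterministic.

```python
import sqlite3

# The insertion statements produced on the master, in commit order.
STATEMENTS = [
    (12065, "Dog in love of leisure", 18, 1),
    (12065, "Fruit of midsummer", 18, 2),
    (12065, "about winter", 18, 3),
]


def replay(rows):
    """Apply the insertions to a fresh database in the given order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_sign "
        "(userid INTEGER, signature TEXT, age INTEGER, revision INTEGER)"
    )
    for row in rows:
        conn.execute("INSERT INTO user_sign VALUES (?, ?, ?, ?)", row)
    return conn


def latest_signature(conn, userid):
    """Return the signature carried by the largest revision, or None."""
    row = conn.execute(
        "SELECT signature FROM user_sign WHERE userid = ? "
        "ORDER BY revision DESC LIMIT 1",
        (userid,),
    ).fetchone()
    return row[0] if row else None


in_order = replay(STATEMENTS)
out_of_order = replay(reversed(STATEMENTS))

# Same set of rows regardless of the order in which they arrived.
same_rows = (
    sorted(in_order.execute("SELECT * FROM user_sign"))
    == sorted(out_of_order.execute("SELECT * FROM user_sign"))
)
latest = latest_signature(out_of_order, 12065)
```

Because the current value is selected by version information rather than by arrival order, both replay orders yield the same table contents and the same latest signature. With in-place UPDATE statements, by contrast, replaying in a different order could leave an older value in place.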
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (13)
1. A data processing method, characterized by comprising the steps of:
after a request for updating data records in a database is acquired, newly inserting data records with the same object identifier into the database according to the data records requested to be updated, and adding version information into the newly inserted data records, wherein the version information is used for indicating the time sequence of the insertion operation of the data records with the same object identifier;
performing a database copy operation in parallel with a plurality of threads when copying the database to another database; wherein,
newly inserting a data record with the same object identifier into the database according to the data record requested to be updated comprises the following steps:
inquiring all data records with the same object identification as the data record requested to be updated in the database, and determining the data record inserted last in the data records according to the version information in the data records;
and merging and generating a data record with the same object identifier according to the updated data content carried in the request and the data content in the last inserted data record, and inserting the generated data record into the database.
2. The data processing method of claim 1, further comprising: when the preset time or period is reached, inquiring the data records in the database, if a plurality of data records corresponding to the same object identifier exist, retaining the last inserted data record in the inquired data records according to the version information, and deleting the rest data records; or,
when the preset time or period is reached, inquiring the data records in the database, if a plurality of data records corresponding to the same object identifier exist, sorting the inquired data records according to the version information, and when the number of the inquired data records exceeds a set threshold, keeping the data records of the number specified by the threshold from the last inserted data record in the sorted data records, and deleting the rest data records.
3. The data processing method of claim 1, further comprising, before or after inserting a new data record:
inquiring all data records with the same object identification as the new data record, retaining the last inserted data record among the inquired data records according to the version information, and deleting the rest data records; or,
inquiring all data records with the same object identification as the new data record, and sequencing the inquired data records according to the version information; if the number of the inquired data records exceeds the set threshold, keeping the number of the data records specified by the threshold from the last inserted data record in the sorted data records, and deleting the rest data records.
4. The data processing method of claim 1, further comprising:
after a request for inquiring the data records is obtained, inquiring all the data records with the object identification according to the object identification of the data records required to be inquired; and returning the data record which is inserted last in the inquired data record as an inquiry result according to the version information in the data record.
5. The method of claim 1, further comprising:
when a request for deleting the data record in the database is acquired, a data record with the same object identifier is newly inserted into the database according to the data record requested to be deleted, version information is added into the newly inserted data record, and the data record is marked as deleted.
6. The method of claim 5, further comprising:
after a request for inquiring the data records is obtained, inquiring all the data records with the object identification according to the object identification of the data records required to be inquired; and when it is judged, according to the version information in the data records, that the data record inserted last among the inquired data records is marked as deleted, returning a query result indicating that the data record requested to be queried was not found.
7. The data processing method of any of claims 1-6, wherein the version information is a sequence number or a timestamp;
if the version information is a serial number, adding the version information in the newly inserted data record, specifically:
inquiring all data records with the same object identification as the data record requested to be updated in the database, and determining the data record inserted last in the data records according to the serial number of the version information field in the data records;
and increasing the sequence number of the version information field of the last inserted data record, and taking the increased sequence number as the sequence number of the version information field of the generated data record.
8. A data processing system, comprising:
the acquisition module is used for acquiring a request for updating the data records in the database;
the updating processing module is used for newly inserting a data record with the same object identifier into the database according to the data record required to be updated after acquiring a request for updating the data record, and adding version information into the newly inserted data record, wherein the version information is used for indicating the time sequence of the inserting operation of the data record with the same object identifier;
the copy processing module is used for executing the database copy operation in parallel by utilizing a plurality of threads when the database is copied to another database; wherein,
when newly inserting a data record with the same object identifier, the update processing module queries the database for all data records with the same object identifier as the data record requested to be updated, and determines the data record inserted last among them according to the version information in the data records; it then merges the updated data content carried in the request with the other data content in the last inserted data record to generate a data record, and inserts the generated data record into the database.
9. The data processing system of claim 8, further comprising:
a first clearing processing module, configured to query all data records with the same object identifier as the new data record before or after the update processing module inserts the new data record, retain the last inserted data record among the queried data records according to the version information, and delete the rest data records; or, to sort the queried data records according to the version information and, if the number of the queried data records exceeds a set threshold, retain the number of data records specified by the threshold, counting from the last inserted data record in the sorted data records, and delete the rest data records.
10. The data processing system of claim 8, further comprising:
the second clearing processing module is used for inquiring the data records in the database when the preset time or period is reached, if a plurality of data records corresponding to the same object identifier exist, the data records inserted last in the inquired data records are reserved according to the version information, and the rest data records are deleted; or, inquiring the data records with the same object identification, and sequencing the inquired data records according to the version information; if the number of the inquired data records exceeds the set threshold, keeping the number of the data records specified by the threshold from the last inserted data record in the sorted data records, and deleting the rest data records.
11. The data processing system of claim 8, wherein the obtaining module is further configured to obtain a request to query a data record;
the data processing system further comprises:
the query processing module is used for querying all data records with the object identification according to the object identification in the data records requested to be queried after acquiring a request for querying the data records; and then, returning the data record which is inserted last in the inquired data record as an inquiry result according to the version information in the data record.
12. The system of claim 8, wherein the obtaining module is further configured to obtain a request to delete a data record;
the data processing system further comprises:
and the deletion processing module is used for newly inserting a data record with the same object identifier into the database according to the data record required to be deleted, adding version information into the newly inserted data record and marking the data record as deleted after acquiring a request for deleting the data record in the database.
13. The system of claim 12, wherein the obtaining module is further configured to obtain a request to query a data record;
the data processing system further comprises:
the query processing module is used for querying all data records with the object identification according to the object identification of the data record requested to be queried after acquiring a request for querying the data record; and, when it is judged according to the version information in the data records that the data record inserted last among the queried data records is marked as deleted, returning a query result indicating that the data record requested to be queried was not found.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200910260173.6A CN102110121B (en) | 2009-12-24 | 2009-12-24 | A kind of data processing method and system thereof |
HK11108936.7A HK1155236B (en) | 2011-08-24 | A data processing method and system thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200910260173.6A CN102110121B (en) | 2009-12-24 | 2009-12-24 | A kind of data processing method and system thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102110121A CN102110121A (en) | 2011-06-29 |
CN102110121B CN102110121B (en) | 2015-09-23 |
Family
ID=44174283
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200910260173.6A Active CN102110121B (en) | 2009-12-24 | 2009-12-24 | A kind of data processing method and system thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102110121B (en) |
Families Citing this family (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102231161A (en) * | 2011-06-30 | 2011-11-02 | 北京新媒传信科技有限公司 | Method for synchronously verifying and monitoring databases |
CN103020058B (en) * | 2011-09-21 | 2016-07-06 | 阿里巴巴集团控股有限公司 | A kind of multi-version data acquisition method and device |
CN103092840B (en) * | 2011-10-28 | 2015-09-16 | 上海邮电设计咨询研究院有限公司 | Multi-source is from increasing massive data files real-time collecting method |
CN102663045B (en) * | 2012-03-29 | 2013-11-06 | 苏州阔地网络科技有限公司 | Method and system for processing data information |
CN103390041B (en) * | 2013-07-18 | 2016-05-04 | 杭州东信北邮信息技术有限公司 | A kind of method and system that data, services is provided based on middleware |
CN103455677B (en) * | 2013-09-04 | 2017-06-09 | 广东电网公司电力调度控制中心 | Environmental simulation method and system |
CN103744906A (en) * | 2013-12-26 | 2014-04-23 | 乐视网信息技术(北京)股份有限公司 | System, method and device for data synchronization |
CN104239476B (en) * | 2014-09-04 | 2018-09-25 | 上海天脉聚源文化传媒有限公司 | A kind of method, apparatus and system of database synchronization |
CN105893393B (en) * | 2015-01-26 | 2019-11-05 | 阿里巴巴集团控股有限公司 | Data save method and device |
US9996563B2 (en) | 2015-03-23 | 2018-06-12 | International Business Machines Corporation | Efficient full delete operations |
CN104699541B (en) * | 2015-03-30 | 2018-07-10 | 北京奇虎科技有限公司 | Method, apparatus, data transfer components and the system of synchrodata |
CN106156070B (en) * | 2015-03-31 | 2019-07-12 | 华为技术有限公司 | A kind of querying method, file mergences method and relevant apparatus |
WO2017181430A1 (en) * | 2016-04-22 | 2017-10-26 | 华为技术有限公司 | Method and device for duplicating database in distributed system |
CN107733957B (en) * | 2016-08-12 | 2020-10-16 | 北京融聚世界网络科技有限公司 | Distributed service configuration system and version number distribution method |
CN106326425B (en) * | 2016-08-24 | 2019-11-05 | 明算科技(北京)股份有限公司 | Data classification treating method and apparatus |
CN107861959A (en) * | 2016-09-22 | 2018-03-30 | 阿里巴巴集团控股有限公司 | Data processing method, apparatus and system |
CN108073596B (en) * | 2016-11-10 | 2020-08-14 | 北京国双科技有限公司 | Data deletion method and device for OLAP database |
CN108804442B (en) * | 2017-04-27 | 2022-06-07 | 北京京东尚科信息技术有限公司 | Serial number generation method and device |
CN110583004B (en) * | 2017-05-02 | 2022-08-30 | 国际商业机器公司 | Method, processing system and storage medium for server to provide data values |
US10540282B2 (en) | 2017-05-02 | 2020-01-21 | International Business Machines Corporation | Asynchronous data store operations including selectively returning a value from cache or a value determined by an asynchronous computation |
CN110069487A (en) * | 2017-09-28 | 2019-07-30 | 北京国双科技有限公司 | A kind of data processing method, apparatus and system |
CN108399259A (en) * | 2018-03-09 | 2018-08-14 | 深圳市汗青文化传媒有限公司 | A kind of data processing method and system |
CN109144980A (en) * | 2018-08-21 | 2019-01-04 | 成都四方伟业软件股份有限公司 | Metadata management method, device and electronic equipment |
CN109408589B (en) * | 2018-09-14 | 2020-08-14 | 新华三大数据技术有限公司 | Data synchronization method and device |
CN113287099B (en) * | 2019-01-23 | 2024-05-28 | 株式会社斯凯拉 | Tamper-detectable system |
CN112114839A (en) * | 2019-06-20 | 2020-12-22 | 上海安吉星信息服务有限公司 | Method and system for rapid upgrade of standby environment |
CN110544086A (en) * | 2019-07-17 | 2019-12-06 | 金华苏夏信息技术有限公司 | Non-networking selective payment method for hotel sales counter |
CN111367893A (en) * | 2020-03-31 | 2020-07-03 | 中国建设银行股份有限公司 | Method and device for database version iteration |
CN111488483B (en) * | 2020-04-16 | 2023-10-24 | 北京雷石天地电子技术有限公司 | Method, device, terminal and non-transitory computer readable storage medium for updating a library |
CN112015819A (en) * | 2020-08-31 | 2020-12-01 | 杭州欧若数网科技有限公司 | Data updating method, device, equipment and medium for distributed graph database |
WO2022111733A1 (en) * | 2020-11-30 | 2022-06-02 | 百果园技术(新加坡)有限公司 | Message processing method and apparatus, and electronic device |
CN112214479B (en) * | 2020-12-01 | 2021-07-13 | 陕西亚创医软信息科技有限公司 | Medical data management system and method based on big data |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1856785A (en) * | 2004-03-29 | 2006-11-01 | 微软公司 | Systems and methods for versioning based triggers |
CN101506766A (en) * | 2005-05-10 | 2009-08-12 | 微软公司 | Database corruption recovery systems and methods |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005026413A (en) * | 2003-07-01 | 2005-01-27 | Renesas Technology Corp | Semiconductor wafer, semiconductor device, and its manufacturing method |
CN101408864B (en) * | 2007-10-09 | 2011-08-24 | 群联电子股份有限公司 | Data protection method for power failure and controller using the method |
Also Published As
Publication number | Publication date |
---|---|
HK1155236A1 (en) | 2012-05-11 |
CN102110121A (en) | 2011-06-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1155236; Country of ref document: HK |
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: GR; Ref document number: 1155236; Country of ref document: HK |