CN113297205B

CN113297205B - Index construction and data access processing method, device, equipment and medium

Info

Publication number: CN113297205B
Application number: CN202010753698.XA
Authority: CN
Inventors: 肖文健; 蔡杰明; 杨策; 杨世泉
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2020-07-30
Filing date: 2020-07-30
Publication date: 2025-02-18
Anticipated expiration: 2040-07-30
Also published as: CN113297205A

Abstract

The application provides an index construction and data access processing method, device, equipment and medium. The index construction method comprises the steps of distinguishing a first type file and a second type file, wherein the second type file comprises data after at least one item of data in the first type file is updated, constructing indexes for the first type file and the second type file in parallel, respectively generating corresponding index files, determining the sequence of the index files according to the first type file or the second type file corresponding to the index files, and determining target index files from at least two index files comprising target data in sequence, wherein the target data in the target index files comprise response data. The application constructs the indexes for the first type files and the second type files in parallel, avoids constructing the indexes for the second type files after constructing the indexes for the first type files or constructing the indexes for the first type files after constructing the indexes for the second type files, and reduces the time delay of constructing the indexes for the first type files or the second type files.

Description

Index construction and data access processing method, device, equipment and medium

Technical Field

The present application relates to the field of database technologies, and in particular, to a method, an apparatus, a system, a device, and a medium for index construction and data access processing.

Background

A database stores data. When the database receives an access request, the data can be accessed by primary key; this is fast because primary keys in a database are usually stored in sorted order. When the data is accessed by attribute, access is slower because attribute columns are not sorted. To solve the problem of slow access by attribute column, an index may be constructed for the attribute column.

In the prior art, there are two main schemes for constructing an index: asynchronous indexing and synchronous indexing. In asynchronous indexing, an index is first constructed for the existing stock data. New data, i.e., incremental data, may be written during this construction, but its index can only be constructed after index construction for the stock data has finished; that is, the indexes for the stock data and the incremental data are constructed serially in time. In synchronous indexing, the index for the existing stock data is constructed first and, to ensure the correctness of the index, writes of new data are blocked for a certain period, or even for the entire duration of index construction for the stock data.

The applicant has studied the two schemes described above and found that asynchronous indexing and synchronous indexing result in a large time delay for constructing the index of incremental data.

Disclosure of Invention

The embodiment of the application provides an index construction method and a data access processing method, which are used for reducing the time delay for constructing an index file.

Correspondingly, the embodiment of the application also provides an index construction and data access processing device, equipment and medium, which are used for ensuring the realization and application of the method.

In order to solve the problems, the embodiment of the application discloses an index construction method, which comprises the steps of distinguishing a first type file and a second type file, wherein the second type file comprises data obtained after updating at least one item of data in the first type file, constructing indexes for the first type file and the second type file in parallel, respectively generating corresponding index files, and determining the sequence of the index files according to the first type file or the second type file corresponding to the index files, wherein the sequence is used for determining a target index file from at least two index files comprising target data, and the target data in the target index file comprises response data.

The embodiment of the application also discloses a data access processing method, which comprises the steps of receiving a data access request, determining response data corresponding to the data access request from at least two index files according to the sequence of the index files, wherein the at least two index files comprise index files corresponding to a first type of files and index files corresponding to a second type of files, the second type of files comprise data after updating at least one item of data in the first type of files, the index files are constructed in parallel, and the sequence of the index files is determined by the first type of files or the second type of files corresponding to the index files.

The embodiment of the application also discloses another index construction method, which comprises the steps of distinguishing stock files from increment files, constructing indexes for the stock files and the increment files in parallel, respectively generating corresponding index files, and determining the sequence of the index files according to the stock files or the increment files corresponding to the index files, wherein the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data.

The embodiment of the application also discloses another index construction method, which comprises the steps of carrying out snapshot processing on the stock file to obtain second snapshot information and generating third snapshot information of an initial state, constructing indexes on the stock file and the increment file in parallel to respectively generate corresponding index files, recording the index files corresponding to the stock file into the third snapshot information according to the second snapshot information, and determining the sequence of the index files according to the third snapshot information, wherein the sequence is used for determining a target index file from at least two index files comprising target data, and the target data in the target index file comprises response data.

The embodiment of the application also discloses an index construction device which comprises a first file distinguishing module, a first index construction file module and a first index file sequence determining module, wherein the first file distinguishing module is used for distinguishing a first type file and a second type file, the second type file comprises data obtained by updating at least one item of data in the first type file, the first index construction file module is used for parallelly constructing indexes for the first type file and the second type file and respectively generating corresponding index files, the first index file sequence determining module is used for determining the sequence of the index files according to the first type file or the second type file corresponding to the index files, the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data.

The embodiment of the application also discloses a data access processing device which comprises a data access request receiving module, a response data determining module and a data access processing module, wherein the data access request receiving module is used for receiving data access requests, response data corresponding to the data access requests are determined from at least two index files according to the sequence of the index files, the at least two index files comprise index files corresponding to a first type of files and index files corresponding to a second type of files, the second type of files comprise data after updating at least one item of data in the first type of files, the index files are parallelly constructed, and the sequence of the index files is determined by the first type of files or the second type of files corresponding to the index files.

The embodiment of the application also discloses another index construction device which comprises a second file distinguishing module, a second index construction module and a second index file sequence determining module, wherein the second file distinguishing module is used for distinguishing stock files and increment files, the second index construction module is used for parallelly constructing indexes for the stock files and the increment files and respectively generating corresponding index files, the second index file sequence determining module is used for determining the sequence of the index files according to the stock files or the increment files corresponding to the index files, the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data.

The embodiment of the application also discloses another index construction device which comprises a snapshot information generation module, a third index construction module, an inventory index recording module and a third index file sequence determination module, wherein the snapshot information generation module is used for carrying out snapshot processing on the inventory files to obtain second snapshot information and generating third snapshot information of initial states, the third index construction module is used for parallelly constructing indexes for the inventory files and the increment files and respectively generating corresponding index files, the inventory index recording module is used for recording the index files corresponding to the inventory files into the third snapshot information according to the second snapshot information, the third index file sequence determination module is used for determining the sequence of the index files according to the third snapshot information, the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data.

The embodiment of the application also discloses electronic equipment, which comprises a processor and a memory, wherein executable codes are stored on the memory, and when the executable codes are executed, the processor is caused to execute the method according to any one of the embodiment of the application.

Embodiments of the application also disclose one or more machine readable media having executable code stored thereon that, when executed, cause a processor to perform a method according to any of the embodiments of the application.

Compared with the prior art, the embodiment of the application has the following advantages:

In the embodiment of the application, indexes can be constructed for the first type file and the second type file in parallel. Neither type has to wait for index construction on the other type to finish, so the time delay that such waiting would add to constructing the index for the first type file or the second type file is avoided.

Drawings

FIG. 1 is a schematic diagram of a non-relational database according to the present application;

FIG. 2 is a schematic diagram of a log-structured merge tree based storage process of the present application;

FIG. 3 is a flowchart of the steps of an embodiment of an index building method of the present application;

FIG. 4 is a flow chart of steps of an embodiment of a data access processing method of the present application;

FIG. 5 is a flowchart of the steps of another embodiment of an index building method of the present application;

FIG. 6 is a flowchart of the steps of another embodiment of an index building method of the present application;

FIG. 7 is a flow diagram of an index building process of the present application;

FIG. 8 is a block diagram of an embodiment of an index building means of the present application;

FIG. 9 is a block diagram of an embodiment of a data access processing apparatus of the present application;

FIG. 10 is a block diagram of another embodiment of an index building means of the present application;

FIG. 11 is a block diagram of another embodiment of an index building means of the present application;

FIG. 12 is a schematic structural diagram of an apparatus according to an embodiment of the present application.

Detailed Description

In order that the above-recited objects, features and advantages of the present application will become more readily apparent, a more particular description of the application will be rendered by reference to the appended drawings and appended detailed description.

The embodiment of the application can be applied to index construction for time-ordered data files in a database, where a data file is a file that stores data, and the data of one data table is split across a plurality of data files. A typical application scenario of the present application is a NoSQL (Not Only Structured Query Language) database, also known as a non-relational database. A traditional relational database cannot guarantee ultra-large-scale, highly concurrent data storage, whereas a NoSQL database can support ultra-large-scale storage of multiple data types.

In a NoSQL database, the data of one data table can be stored in partitions. The NoSQL database contains one or more Master servers and one or more Worker servers, and the number of Worker servers is typically large. The Master server may also be referred to as the main server, and the Worker server as the working server. The main server schedules the partitioned data to the working servers according to a preset scheduling policy; each working server stores the data it receives and provides external data services, such as data reading or data writing. As shown in the schematic structural diagram of the non-relational database in FIG. 1, there are two main servers and five working servers, and each main server can schedule data to each working server.

The data in the working server is stored in an LSM (Log-Structured Merge-Tree) structure. As shown in the schematic diagram of the log-structured merge tree based storage process in FIG. 2, when a piece of data is received, it is written into the memory of the working server in the form of a memory file and is also written into a log file located on disk. When the memory file reaches a preset capacity threshold, a new memory file is first created to receive subsequent data, and the memory file that reached the threshold is then written to disk, forming a data file on disk.

In the above process, exactly one memory file receives data at any time, and the number of data files on disk grows with the number of writes to disk. In FIG. 2, five data files have been formed, namely F1, F2, F3, F4 and F5, with generation times T1, T2, T3, T4 and T5 respectively, where T1 is earlier than T2, T2 is earlier than T3, T3 is earlier than T4, and T4 is earlier than T5.

After the data is stored in the above manner, when a data access request is received, the corresponding data is first read from the memory file; if it does not exist in the memory file, it is looked up in the data files in time order. For example, the corresponding data is read from the data file F1 with the earliest generation time; if it does not exist in F1, it is read from the data file F2, and so on, until the corresponding data is found in one of the data files or is found to be absent from the last data file F5.

It can be understood that the data in the log file is a backup of the data in the memory file. If the working server fails unexpectedly, the data in the log file can be written back into a memory file, which avoids data loss caused by the failure. The log file is deleted after the memory file has been written to disk, which saves disk storage resources.

As is clear from the above data storage process, new data files are always generated as data is written, and data files with earlier time are not overwritten, so that data of the same primary key may exist in different data files, and data of the same primary key may also exist in the same data file. Thus, in order to acquire accurate data, the last written data needs to be acquired in the order of the data files.
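The storage process described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the class name `LsmStore`, the entry-count threshold `capacity`, and the use of in-memory dictionaries to stand in for on-disk data files are all hypothetical, and the log file is omitted. The read path scans the data files in generation order and keeps the last hit, which is one simple way to return the last written data.

```python
class LsmStore:
    """Toy log-structured store: one memory file plus ordered data files."""

    def __init__(self, capacity=2):
        self.capacity = capacity   # hypothetical flush threshold (entry count)
        self.memory_file = {}      # the single memory file receiving writes
        self.data_files = []       # flushed data files, oldest first (F1, F2, ...)

    def put(self, key, value):
        self.memory_file[key] = value
        if len(self.memory_file) >= self.capacity:
            # The full memory file becomes a new data file and a fresh
            # memory file takes over; older data files are never overwritten.
            self.data_files.append(self.memory_file)
            self.memory_file = {}

    def get(self, key):
        # The memory file holds the most recent writes.
        if key in self.memory_file:
            return self.memory_file[key]
        # Walk the data files in generation order and keep the last hit,
        # so that a later data file overrides an earlier one.
        result = None
        for data_file in self.data_files:
            if key in data_file:
                result = data_file[key]
        return result


store = LsmStore(capacity=2)
store.put("k1", "v1")
store.put("k2", "v2")       # threshold reached: first data file is formed
store.put("k1", "v1-new")
store.put("k3", "v3")       # threshold reached: second data file is formed
```

Here the key `k1` exists in both data files, and `store.get("k1")` returns the value from the later one.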

Based on the data storage process shown in fig. 2, the application constructs an index for the data files in the disk generated by the process to obtain index files, and also needs to ensure that the sequence of the index files is consistent with that of the data files, so that the last written data can be found according to the sequence of the index files when the data are read from the index files, but not the old data.

Referring to FIG. 3, a flowchart of the steps of an embodiment of an index building method of the present application is shown.

Step 101, distinguishing a first type file and a second type file, wherein the second type file comprises data after updating at least one item of data in the first type file.

In practical application, after an index construction request is received, it is first necessary to distinguish at least one already existing data file from at least one newly generated data file, where part or all of the data in the newly generated data file is updated data for part or all of the data in the already existing data file. Index construction then starts in parallel for the at least one already existing data file and the at least one newly generated data file. In the present application, the at least one already existing data file may be referred to as a first type file, and the at least one newly generated data file may be referred to as a second type file.

Based on the foregoing data storage process, some or all of the data in the second type file is an update of some or all of the data in the first type file. For example, as shown in FIG. 2, if the index construction request is received after the data file F4 is generated, the existing data files F1 to F4 may be used as the first type files, and the data file F5 newly generated after the index construction request is received may be used as the second type file.

The first type file and the second type file of the present application may be any data files having an order, with the first type file ordered before or after the second type file. The application can construct indexes for the two types of ordered files in parallel while ensuring that the order of the index files generated is consistent with the order of the first type files and the second type files.

In order to achieve the above-mentioned order consistency, the present application needs to distinguish which data files are the first type of files and which files are the second type of files before constructing the index. In one example, the identity of at least one data file that already exists may be recorded, the identity of the at least one data file that already exists being the identity of the first type of file, and the data files that are not recorded being the second type of file. In another example, at least one data file that is already present may be migrated to a preset storage space, such that data files in the preset storage space are of a first type and data files not in the preset storage space are of a second type. In yet another example, at least one data file that already exists may be marked such that the data file with this marking is a first type of file and the data file without this marking is a second type of file.

Step 102, constructing indexes for the first type files and the second type files in parallel, and respectively generating corresponding index files.

The index is a structure for ordering values of one or more columns in a data table in the database, and can be specifically obtained by calculation according to the values of the columns, so that the query efficiency can be greatly improved. For the first type of files, the generated index files contain all or part of columns in the first type of files, and for the second type of files, the generated index files contain all or part of columns in the second type of files. The column of the index file may be specified in the index build request.

It will be appreciated that, because indexes are constructed for the first type files and the second type files in parallel, the index files of the two types may be generated in an interleaved order. For example, as shown in FIG. 2, the data files F1 to F4 are first type files and the data file F5 is a second type file, and the index construction method of the present application may generate the index files in the following order: the index file of F1, the index file of F5, the index file of F2, the index file of F3, and the index file of F4.
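As a rough illustration of parallel construction, the sketch below submits every data file, of either type, to a thread pool and collects the index files in whatever order they complete. The record layout, the function names `build_index` and `build_indexes_in_parallel`, and the choice of a thread pool are assumptions made for this example; an index file is modelled simply as the rows of a data file sorted by the indexed column.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def build_index(data_file, column):
    """Build one index file: the rows of a data file sorted by `column`."""
    rows = sorted(data_file["rows"], key=lambda row: row[column])
    return {"source": data_file["name"], "rows": rows}


def build_indexes_in_parallel(first_type, second_type, column):
    """Index both file types concurrently; completion order is not fixed."""
    index_files = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(build_index, data_file, column)
                   for data_file in first_type + second_type]
        for future in as_completed(futures):
            index_files.append(future.result())
    return index_files


# Hypothetical sample data: F1 and F2 already exist, F5 is newly generated.
first_type = [
    {"name": "F1", "rows": [{"pk": 1, "city": "Paris"},
                            {"pk": 2, "city": "Berlin"}]},
    {"name": "F2", "rows": [{"pk": 3, "city": "Oslo"}]},
]
second_type = [{"name": "F5", "rows": [{"pk": 1, "city": "Amsterdam"}]}]

index_files = build_indexes_in_parallel(first_type, second_type, "city")
```

Because the completion order may differ from run to run, the order of `index_files` cannot be relied on by itself, which is why the method goes on to determine the order of the index files explicitly.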

In the prior art, by contrast, indexes are not constructed for the first type file and the second type file in parallel, so the index files may be generated in the order: the index file of the data file F1, followed by the index files of the data files F2, F3, F4 and F5.

Step 103, determining the sequence of the index files according to the first type files or the second type files corresponding to the index files, wherein the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data.

Specifically, among the first type files, an index is constructed for each first type file in the order of the first type files, so that the order of the resulting index files is consistent with the order of the first type files. Likewise, among the second type files, an index is constructed for each second type file in the order of the second type files, so that the order of the resulting index files is consistent with the order of the second type files. Between the two types, the index files of the first type files are ordered before or after the index files of the second type files.

It will be appreciated that when the first type files and the second type files are arranged in ascending order of time, the index files of the first type files may be ordered before the index files of the second type files, so that the index files of earlier generated data files precede the index files of later generated data files. When the first type files and the second type files are arranged in descending order of time, the index files of the first type files may be ordered after the index files of the second type files, so that the index files of later generated data files precede the index files of earlier generated data files.
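The ordering rule can be sketched as a sort over two keys: first the type of the source file, then the generation sequence of the index file within that type. The field names `type` and `seq` and the function `order_index_files` are hypothetical; the sample input reproduces the interleaved generation order F1', F5', F2', F3', F4' used as an example in this description.

```python
def order_index_files(index_files, ascending=True):
    """Order index files so they match the order of their data files.

    In ascending order, index files of first type files come before those
    of second type files; within one type, generation order is kept.
    Descending order is the exact reverse.
    """
    def sort_key(index_file):
        type_rank = 0 if index_file["type"] == "first" else 1
        return (type_rank, index_file["seq"])

    return sorted(index_files, key=sort_key, reverse=not ascending)


# Index files in the order they happened to be generated.
generated = [
    {"name": "F1'", "type": "first", "seq": 1},
    {"name": "F5'", "type": "second", "seq": 2},
    {"name": "F2'", "type": "first", "seq": 3},
    {"name": "F3'", "type": "first", "seq": 4},
    {"name": "F4'", "type": "first", "seq": 5},
]

ascending_names = [f["name"] for f in order_index_files(generated)]
descending_names = [f["name"] for f in order_index_files(generated, ascending=False)]
```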

Based on index files having the above order, when a data access request is received, the latest data can be obtained from the index files as the response data of the data access request. When at least two pieces of data with the same primary key as the target data accessed by the request exist across all index files, the target data has been updated at least once. If the at least two pieces of data come from different index files, the target data in the latest index file according to the order of the index files is used as the response data of the data access request; in the present application, this latest index file may be called the target index file.

In addition, if at least two pieces of data originate from the same index file, the target data with the latest time can be determined from the time information of each piece of data in the index file as the response data of the data access request.
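A minimal sketch of this lookup, assuming the index files are already ordered from earliest to latest and that each entry carries a primary key `pk`, time information `ts`, and a `value` (all hypothetical field names): the index files are scanned from latest to earliest, and within a single index file the entry with the latest time is chosen.

```python
def resolve(ordered_index_files, primary_key):
    """Return the response data for `primary_key`, or None if absent.

    `ordered_index_files` is assumed to be ordered earliest to latest.
    """
    for index_file in reversed(ordered_index_files):
        matches = [entry for entry in index_file["entries"]
                   if entry["pk"] == primary_key]
        if matches:
            # Several entries in the same index file: use time information.
            return max(matches, key=lambda entry: entry["ts"])["value"]
    return None


ordered = [
    {"name": "F1'", "entries": [{"pk": "k1", "ts": 10, "value": "old"},
                                {"pk": "k1", "ts": 12, "value": "newer"}]},
    {"name": "F2'", "entries": [{"pk": "k2", "ts": 20, "value": "b"}]},
    {"name": "F5'", "entries": [{"pk": "k1", "ts": 50, "value": "latest"}]},
]

from_latest_file = resolve(ordered, "k1")       # taken from F5'
within_one_file = resolve(ordered[:2], "k1")    # both candidates are in F1'
```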

Optionally, the distinguishing the first type file from the second type file includes sub-step 1011:

Sub-step 1011, recording a first identifier of the first type file, so that files that are not recorded are taken as second type files.

The first identifier is a unique identifier of a data file and is determined from the generation time when the data file is generated, so the first identifier can represent the order of the data files. For example, first identifiers may be numbered from 0: a current identifier is initialized to 0 and, each time a data file is generated, the current identifier plus 1 is used as the first identifier of that data file and then becomes the new current identifier. As shown in FIG. 2, when the data file F1 is generated, the current identifier 0 plus 1 gives the first identifier 1 of F1, which becomes the current identifier; when the data file F2 is generated, the current identifier 1 plus 1 gives the first identifier 2 of F2, which becomes the current identifier; when the data file F3 is generated, the current identifier 2 plus 1 gives the first identifier 3 of F3, which becomes the current identifier; when the data file F4 is generated, the current identifier 3 plus 1 gives the first identifier 4 of F4, which becomes the current identifier; and when the data file F5 is generated, the current identifier 4 plus 1 gives the first identifier 5 of F5.
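The numbering scheme in this example amounts to a monotonically increasing counter. The sketch below is illustrative only; the class name `FileIdAllocator` is hypothetical.

```python
class FileIdAllocator:
    """Hands out first identifiers in generation order, starting from 1."""

    def __init__(self):
        self.current = 0  # the current identifier is initialized to 0

    def next_id(self):
        # Current identifier plus 1 becomes the identifier of the new file
        # and is then kept as the current identifier.
        self.current += 1
        return self.current


allocator = FileIdAllocator()
first_ids = {name: allocator.next_id() for name in ["F1", "F2", "F3", "F4", "F5"]}
```

Since identifiers only ever increase, comparing two first identifiers is enough to tell which data file was generated earlier.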

In the application, the purpose of distinguishing the first type files from the second type files can be realized by recording the first identification of the first type files, so that the files which are not recorded are the second type files. For any one data file, acquiring a first identifier of the data file, if the first identifier of the data file is one of the first identifiers recorded, determining that the data file is a first type file, and if the first identifier of the data file is not the first identifier of any one recorded, determining that the data file is a second type file. For example, as shown in fig. 2, if the index construction request is received after the data file F4 is generated, the first identifications of the already existing data files F1 to F4 may be recorded, so that the data file F5 that is not recorded is a second type file.

Optionally, the recording the first identification of the first type of file includes sub-step 10111:

Sub-step 10111, performing snapshot processing on the first type file to obtain first snapshot information, where the first snapshot information includes a first identifier of the first type file.

Wherein a snapshot is a method of recording the status of information at a certain moment. In the application, when an index construction request is received, snapshot processing can be performed on the partition where the data file is located, and the obtained snapshot information is called first snapshot information, wherein the first snapshot information contains a first identifier of the first type file. The application adopts the first snapshot information to record the data files existing at the moment of receiving the index construction request so as to realize the purpose of distinguishing the first type files from the second type files, so that the files which are not in the first snapshot information are the second type files. And for any one data file, acquiring a first identifier of the data file, if the first identifier of the data file is in the first snapshot information, determining that the data file is a first type file, and if the first identifier of the data file is not in the first snapshot information, determining that the data file is a second type file.
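The snapshot-based distinction can be sketched as a membership test against an immutable set of identifiers captured when the index construction request arrives. The function names `take_snapshot` and `classify` are hypothetical, and a real snapshot of a partition would record more than identifiers.

```python
def take_snapshot(existing_ids):
    """Freeze the first identifiers of the data files that exist right now."""
    return frozenset(existing_ids)


def classify(file_id, snapshot):
    """A file recorded in the snapshot is first type; any other is second type."""
    return "first" if file_id in snapshot else "second"


# The request arrives after F4 is generated, so identifiers 1 to 4 are recorded.
first_snapshot = take_snapshot([1, 2, 3, 4])
kinds = {file_id: classify(file_id, first_snapshot) for file_id in range(1, 6)}
```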

Optionally, the method further comprises sub-step 1012:

Sub-step 1012, deleting the recorded first identifier after the construction of the index for the first type of file is finished.

After index construction for the first type files has finished, the recorded identifiers of the first type files are no longer needed, so the recorded first identifiers can be deleted to save storage resources on the working server.

Optionally, the distinguishing the first type file from the second type file includes sub-step 1013:

Sub-step 1013, migrating the first type file from the first storage area to the second storage area, so that files in the first storage area are taken as second type files.

The first storage area is a default storage area of the first type of files, and the second storage area is a storage area different from the first storage area. In practical application, two storage areas, namely a first storage area and a second storage area, can be arranged in a disk of a working server, so that when data in a memory file is written into the disk of the working server, a data file is generated in the first storage area, and the data is written into the data file.

In the application, the purpose of distinguishing the first type files from the second type files can be realized by migrating the first type files into the second storage area, so that the data files in the second storage area are the first type files, and the data files in the first storage area are the second type files. For example, as shown in fig. 2, if an index construction request is received after the data file F4 is generated, after the index construction request is received, the already existing data files F1 to F4 in the first storage area are migrated into the second storage area, so that the data file F5 subsequently written into the first storage area is a second-type file.

Optionally, the method further comprises sub-step 1014:

Sub-step 1014, after the indexing of the first type of file is completed, migrating the first type file from the second storage area back to the first storage area, to a position before or after the second type file.

Specifically, the first type file is migrated from the second storage area back to the first storage area. If the first type file was originally located before the second type file, it is migrated back to a position before the second type file; if it was originally located after the second type file, it is migrated back to a position after the second type file.
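The migration approach can be sketched with two directories standing in for the two storage areas. Everything here is an assumption for illustration: the directory names, the temporary root, and the fact that file order is carried by the file names, so that moving files back restores their position relative to the second type file.

```python
import shutil
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
first_area = root / "first_area"     # default area where data files are generated
second_area = root / "second_area"   # area that holds first type files during indexing
first_area.mkdir()
second_area.mkdir()

for name in ["F1", "F2", "F3", "F4"]:
    (first_area / name).write_text("data")


def migrate(src, dst):
    """Move every file from one storage area to the other."""
    for path in sorted(src.iterdir()):
        shutil.move(str(path), str(dst / path.name))


# Index construction request received: existing files leave the first area.
migrate(first_area, second_area)
(first_area / "F5").write_text("data")   # generated after the request

first_type = sorted(p.name for p in second_area.iterdir())
second_type = sorted(p.name for p in first_area.iterdir())

# Indexing of the first type files has finished: move them back.
migrate(second_area, first_area)
restored = sorted(p.name for p in first_area.iterdir())

shutil.rmtree(root)   # clean up the temporary directories
```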

Optionally, the distinguishing the first type of file from the second type of file includes the substep 1015:

Sub-step 1015, adding a first preset mark to the first type file, so that a file to which the first preset mark is not added is taken as a second type file.

In the present application, a first preset mark may be added to each existing data file after an index construction request is received. The first preset mark may be a number, a letter, a special character, etc.; for example, it may be 1. A data file to which the first preset mark is not added is then a second type file. For example, as shown in fig. 2, if an index construction request is received after the data file F4 is generated, the first preset mark 1 is added to the existing data files F1 to F4, and the subsequently generated data file F5 carries no first preset mark, so it is a second type file.

Optionally, the method further comprises a sub-step 1016:

Sub-step 1016, deleting the first preset mark of the first type file after the index construction for the first type file is completed.

It can be understood that although the first preset mark can be chosen to occupy little storage space, it still occupies storage resources. To save storage resources of the working server, the first preset mark may be deleted after indexes have been constructed for all the first type files, releasing the storage space that held it.

Optionally, the determining the order of the index files according to the first type files or the second type files corresponding to the index files includes substeps 1031 to 1032:

Sub-step 1031, recording a second identifier of the index file corresponding to the first type file.

The second identifier is a unique identifier of the index file and is determined according to the generation time of the index file, so that it can represent the order among index files corresponding to the same type of file, namely the order among the index files of the first type files and the order among the index files of the second type files. For example, the second identifier may be numbered from 0: a current identifier is initialized to 0, and each time an index file is generated, the current identifier is incremented by 1, and the result is used both as the second identifier of that index file and as the new current identifier. Thus, when the index file F1' is generated, the current identifier 0 is incremented to give the second identifier 1 of F1'; when the index file F2' is generated, the current identifier 1 is incremented to give the second identifier 2 of F2'; when the index file F3' is generated, the current identifier 2 is incremented to give the second identifier 3 of F3'; when the index file F4' is generated, the current identifier 3 is incremented to give the second identifier 4 of F4'; and when the index file F5' is generated, the current identifier 4 is incremented to give the second identifier 5 of F5'.
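The numbering scheme described above can be sketched as a small counter. This is only an illustration: the class name `SecondIdAllocator` is hypothetical, and the lock is an assumption added because index files are generated in parallel.

```python
import itertools
import threading


class SecondIdAllocator:
    """Hands out increasing second identifiers; the first one issued is 1."""

    def __init__(self):
        self._counter = itertools.count(1)  # current identifier starts at 0
        self._lock = threading.Lock()       # assumed: builders run concurrently

    def next_id(self):
        with self._lock:
            return next(self._counter)


allocator = SecondIdAllocator()
# Index files listed in the order in which they finish being generated.
generated = ["F1'", "F2'", "F3'", "F4'", "F5'"]
second_ids = {name: allocator.next_id() for name in generated}
```

Because the identifier reflects generation time only, it orders index files within one type; ordering across the two types needs the additional information described in the following sub-steps.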

In the present application, the index files of the first type files can be distinguished from the index files of the second type files by recording the second identifiers of the index files corresponding to the first type files, so that any index file whose identifier is not recorded is an index file of a second type file; the two groups of index files can then be ordered relative to each other. For any index file, its second identifier is acquired: if it is one of the recorded second identifiers, the index file is determined to be an index file of a first type file; otherwise, it is determined to be an index file of a second type file. For example, if the index construction request is received after the data file F4 is generated, then after indexes are constructed and index files are generated for the first type files F1 to F4, the second identifiers of those index files may be recorded, so that the unrecorded index file is the index file of the second type file F5.

Sub-step 1032, determining whether the recorded index file of the first type file precedes or follows the unrecorded index file.

The application can distinguish the index files of all the first type files from the index files of all the second type files by recording the index files of the first type files, and can realize sorting between the index files of the first type files and the index files of the second type files based on the distinguished index files.

Optionally, the recording the second identifier of the index file corresponding to the first type of file includes the substep 10311:

Sub-step 10311, performing snapshot processing on the index file corresponding to the first type file to obtain second snapshot information, where the second snapshot information includes the second identifier of the index file corresponding to the first type file.

Specifically, when an index construction request is received, snapshot processing is performed on the partition of the index files. Since no index file has been generated at this time, the second snapshot information is initially empty; when an index file of a first type file is generated, the second identifier of that index file is added to the second snapshot information. The application uses the second snapshot information to record the index files of the first type files, so that any index file not in the second snapshot information is an index file of a second type file.
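As a hedged illustration of this recording step, the sketch below models the second snapshot information as an in-memory set of second identifiers. The class `IndexSnapshot`, the helper `split_by_snapshot`, and the identifier values are hypothetical; the application does not specify the snapshot's actual format.

```python
class IndexSnapshot:
    """Records second identifiers of index files built from first type files."""

    def __init__(self):
        # Taken when the build request arrives: no index file exists yet.
        self.recorded_ids = set()

    def record(self, second_id):
        self.recorded_ids.add(second_id)

    def is_first_type(self, second_id):
        return second_id in self.recorded_ids


def split_by_snapshot(snapshot, all_ids):
    """Partition identifiers into (first type, second type), each ascending."""
    first = sorted(i for i in all_ids if snapshot.is_first_type(i))
    second = sorted(i for i in all_ids if not snapshot.is_first_type(i))
    return first, second


snapshot = IndexSnapshot()
for second_id in (1, 3, 4, 5):  # hypothetical ids of first type index files
    snapshot.record(second_id)

first_ids, second_ids = split_by_snapshot(snapshot, [1, 2, 3, 4, 5])
```

Any identifier absent from the set, here 2, is classified as belonging to an index file of a second type file.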

Optionally, the method further comprises a substep 1033:

Sub-step 1033, after determining the order of the index files, deleting the recorded second identifier.

In the present application, the recorded second identifiers are used only to distinguish the index files of the first type files from the index files of the second type files so that the index files can be ordered. They are therefore no longer needed once the order of the index files has been determined, and can be deleted at that point to save the storage resources consumed by storing them.

Optionally, the determining the order of the index files according to the first type files or the second type files corresponding to the index files includes substeps 1034 to 1036:

Sub-step 1034, storing the index file corresponding to the first type file in a third storage area.

Specifically, after an index is built for a first type file and its index file is generated, the index file is stored directly in the third storage area. For example, for the first type files F1 to F4, after the index files F1' to F4' are generated for them, the index files F1' to F4' may be stored in the third storage area in sequence. The third storage area is a preset area for temporarily storing the index files of the first type files, and can be set according to the actual application scenario.

Sub-step 1035, storing the index file corresponding to the second type file in a fourth storage area.

Specifically, after an index is built for a second type file and its index file is generated, the index file is stored directly in the fourth storage area. The fourth storage area is the default area for storing index files and is different from the third storage area. For example, for the second type file F5, after the index file F5' is generated for it, the index file F5' may be stored in the fourth storage area.

Sub-step 1036, migrating the index file in the third storage area to before or after the index file in the fourth storage area.

It will be appreciated that in migrating the index files in the third storage area to the fourth storage area, it is necessary to ensure that the order of the index files in the third storage area is unchanged.

According to the application, the index files of the first type of files and the index files of the second type of files are respectively stored in different storage areas, so that the index files of the first type of files and the index files of the second type of files are distinguished, and the sorting between the index files of the first type of files and the index files of the second type of files is realized based on the distinguished index files of the first type of files and the distinguished index files of the second type of files.
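A minimal sketch of sub-step 1036 follows, assuming for illustration that each storage area can be modeled as an ordered list of index file names. The function `migrate_index_files` and its `place_before` parameter are hypothetical.

```python
def migrate_index_files(third_area, fourth_area, place_before=True):
    """Merge the third area into the fourth, keeping each area's own order."""
    if place_before:
        return third_area + fourth_area
    return fourth_area + third_area


third_area = ["F1'", "F2'", "F3'", "F4'"]  # index files of first type files
fourth_area = ["F5'"]                      # index file of the second type file

merged_before = migrate_index_files(third_area, fourth_area)
merged_after = migrate_index_files(third_area, fourth_area, place_before=False)
```

Concatenation is used so that the relative order inside the third storage area is never disturbed by the migration.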

Optionally, the determining the order of the index files according to the first type files or the second type files corresponding to the index files includes substeps 1037 to 1039:

Sub-step 1037, adding a second preset mark to the index file corresponding to the first type file.

After the index file is generated by constructing the index for the first type of files, a second preset mark is added to the index file of the first type of files, wherein the second preset mark can be a number, a letter, a special character and the like, and can be the same as or different from the first preset mark. For example, the second preset flag may be 1.

Sub-step 1038, adding a third preset mark to the index file corresponding to the second type file.

It will be appreciated that the second preset mark and the third preset mark are two different marks, used to distinguish the index files of the first type files from the index files of the second type files. For example, if the first type files are the data files F1 to F4, the second preset mark 1 may be added to the index files of the data files F1 to F4 after those index files are generated, and the third preset mark 2 may be added to the index file of the data file F5 after that index file is generated.

In a special case, the third preset mark may also be an empty character string, so that an index file marked with the third preset mark is equivalent to an unmarked one.

Sub-step 1039, determining whether the index file marked with the second preset mark is before or after the index file marked with the third preset mark.

The index files marked with the second preset mark are index files of the first type files, and the index files marked with the third preset mark are index files of the second type files, so the two groups of index files can be ordered according to the second preset mark and the third preset mark.

After the index files are generated, the index files of the first type files and the index files of the second type files are distinguished by the second preset mark and the third preset mark. Then, after index construction for the first type files is completed, the index files marked with the second preset mark are ordered before or after the index files marked with the third preset mark.

Optionally, the second preset mark and the third preset mark are both numerical values, and the second preset mark is smaller than the third preset mark. The determining whether the index file marked with the second preset mark is before or after the index file marked with the third preset mark includes sub-step 10391: determining, according to the sizes of the second preset mark and the third preset mark, whether the index file marked with the second preset mark is before or after the index file marked with the third preset mark.

It will be appreciated that when the second preset mark and the third preset mark are both numerical values, the index files of the first type file and the index files of the second type file may be sorted according to the sizes of the second preset mark and the third preset mark. For example, the second preset flag may be 1 and the third preset flag may be 2, so that the index files marked with 1 may be sorted before or after the index files marked with 2.

According to the application, the index files can be ordered according to the sizes of the second preset mark and the third preset mark, and when the second preset mark and the third preset mark are numerical values, the numerical value-based ordering algorithm can be directly invoked.

Optionally, after determining the order of the index files according to the first type files or the second type files corresponding to the index files, the method further includes step 104:

Step 104, deleting the second preset mark and the third preset mark.

In the application, the second preset mark and the third preset mark are used for distinguishing the index files of the first type of files from the index files of the second type of files so as to realize the sorting of the index files, so that the second preset mark and the third preset mark are not required to be used after the sequence of the index files is determined, and at the moment, the second preset mark and the third preset mark can be deleted so as to save the storage resources consumed by storing the second preset mark and the third preset mark.

Optionally, before the distinguishing the first type of file from the second type of file, steps 105 to 106 are included:

Step 105, an index build request is received.

Specifically, the index build request is triggered by a user executing a database statement that builds an index, where the user may specify a target column to build the index and a column in an index file, and may also specify whether to build an index to an existing data file. If the index is not built for the existing data files, only the index is built for the data files generated after the index building request, so that only the second type of files exist, and the index files of the second type of files can be ordered by adopting the second identifiers of the index files; if the index is constructed for the existing data files, the index is constructed for the existing data files and the data files generated after the index construction request, so that the first type files and the second type files exist, and the index files of the first type files and the index files of the second type files are ordered.

Step 106, in response to the index construction request, determining the currently existing data files as first type files, where the second type files are data files generated after the index construction request is received.

In the present application, after the index construction request is received, the data files existing at the current moment can be determined as first type files, which may also be called stock files, and the data files generated after the index construction request can be determined as second type files, which may also be called delta (incremental) files.

The method and the device can accurately determine the first type files and the second type files according to the received index construction request.

Optionally, after the indexes are constructed for the first type file and the second type file in parallel and the corresponding index files are generated respectively, step 107 is further included:

Step 107, generating a second identifier for the index file according to the generation time of the index file.

The second identifier is a unique identifier of the index file and is determined according to the generation time of the index file, so that it can represent the order of the index files. For example, the second identifier may be numbered from 0: a current identifier is initialized to 0, and each time an index file is generated, the current identifier is incremented by 1, and the result is used both as the second identifier of that index file and as the new current identifier. Thus, the index files F1', F2', F3', F4' and F5', generated in that order, receive the second identifiers 1, 2, 3, 4 and 5 respectively.

The index files F1', F2', F3', F4' and F5' may be generated by first creating an index file F1' for the first type file F1, then creating an index file F2' for the second type file F5, then creating an index file F3' for the first type file F2, then creating an index file F4' for the first type file F3, and finally creating an index file F5' for the first type file F4, so that the second identifiers of the index file F1', F2', F3', F4' and F5' are 1,2,3, 4 and 5, respectively.

In a first example, the present application may sort all index files based on the second identifiers of the index files in combination with the recorded second identifiers. For example, the second identifiers 1, 3, 4 and 5 of the index files of the first type files are recorded, and the second identifier 2 of the index file of the second type file is not recorded. The index files F1', F3', F4' and F5', whose second identifiers are recorded, may first be ordered before the index file F2', whose second identifier is not recorded, and then the index files F1', F3', F4' and F5' are sorted in ascending order of their second identifiers. The final order is F1', F3', F4', F5', F2', which is consistent with the order of the corresponding data files F1, F2, F3, F4 and F5.

In a second example, the present application may sort all index files based on the second identifiers of the index files in combination with the aforementioned third storage area and fourth storage area. For example, the index files F1', F3', F4' and F5' of the first type files are stored in the third storage area, and the index file F2' of the second type file is stored in the fourth storage area. The index files F1', F3', F4' and F5' in the third storage area may first be ordered before the index file F2' in the fourth storage area, and then sorted among themselves in ascending order of their second identifiers. The final order is F1', F3', F4', F5', F2', which is consistent with the order of the corresponding data files F1, F2, F3, F4 and F5.

In a third example, the present application may sort all index files based on the second identifiers of the index files in combination with the aforementioned second preset mark and third preset mark. For example, the second preset mark is 1 and the third preset mark is 2; the index files F1', F3', F4' and F5' of the first type files carry the second preset mark 1, and the index file F2' of the second type file carries the third preset mark 2. If the number (second preset mark, second identifier) is assigned to the index file of each first type file and the number (third preset mark, second identifier) is assigned to the index file of each second type file, the numbers of the index files F1', F2', F3', F4' and F5' are (1, 1), (2, 2), (1, 3), (1, 4) and (1, 5) respectively.

Based on the above numbers, the index files F1', F2', F3', F4' and F5' may be ordered as follows: the index files F1', F3', F4' and F5' marked with the second preset mark 1 are first ordered before the index file F2' marked with the third preset mark 2, and then the index files F1', F3', F4' and F5' are sorted in ascending order of their second identifiers. The final order is F1', F3', F4', F5', F2', which is consistent with the order of the corresponding data files F1, F2, F3, F4 and F5.
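The two-level ordering in the third example amounts to sorting on the pair (preset mark, second identifier). The sketch below reproduces the example's values; the variable names are hypothetical, and representing each index file as a tuple is an assumption made for illustration.

```python
SECOND_PRESET_MARK = 1  # carried by index files of first type files
THIRD_PRESET_MARK = 2   # carried by index files of second type files

# (name, preset mark, second identifier), following the example above.
index_files = [
    ("F1'", SECOND_PRESET_MARK, 1),
    ("F2'", THIRD_PRESET_MARK, 2),
    ("F3'", SECOND_PRESET_MARK, 3),
    ("F4'", SECOND_PRESET_MARK, 4),
    ("F5'", SECOND_PRESET_MARK, 5),
]

# Tuples compare element by element: mark first, then second identifier.
ordered = sorted(index_files, key=lambda f: (f[1], f[2]))
ordered_names = [name for name, _, _ in ordered]
```

Sorting on the pair places all index files with mark 1 first, in ascending identifier order, followed by F2' with mark 2.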

Step 108, sorting among the index files corresponding to the first type files according to the second identifiers of the index files corresponding to the first type files.

It will be appreciated that the index files of the first type files are one portion of the index files, and they may be arranged among themselves in descending or ascending order according to the second identifier.

Step 109, sorting among the index files corresponding to the second type files according to the second identifiers of the index files corresponding to the second type files.

It will be appreciated that the index files of the second type of file are another portion of the index files, and may be arranged in descending or ascending order according to the second identifier between the index files of the second type of file.

Optionally, the first type file and the second type file are stored in a non-relational database, the non-relational database stores the first type file and the second type file by partition through a log-structured merge tree, data with the same primary key exists in the first type file and/or the second type file, and data with the same primary key exists in the index file corresponding to the first type file and/or the index file corresponding to the second type file.

When the memory file reaches a preset capacity threshold, firstly, a memory file is newly built to receive the data, then the memory file reaching the preset capacity threshold is written into the disk, and a data file is formed in the disk. The first type of file may be a previously generated data file and the second type of file may be a later generated data file.
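The flush behavior described above can be sketched with a toy partition. This is only an illustration under stated assumptions: the class `TinyLsmPartition` is hypothetical, capacity is measured as a number of entries, and the "disk" is an in-memory list of data files.

```python
class TinyLsmPartition:
    """Toy model: a memory file that is flushed into immutable data files."""

    def __init__(self, capacity_threshold):
        self.capacity_threshold = capacity_threshold  # assumed: entry count
        self.memory_file = {}
        self.data_files = []  # oldest first; stands in for files on disk

    def put(self, key, value):
        self.memory_file[key] = value
        if len(self.memory_file) >= self.capacity_threshold:
            self._flush()

    def _flush(self):
        # First create a new memory file to receive data, then write the
        # full one out as a data file, sorted by primary key.
        full, self.memory_file = self.memory_file, {}
        self.data_files.append(dict(sorted(full.items())))


partition = TinyLsmPartition(capacity_threshold=2)
partition.put("k1", "v1")
partition.put("k2", "v2")      # threshold reached: first data file is written
partition.put("k1", "v1-new")  # newer value for k1 does not overwrite the old
partition.put("k3", "v3")      # threshold reached: second data file is written
partition.put("k4", "v4")      # stays in the memory file
```

After these writes, the key k1 appears in both data files with different values, which is the situation that makes the order of the files significant.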

In a typical application scenario of the present application, newly written data does not overwrite old data during writing, so data with the same primary key may exist in both the first type files and the second type files, or within the same data file. Accordingly, data with the same primary key may exist across the index files of the first type files and the index files of the second type files, or within a single index file. After a data access request is received, target data with the same primary key may be acquired from several index files or from the same index file, and it is then necessary to determine which target data is the latest, so that it can serve as the response data for the data access request.

To ensure that the response data acquired after a data access request is up to date, the application sorts among the index files of the first type files, sorts among the index files of the second type files, and sorts between the index files of the first type files and the index files of the second type files. The order of the index files is thereby determined accurately, so that when target data with the same primary key is read from different index files, the latest data can be determined as the response data according to the order of the index files.

In summary, the application provides an index construction method, which comprises the steps of distinguishing a first type file and a second type file, wherein the second type file comprises data obtained by updating at least one item of data in the first type file, constructing indexes for the first type file and the second type file in parallel, respectively generating corresponding index files, determining the sequence of the index files according to the first type file or the second type file corresponding to the index files, and determining the target index files from at least two index files comprising target data according to the sequence, wherein the target data in the target index files are response data. The application constructs the indexes for the first type files and the second type files in parallel, avoids constructing the indexes for the second type files after constructing the indexes for the first type files or constructing the indexes for the first type files after constructing the indexes for the second type files, and reduces the time delay of constructing the indexes for the first type files or the second type files.

Referring to fig. 4, a flowchart of the steps of an embodiment of a data access processing method of the present application is shown.

Step 201, a data access request is received.

The data access request is a request to retrieve response data. The response data is typically determined by conditions in the data access request, including, but not limited to, filtering conditions on the primary key or on one or more columns.

Step 202, determining response data corresponding to the data access request from at least two index files according to the order of the index files, wherein the at least two index files comprise index files corresponding to a first type of files and index files corresponding to a second type of files, the second type of files comprise data after updating at least one item of data in the first type of files, the index files are constructed in parallel, and the order of the index files is determined by the first type of files or the second type of files corresponding to the index files.

In practical application, the index files can be searched for the response data in order, from the latest index file back to the oldest. For example, for index files F1', F2', F3', F4', F5' arranged in ascending order of generation time, the search may start from the index file F5' and proceed toward the index file F1'; if target data with the same primary key exists in both the index file F1' and the index file F5', the target data in the index file F5' is taken as the response data.
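A minimal sketch of this newest-first lookup is given below. It assumes, for illustration only, that each index file is a dictionary keyed by primary key and that the list is already in its determined order, oldest first; the function `lookup` and the sample data are hypothetical.

```python
def lookup(primary_key, ordered_index_files):
    """Return the newest value for primary_key, or None if it is absent.

    ordered_index_files is a list of dicts, oldest first.
    """
    for index_file in reversed(ordered_index_files):
        if primary_key in index_file:
            return index_file[primary_key]
    return None


# F1' to F5' in ascending order; k1 appears in both F1' and F5'.
ordered_index_files = [
    {"k1": "old"},  # F1'
    {},             # F2'
    {"k2": "b"},    # F3'
    {},             # F4'
    {"k1": "new"},  # F5'
]

latest = lookup("k1", ordered_index_files)
```

The search stops at the first hit, so an older value in F1' is never returned when F5' holds a newer one.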

In summary, the application provides a data access processing method, which comprises the steps of receiving a data access request, and determining response data corresponding to the data access request from at least two index files according to the order of the index files, wherein the at least two index files comprise index files corresponding to a first type of file and index files corresponding to a second type of file, the second type of file comprises data after updating at least one item of data in the first type of file, the index files are constructed in parallel, and the order of the index files is determined by the first type of file or the second type of file corresponding to the index files. Because the indexes are constructed for the first type files and the second type files in parallel, neither type has to wait for index construction on the other type to finish. This reduces the delay in constructing indexes for either type, facilitates access to the index files of both types, and avoids a large access delay for the index files of one type.

Referring to FIG. 5, a flowchart of the steps of another index building method embodiment of the present application is shown.

Step 301, differentiating stock files from delta files.

In practical applications, after a request for constructing an index is received, it is first necessary to distinguish between the at least one data file that already exists and the at least one data file that is newly generated, and then to start constructing indexes in parallel for the existing data files and the newly generated data files. In the present application, the at least one data file that already exists may be referred to as a stock file, and the at least one data file that is newly generated may be referred to as a delta file.

Based on the data storage process in fig. 2, some or all of the data in the delta file is an updated version of some or all of the data in the stock file. For example, as shown in fig. 2, if the index construction request is received after the data file F4 is generated, the existing data files F1 to F4 can be used as stock files, and the data file F5 newly generated after the index construction request is received can be used as a delta file.

The application can construct the index for the stock files and the increment files with sequence in parallel, and ensures that the sequence of the index files generated after constructing the index is consistent with the sequence of the stock files and the increment files.

To achieve the above order consistency, the present application needs to distinguish, before constructing the index, which data files are stock files and which are delta files. In one example, the identifier of each data file that already exists may be recorded, so that the recorded data files are stock files and the unrecorded data files are delta files. In another example, the data files that already exist may be migrated to a preset storage space, so that data files in this preset storage space are stock files and data files not in it are delta files. In yet another example, the data files that already exist may be marked, so that a data file with this mark is a stock file and a data file without it is a delta file.

Step 302, constructing indexes for the stock file and the delta file in parallel, and respectively generating corresponding index files.

The index is a structure for ordering values of one or more columns in a data table in the database, so that the query efficiency can be greatly improved. For the stock file, the generated index file contains all or part of columns in the stock file, and for the increment file, the generated index file contains all or part of columns in the increment file. The column of the index file may be specified in the index build request.

It will be appreciated that in the present application, since indexes are constructed for the stock files and the delta files in parallel, the generated index files of the stock files and of the delta files may be interleaved in generation order. For example, as shown in fig. 2, the data files F1 to F4 are stock files and the data file F5 is a delta file, and the index files generated by the index construction method of the present application may appear in the order: the index file of the data file F1, the index file of the data file F5, the index file of the data file F2, the index file of the data file F3, and the index file of the data file F4.
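The parallel construction can be sketched with two workers, one per group of data files. This is an illustrative sketch only: the thread pool, the toy `build_index` function (which simply sorts primary keys), and the naming of index files as `index(Fn)` are all assumptions and not the application's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def build_index(data_file):
    """Toy index construction: sort the primary keys of one data file."""
    name, keys = data_file
    return "index(" + name + ")", sorted(keys)


def build_group(data_files):
    # Files inside one group are indexed in their own order.
    return [build_index(f) for f in data_files]


def build_in_parallel(stock_files, delta_files):
    # One worker per group, so neither group waits for the other to finish.
    with ThreadPoolExecutor(max_workers=2) as pool:
        stock_future = pool.submit(build_group, stock_files)
        delta_future = pool.submit(build_group, delta_files)
        return stock_future.result(), delta_future.result()


stock_files = [("F1", ["b", "a"]), ("F2", ["c"]), ("F3", ["a"]), ("F4", ["d"])]
delta_files = [("F5", ["a", "e"])]

stock_indexes, delta_indexes = build_in_parallel(stock_files, delta_files)
# Final order: stock index files first, then delta index files.
ordered_names = [name for name, _ in stock_indexes + delta_indexes]
```

The wall-clock order in which individual index files complete may interleave across the two groups, but the final order is restored by keeping the groups separate and placing one before the other.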

In the prior art, instead of constructing indexes for the stock file and the delta file in parallel, the indexes generated by the index constructing method in the prior art may be the indexes of the data file F1, the data file F2, the data file F3, the data file F4 and the data file F5.

Step 303, determining the sequence of the index files according to the stock files or the increment files corresponding to the index files, wherein the sequence is used for determining a target index file from at least two index files comprising target data, and the target data in the target index file comprises response data.

In practical application, within the stock files, an index is built for each stock file according to the sequence of the stock files, so that the sequence of the index files of the stock files is consistent with the sequence of the stock files. Likewise, within the increment files, an index is built for each increment file according to the sequence of the increment files, so that the sequence of the index files of the increment files is consistent with the sequence of the increment files. Between the stock files and the increment files, the index files of the stock files are ordered either before or after the index files of the increment files.

It will be appreciated that when the stock files and delta files are arranged in ascending order of time, the index files of the stock files may be ordered before the index files of the delta files, so that the index files of earlier-generated data files precede the index files of later-generated data files. Conversely, when the stock files and delta files are arranged in descending order of time, the index files of the stock files may be ordered after the index files of the delta files, so that the index files of later-generated data files precede the index files of earlier-generated data files.
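The ordering rule can be sketched as a small function. This is an illustrative sketch only: the function `order_index_files`, the `ascending` flag, and the (creation time, name) pairs are assumptions, and the descending case is read as placing delta index files before stock index files.

```python
def order_index_files(stock_indexes, delta_indexes, ascending=True):
    """Order index files so their sequence matches that of the data files.

    Both inputs are lists of (creation_time, name) pairs for the underlying
    data files; the pairs and the `ascending` flag are illustrative.
    """
    stock = sorted(stock_indexes, reverse=not ascending)
    delta = sorted(delta_indexes, reverse=not ascending)
    # Ascending time: stock (older) first. Descending time: delta (newer) first.
    ordered = stock + delta if ascending else delta + stock
    return [name for _, name in ordered]


stock = [(1, "idx(F1)"), (2, "idx(F2)"), (3, "idx(F3)"), (4, "idx(F4)")]
delta = [(5, "idx(F5)")]
ascending_order = order_index_files(stock, delta, ascending=True)
descending_order = order_index_files(stock, delta, ascending=False)
```

Each group is sorted on its own and the two groups are then concatenated, so the result does not depend on the order in which the parallel builds finished.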

In summary, the application provides an index construction method, which comprises the steps of distinguishing stock files from increment files, constructing indexes for the stock files and the increment files in parallel, respectively generating corresponding index files, and determining the sequence of the index files according to the stock files or the increment files corresponding to the index files, wherein the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data. The application constructs the index for the stock file and the increment file in parallel, avoids constructing the index for the increment file after constructing the index for the stock file or constructing the index for the stock file after constructing the index for the increment file, and reduces the time delay of constructing the index for the stock file or the increment file.

Referring to FIG. 6, a flowchart of the steps of another index building method embodiment of the present application is shown.

In step 401, snapshot processing is performed on the stock file to obtain second snapshot information, and third snapshot information of the initial state is generated.

The second snapshot information includes the first identifier of the stock file. The third snapshot information in the initial state depends on whether an index file of a stock file exists in the initial state. If index files of some of the stock files exist in the initial state, the third snapshot information includes the second identifiers of those existing index files. If no index file of any stock file exists in the initial state, the third snapshot information does not include the second identifier of any index file, which can be understood as the third snapshot information being empty.

Step 402, constructing indexes for the stock file and the increment file in parallel, and respectively generating corresponding index files.

This step may be described in detail with reference to step 302, and will not be described in detail herein.

Step 403, recording an index file corresponding to the stock file into the third snapshot information according to the second snapshot information.

Specifically, the first identifier of a stock file is first determined from the second snapshot information, the corresponding index file is then determined according to the first identifier, and finally the second identifier of that index file is added to the third snapshot information.

Step 404, determining an order of the index files according to the third snapshot information, where the order is used to determine a target index file from at least two index files including target data, where the target data in the target index file includes response data.

Specifically, the index files corresponding to the second identifier in the third snapshot information may be ordered before or after the rest of the index files. After determining the order, the second snapshot information and the third snapshot information may be deleted.

It will be appreciated that step 404 may refer to the detailed description of step 303, and will not be described herein.
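Steps 403 and 404 can be sketched together as follows. The sketch assumes that the snapshot information is held as Python sets and that a mapping from data file identifier to index file identifier is available; the names `record_stock_indexes`, `order_by_snapshot`, and `index_of`, and all identifier values, are hypothetical.

```python
def record_stock_indexes(second_snapshot, index_of, third_snapshot):
    """Step 403: add the index-file identifier of each snapshotted stock file."""
    for first_id in second_snapshot:
        if first_id in index_of:  # the index for this stock file has been built
            third_snapshot.add(index_of[first_id])
    return third_snapshot


def order_by_snapshot(all_index_ids, third_snapshot, stock_first=True):
    """Step 404: place recorded (stock) index files before or after the rest."""
    ids = list(all_index_ids)
    recorded = sorted(i for i in ids if i in third_snapshot)
    others = sorted(i for i in ids if i not in third_snapshot)
    return recorded + others if stock_first else others + recorded


# Made-up identifiers: data files 1 to 4 are stock files, data file 5 is a delta file.
second_snapshot = {1, 2, 3, 4}              # first identifiers of the stock files
third_snapshot = set()                      # initial state: empty
index_of = {1: 1, 5: 2, 2: 3, 3: 4, 4: 5}   # data file id -> index file id
record_stock_indexes(second_snapshot, index_of, third_snapshot)
order = order_by_snapshot(index_of.values(), third_snapshot)
```

Only index files whose source file appears in the second snapshot information are recorded, so the index file of the delta file stays outside the third snapshot information and is ordered as a separate group.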

The stock file and the incremental file of the present application may be collectively referred to as a master file, the master file being a file in which data in a master table is divided when stored, and the index file being a file in which data in an index table is divided when stored. The flow of the index construction process of the master file may be as shown in fig. 7:

First, snapshot processing is carried out on the four existing stock files F1, F2, F3 and F4 to obtain the second snapshot information, where 1, 2, 3 and 4 are the first identifiers of the four stock files. The first identifiers represent the time sequence of the four stock files: the smaller the first identifier, the earlier the time. In addition, third snapshot information is generated, which does not yet contain any second identifier.

Then, after the snapshot is created, new data needs to be stored; it is stored in a new file F5, called a delta file. Indexes are therefore constructed in parallel for the stock files F1, F2, F3 and F4 and the delta file F5, generating index files F1', F2', F3', F4' and F5', where F1' is the index file of F1, F2' is the index file of F5, F3' is the index file of F2, F4' is the index file of F3, and F5' is the index file of F4. Here 1, 2, 3, 4 and 5 are the second identifiers of the index files; similarly, the second identifiers represent the order of the five index files in time, and the smaller the second identifier, the earlier the time. The second identifiers 1, 3, 4 and 5 of the index files corresponding to the stock files are added to the third snapshot information.

Then, the order of the index files is determined according to the third snapshot information. A second preset mark is added to the index files whose second identifiers are in the third snapshot information, and a third preset mark is added to the remaining index files, so that the index files can be sorted by combining the second identifier with the second or third preset mark. For example, if the second preset mark is 1 and the third preset mark is 2, the index files with second identifiers 1, 3, 4 and 5 receive the second preset mark 1, and the index file with second identifier 2 receives the third preset mark 2. If the preset mark and the second identifier together form the number of an index file, the index files F1', F2', F3', F4' and F5' are renumbered as F'(1, 1), F'(2, 2), F'(1, 3), F'(1, 4) and F'(1, 5). The index files may be sorted first by preset mark and then by second identifier, giving the order F'(1, 1), F'(1, 3), F'(1, 4), F'(1, 5) and F'(2, 2).

Finally, the snapshot information may be deleted after determining the order, including the second snapshot information and the third snapshot information.
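The numbering scheme in this flow amounts to sorting on a two-part key. The following sketch reproduces it with tuples, assuming the second preset mark is 1 and the third preset mark is 2 as in the example; the variable names are hypothetical.

```python
# Index files from the example: second identifier -> source data file.
index_source = {1: "F1", 2: "F5", 3: "F2", 4: "F3", 5: "F4"}
third_snapshot = {1, 3, 4, 5}  # second identifiers of the stock index files

STOCK_MARK, DELTA_MARK = 1, 2  # second and third preset marks

numbered = [
    (STOCK_MARK if second_id in third_snapshot else DELTA_MARK, second_id)
    for second_id in index_source
]
# Tuples compare element by element: preset mark first, then second identifier.
ordered = sorted(numbered)
ordered_sources = [index_source[second_id] for _, second_id in ordered]
```

With these values the index file of the delta file F5 sorts last, so the order of the index files follows the order of the data files F1 to F5 even though the index of F5 was generated second.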

In summary, the application provides an index construction method, which comprises the steps of carrying out snapshot processing on a stock file to obtain second snapshot information and generating third snapshot information of an initial state, constructing indexes for the stock file and an increment file in parallel to respectively generate corresponding index files, recording the index files corresponding to the stock file into the third snapshot information according to the second snapshot information, and determining the sequence of the index files according to the third snapshot information, wherein the sequence is used for determining a target index file from at least two index files comprising target data, and the target data in the target index file comprises response data. The application constructs the indexes for the stock file and the increment file in parallel, avoids constructing the index for the increment file only after constructing the index for the stock file, or constructing the index for the stock file only after constructing the index for the increment file, and reduces the time delay of constructing the index for the stock file or the increment file.

Referring to fig. 8, a block diagram of an embodiment of an index building apparatus of the present application is shown.

The first file distinguishing module 501 is configured to distinguish a first type file from a second type file, where the second type file includes data obtained by updating at least one item of data in the first type file.

The first index constructing file module 502 is configured to construct indexes for the first type file and the second type file in parallel, and generate corresponding index files respectively.

A first index file sequence determining module 503, configured to determine a sequence of the index files according to a first type file or a second type file corresponding to the index files, where the sequence is used to determine a target index file from at least two index files including target data, where the target data in the target index file includes response data.

Optionally, the first file differentiating module 501 includes a first identification record submodule:

And the first identification recording sub-module is used for recording the first identification of the first type of files so as to take the unrecorded files as the second type of files.

Optionally, the first identification record sub-module includes a first snapshot unit:

The first snapshot unit is used for carrying out snapshot processing on the first type of files to obtain first snapshot information, and the first snapshot information comprises a first identifier of the first type of files.

Optionally, the first file differentiating module 501 includes a first migration submodule:

The first migration sub-module is used for migrating the first type files from the first storage area to the second storage area, so that files in the second storage area are treated as first type files and files in the first storage area are treated as second type files.

Optionally, the first file differentiating module 501 further includes a second migration submodule:

The second migration sub-module is used for migrating the first type files from the second storage area back to the first storage area, where the second type files are located, before or after the construction of the indexes for the first type files is finished.

Optionally, the first file differentiating module 501 includes a first marking sub-module:

And the first marking sub-module is used for adding a first preset mark to the first type of files so as to take the files without the first preset mark as second type of files.

Optionally, the first index file order determining module 503 includes a second identification record sub-module and a first order determining sub-module:

The second identification recording sub-module is used for recording the second identifier of the index file corresponding to the first type of file.

A first order determining sub-module, configured to determine that the recorded index file of the first type of file precedes or follows the unrecorded index file.

Optionally, the second identification record sub-module includes a second snapshot unit:

And the second snapshot unit is used for carrying out snapshot processing on the index files corresponding to the first type files to obtain second snapshot information, wherein the second snapshot information comprises second identifiers of the index files corresponding to the first type files.

Optionally, the first index file order determining module 503 includes a first storage sub-module, a second storage sub-module, and a second order determining sub-module:

And the first storage sub-module is used for storing the index file corresponding to the first type file into a third storage area.

And the second storage sub-module is used for storing the index file corresponding to the second type file into a fourth storage area.

And the second order determining submodule is used for migrating the index files in the third storage area to the front or rear of the index files in the fourth storage area.

Optionally, the first index file order determining module 503 includes a second marking sub-module, a third marking sub-module, and a third order determining sub-module:

And the second marking sub-module is used for adding a second preset mark to the index file corresponding to the first type of file.

And the third marking sub-module is used for adding a third preset mark to the index file corresponding to the second type file.

A third order determining sub-module, configured to determine whether the index file marked with the second preset mark is before or after the index file marked with the third preset mark.

Optionally, the second preset mark and the third preset mark are both numerical values, the second preset mark is smaller than the third preset mark, and the third order determining submodule includes a third order determining unit:

And a third order determining unit, configured to determine, according to the sizes of the second preset mark and the third preset mark, whether the index file marked with the second preset mark is before or after the index file marked with the third preset mark.

Optionally, the apparatus further includes an index build request receiving module and a first type file determining module:

And the index construction request receiving module is used for receiving the index construction request.

And the first type file determining module is used for responding to the index construction request and determining the currently existing data file as a first type file, and the second type file is a data file generated after the index construction request is received.

Optionally, the device further includes a second identifier generating module, a first type index file sorting module, and a second type index file sorting module:

The second identifier generating module is used for generating a second identifier for the index file according to the generation time of the index file.

And the first type index file ordering module is used for ordering the index files corresponding to the first type files according to the second identifiers of the index files corresponding to the first type files.

And the second type index file ordering module is used for ordering the index files corresponding to the second type files according to the second identifiers of the index files corresponding to the second type files.
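The three modules above can be illustrated with a brief sketch. It assumes that index files are represented as dictionaries carrying a type and a generation timestamp; the field names `type`, `generated_at`, and `second_id`, the function names, and the sample values are hypothetical.

```python
from itertools import count


def assign_second_ids(index_files):
    """Give each index file a second identifier in order of generation time."""
    ids = count(1)
    by_time = sorted(index_files, key=lambda f: f["generated_at"])
    return [dict(f, second_id=next(ids)) for f in by_time]


def sort_within_type(index_files, file_type):
    """Order the index files of one type by their second identifiers."""
    chosen = [f for f in index_files if f["type"] == file_type]
    return [f["name"] for f in sorted(chosen, key=lambda f: f["second_id"])]


# Hypothetical index files with made-up generation timestamps.
files = [
    {"name": "idx_b", "type": "first", "generated_at": 12.0},
    {"name": "idx_a", "type": "first", "generated_at": 10.0},
    {"name": "idx_c", "type": "second", "generated_at": 11.0},
]
tagged = assign_second_ids(files)
first_type_order = sort_within_type(tagged, "first")
second_type_order = sort_within_type(tagged, "second")
```

The second identifiers are shared across both types, but each type is sorted separately, so an index file of the second type generated between two index files of the first type does not disturb the order within the first type.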

In summary, the application provides an index construction device, which comprises a first file distinguishing module, a first index construction file module and a first index file sequence determining module, wherein the first file distinguishing module is used for distinguishing a first type file and a second type file, the second type file comprises data obtained by updating at least one item of data in the first type file, the first index construction file module is used for parallelly constructing indexes for the first type file and the second type file and respectively generating corresponding index files, the first index file sequence determining module is used for determining the sequence of the index files according to the first type file or the second type file corresponding to the index files, the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprises response data. The application constructs the indexes for the first type files and the second type files in parallel, avoids constructing the indexes for the second type files after constructing the indexes for the first type files or constructing the indexes for the first type files after constructing the indexes for the second type files, and reduces the time delay of constructing the indexes for the first type files or the second type files.

Referring to fig. 9, there is shown a block diagram of an embodiment of a data access processing apparatus of the present application.

The data access request receiving module 601 is configured to receive a data access request.

The response data determining module 602 is configured to determine response data corresponding to the data access request from at least two index files according to an order of the index files, where the at least two index files include an index file corresponding to a first type file and an index file corresponding to a second type file, the second type file includes data obtained by updating at least one item of data in the first type file, the index files are constructed in parallel, and the order of the index files is determined by the first type file or the second type file corresponding to the index file.
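How the order of the index files may be used to select the target index file can be sketched as follows. The sketch assumes that each index file is a dictionary from key to value and that, when several index files contain the requested key, the one holding the most recent data is taken as the target; the function `lookup` and its `newest_last` flag are hypothetical.

```python
def lookup(key, ordered_index_files, newest_last=True):
    """Return the value for `key` from the target index file.

    `ordered_index_files` is a list of dicts already placed in the determined
    order. When several index files contain the key, the one holding the most
    recent data is taken as the target (an assumption of this sketch).
    """
    candidates = [idx for idx in ordered_index_files if key in idx]
    if not candidates:
        return None
    target = candidates[-1] if newest_last else candidates[0]
    return target[key]


# Hypothetical index files: the index of the first type file holds the old
# value of "a", the index of the second type file holds its updated value.
stock_index = {"a": "old", "b": "kept"}
delta_index = {"a": "new"}
ordered = [stock_index, delta_index]  # ascending time: first type before second type
```

For example, `lookup("a", ordered)` selects the updated value from the second type file's index, while `lookup("b", ordered)` falls back to the only index file that contains the key.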

In summary, the application provides a data access processing device, which comprises a data access request receiving module for receiving a data access request, and a response data determining module for determining response data corresponding to the data access request from at least two index files according to the sequence of the index files, wherein the at least two index files comprise an index file corresponding to a first type file and an index file corresponding to a second type file, the second type file comprises data obtained by updating at least one item of data in the first type file, the index files are constructed in parallel, and the sequence of the index files is determined by the first type file or the second type file corresponding to the index files. Because the indexes of the first type files and the second type files are constructed in parallel, the application avoids constructing the indexes of the second type files only after those of the first type files, or constructing the indexes of the first type files only after those of the second type files. This reduces the time delay of constructing the indexes of the first type files or the second type files, facilitates access to the index files of both types of files, and avoids a large access delay for the index files of either type.

Referring to FIG. 10, a block diagram of another index building apparatus embodiment of the present application is shown.

A second file differentiating module 701, configured to distinguish a stock file from an increment file.

And a second index construction module 702, configured to construct indexes for the stock file and the incremental file in parallel, and generate corresponding index files respectively.

A second index file order determining module 703, configured to determine an order of the index files according to stock files or delta files corresponding to the index files, where the order is used to determine a target index file from at least two index files including target data, where the target data in the target index file includes response data.

In summary, the application provides an index construction device, which comprises a second file distinguishing module, a second index construction module and a second index file sequence determining module, wherein the second file distinguishing module is used for distinguishing stock files and increment files, the second index construction module is used for parallelly constructing indexes for the stock files and the increment files and respectively generating corresponding index files, the second index file sequence determining module is used for determining the sequence of the index files according to the stock files or the increment files corresponding to the index files, the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data. The application constructs the index for the stock file and the increment file in parallel, avoids constructing the index for the increment file after constructing the index for the stock file or constructing the index for the stock file after constructing the index for the increment file, and reduces the time delay of constructing the index for the stock file or the increment file.

Referring to FIG. 11, a block diagram of another index building apparatus embodiment of the present application is shown.

The snapshot information generating module 801 is configured to perform snapshot processing on the stock file to obtain second snapshot information, and generate third snapshot information in an initial state.

And a third index construction module 802, configured to construct indexes for the stock file and the incremental file in parallel, and generate corresponding index files respectively.

And an inventory index recording module 803, configured to record an index file corresponding to the stock file into the third snapshot information according to the second snapshot information.

A third index file order determining module 804, configured to determine an order of the index files according to the third snapshot information, where the order is used to determine a target index file from at least two index files including target data, where the target data in the target index file includes response data.

In summary, the application provides an index construction device, which comprises a snapshot information generation module, a third index construction module, an inventory index recording module and a third index file sequence determination module. The snapshot information generation module is used for carrying out snapshot processing on a stock file to obtain second snapshot information and generating third snapshot information of an initial state. The third index construction module is used for constructing indexes for the stock file and an increment file in parallel and respectively generating corresponding index files. The inventory index recording module is used for recording the index files corresponding to the stock file into the third snapshot information according to the second snapshot information. The third index file sequence determination module is used for determining the sequence of the index files according to the third snapshot information, wherein the sequence is used for determining a target index file from at least two index files comprising target data, and the target data in the target index file comprise response data. The application constructs the indexes for the stock file and the increment file in parallel, avoids constructing the index for the increment file only after constructing the index for the stock file, or constructing the index for the stock file only after constructing the index for the increment file, and reduces the time delay of constructing the index for the stock file or the increment file.

The embodiment of the application also provides a non-volatile readable storage medium in which one or more modules (programs) are stored. When the one or more modules are applied to a device, they may cause the device to execute the instructions of each method step in the embodiment of the application.

Embodiments of the application provide one or more machine-readable media having instructions stored thereon that, when executed by one or more processors, cause an electronic device to perform a method as described in one or more of the above embodiments. In the embodiment of the application, the electronic equipment comprises various types of equipment such as terminal equipment, a server (cluster) and the like.

Embodiments of the present disclosure may be implemented as an apparatus for performing a desired configuration using any suitable hardware, firmware, software, or any combination thereof, which may include electronic devices such as terminal devices, servers (clusters), etc. Fig. 12 schematically illustrates an example apparatus 900 that may be used to implement various embodiments described in the present disclosure.

For one embodiment, fig. 12 illustrates an example apparatus 900 having one or more processors 902, a control module (chipset) 904 coupled to at least one of the processor(s) 902, a memory 906 coupled to the control module 904, a non-volatile memory (NVM)/storage 908 coupled to the control module 904, one or more input/output devices 910 coupled to the control module 904, and a network interface 912 coupled to the control module 904.

The processor 902 may include one or more single-core or multi-core processors, and the processor 902 may include any combination of general-purpose or special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In some embodiments, the apparatus 900 may be used as a terminal device, a server (a cluster), or the like in the embodiments of the present application.

In some embodiments, apparatus 900 can include one or more computer-readable media (e.g., memory 906 or NVM/storage 908) with instructions 914 and one or more processors 902 combined with the one or more computer-readable media configured to execute instructions 914 to implement modules to perform the actions described in this disclosure.

For one embodiment, the control module 904 may include any suitable interface controller to provide any suitable interface to at least one of the processor(s) 902 and/or any suitable device or component in communication with the control module 904.

The control module 904 may include a memory controller module to provide an interface to the memory 906. The memory controller modules may be hardware modules, software modules, and/or firmware modules.

Memory 906 may be used to load and store data and/or instructions 914 for device 900, for example. For one embodiment, memory 906 may include any suitable volatile memory, such as, for example, a suitable DRAM. In some embodiments, the memory 906 may comprise a double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).

For one embodiment, the control module 904 can include one or more input/output controllers to provide an interface to the NVM/storage 908 and the input/output device(s) 910.

For example, NVM/storage 908 may be used to store data and/or instructions 914. NVM/storage 908 may include any suitable nonvolatile memory (e.g., flash memory) and/or may include any suitable nonvolatile storage device(s) (e.g., one or more Hard Disk Drives (HDDs), one or more Compact Disc (CD) drives, and/or one or more Digital Versatile Disc (DVD) drives).

NVM/storage 908 may include storage resources that are physically part of the device on which apparatus 900 is installed, or which may be accessible by the device without necessarily being part of the device. For example, NVM/storage 908 may be accessed over a network via input/output device(s) 910.

Input/output device(s) 910 may provide an interface for apparatus 900 to communicate with any other suitable device. Input/output device(s) 910 may include a communication component, an audio component, a sensor component, and the like. Network interface 912 may provide an interface for device 900 to communicate over one or more networks. Device 900 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols, for example by accessing a wireless network based on a communication standard such as WiFi, 2G, 3G, 4G, 5G, or a combination thereof.

For one embodiment, at least one of the processor(s) 902 may be packaged together with logic of one or more controllers (e.g., memory controller modules) of the control module 904. For one embodiment, at least one of the processor(s) 902 may be packaged together with logic of one or more controllers of the control module 904 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 902 may be integrated on the same die as logic of one or more controllers of the control module 904. For one embodiment, at least one of the processor(s) 902 may be integrated on the same die with logic of one or more controllers of the control module 904 to form a system on chip (SoC).

In various embodiments, apparatus 900 may be, but is not limited to being, a terminal device such as a server, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.). In various embodiments, device 900 may have more or fewer components and/or different architectures. For example, in some embodiments, apparatus 900 includes one or more cameras, keyboards, Liquid Crystal Display (LCD) screens (including touch screen displays), non-volatile memory ports, multiple antennas, graphics chips, Application Specific Integrated Circuits (ASICs), and speakers.

The device 900 may use a main control chip as a processor or a control module, the sensor data, the position information, and the like are stored in a memory or an NVM/storage device, the sensor group may be used as an input/output device, and the communication interface may include a network interface.

For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference is made to the description of the method embodiments for relevant points.

In this specification, the embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another.

Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal device, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.

Finally, it is further noted that relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or terminal device comprising the element.

The foregoing describes in detail the method, apparatus, device and medium for index construction and data access processing provided by the present application. Specific examples are used herein to illustrate the principles and embodiments of the present application, and the above examples are provided only to assist in understanding the method and core ideas of the present application. Meanwhile, those skilled in the art may, according to the ideas of the present application, make changes to the specific embodiments and the scope of application. Accordingly, the content of this specification should not be construed as limiting the present application.

Claims

1. An index building method, the method comprising:

Distinguishing a first type file and a second type file, wherein the second type file comprises data after updating at least one item of data in the first type file;

Constructing indexes for the first type files and the second type files in parallel, and respectively generating corresponding index files;

Determining the sequence of the index files according to the first type files or the second type files corresponding to the index files, wherein the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data;

The sequence among the index files of the first type files is consistent with the sequence of the first type files, and the sequence among the index files of the second type files is consistent with the sequence of the second type files.
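As a non-limiting illustration, the steps of claim 1 may be sketched as follows. The sketch is not the claimed implementation: `build_index`, the tuple layout of a data file, the use of a thread pool, and the choice to place first type index files before second type index files are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def build_index(data_file):
    """Hypothetical index builder: maps each key in a data file to its value."""
    name, rows = data_file
    return {"source": name, "entries": dict(rows)}


def build_indexes_in_parallel(first_type, second_type):
    # Submit both classes of files at once, so neither class waits for the other.
    with ThreadPoolExecutor() as pool:
        first_futures = [pool.submit(build_index, f) for f in first_type]
        second_futures = [pool.submit(build_index, f) for f in second_type]
        first_idx = [fut.result() for fut in first_futures]
        second_idx = [fut.result() for fut in second_futures]
    # The order is derived from the class of the source file, not from the time
    # at which each build finished. Assumption: first type comes first. Within
    # each class, index files keep the order of their source files.
    return first_idx + second_idx


first_type = [("f1", [("k1", "v1")]), ("f2", [("k2", "v2")])]
second_type = [("f3", [("k1", "v1-updated")])]
ordered = build_indexes_in_parallel(first_type, second_type)
print([idx["source"] for idx in ordered])
```

Because the results are collected per class in submission order, the resulting sequence does not depend on which build happens to finish first.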

2. The method of claim 1, wherein distinguishing between the first type of file and the second type of file comprises:

The first identification of the first type of files is recorded, and the unrecorded files are used as the second type of files.

3. The method of claim 2, wherein the recording the first identification of the first type of file comprises:

and carrying out snapshot processing on the first type file to obtain first snapshot information, wherein the first snapshot information comprises a first identifier of the first type file.
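As a non-limiting illustration of claims 2 and 3, the first snapshot information may be modeled as an immutable set of file identifiers. The names `take_snapshot` and `classify`, and the use of strings as identifiers, are assumptions made for this sketch.

```python
def take_snapshot(existing_file_ids):
    # An immutable set: the snapshot must not change as new files arrive.
    return frozenset(existing_file_ids)


def classify(file_id, snapshot):
    # Recorded files are first type files; unrecorded files are second type files.
    return "first" if file_id in snapshot else "second"


snapshot = take_snapshot(["f1", "f2"])
all_files = ["f1", "f2", "f3"]  # "f3" is written after the snapshot is taken
classes = {f: classify(f, snapshot) for f in all_files}
print(classes)
```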

4. The method of claim 1, wherein distinguishing between the first type of file and the second type of file comprises:

And migrating the first type of files from the first storage area to a second storage area, wherein the files in the second storage area are the first type of files.
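As a non-limiting illustration of claim 4, the two storage areas may be modeled as two directories. The directory names and the use of a temporary directory are assumptions made for this sketch; an actual storage engine may implement storage areas differently.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    first_area = Path(root, "first_area")
    second_area = Path(root, "second_area")
    first_area.mkdir()
    second_area.mkdir()

    # Files that already exist when index construction starts.
    for name in ("f1", "f2"):
        (first_area / name).write_text("data")

    # Migrate every existing file: whatever is in the second area is first type.
    for path in sorted(first_area.iterdir()):
        path.rename(second_area / path.name)

    # A file written afterwards lands in the first area and is second type.
    (first_area / "f3").write_text("data")

    first_type = sorted(p.name for p in second_area.iterdir())
    second_type = sorted(p.name for p in first_area.iterdir())

print(first_type, second_type)
```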

5. The method according to claim 4, wherein the method further comprises:

after the construction of the index for the first type file is finished, migrating the first type file from the second storage area to before or after the second type file in the first storage area.

6. The method of claim 1, wherein distinguishing between the first type of file and the second type of file comprises:

and adding a first preset mark to the first type of files to take the files without the first preset mark as second type of files.

7. The method according to any one of claims 1 to 6, wherein determining the order of the index files according to the first type of files or the second type of files corresponding to the index files includes:

recording a second identifier of the index file corresponding to the first type of file;

And determining that the recorded index files of the first type of files are before or after the unrecorded index files.

8. The method of claim 7, wherein the recording the second identification of the index file corresponding to the first type of file comprises:

and carrying out snapshot processing on the index files corresponding to the first type of files to obtain second snapshot information, wherein the second snapshot information comprises a second identifier of the index files corresponding to the first type of files.

9. The method according to any one of claims 1 to 6, wherein determining the order of the index files according to the first type of files or the second type of files corresponding to the index files includes:

storing the index file corresponding to the first type file into a third storage area;

Storing the index file corresponding to the second type file into a fourth storage area;

and migrating the index file in the third storage area to the front or rear of the index file in the fourth storage area.

10. The method according to any one of claims 1 to 6, wherein determining the order of the index files according to the first type of files or the second type of files corresponding to the index files includes:

Adding a second preset mark to the index file corresponding to the first type of file;

Adding a third preset mark to the index file corresponding to the second type file;

Determining whether the index file marked with the second preset mark is before or after the index file marked with the third preset mark.

11. The method of claim 10, wherein the second preset mark and the third preset mark are both numerical values, the second preset mark is smaller than the third preset mark, and wherein determining the index file marked with the second preset mark before or after the index file marked with the third preset mark comprises:

And determining whether the index file marked with the second preset mark is before or after the index file marked with the third preset mark according to the sizes of the second preset mark and the third preset mark.
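As a non-limiting illustration of claims 10 and 11, numeric marks may be used directly as a sort key. The constants `FIRST_TYPE_MARK` and `SECOND_TYPE_MARK`, their values, and the dictionary layout of an index file are assumptions made for this sketch.

```python
FIRST_TYPE_MARK = 0   # hypothetical value for the second preset mark
SECOND_TYPE_MARK = 1  # hypothetical value for the third preset mark (larger)

index_files = [
    {"name": "idx_c", "mark": SECOND_TYPE_MARK},
    {"name": "idx_a", "mark": FIRST_TYPE_MARK},
    {"name": "idx_b", "mark": FIRST_TYPE_MARK},
]

# Python's sort is stable, so index files with the same mark keep their
# relative order; only the comparison of the marks decides the class order.
ordered = sorted(index_files, key=lambda f: f["mark"])
print([f["name"] for f in ordered])
```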

12. The method according to any one of claims 1 to 6, further comprising, before distinguishing the first type of file from the second type of file:

receiving an index construction request;

and responding to the index construction request, determining the current existing data file as a first type file, wherein the second type file is a data file generated after the index construction request is received.

13. The method according to any one of claims 1 to 6, further comprising, after constructing indexes for the first type of file and the second type of file in parallel and respectively generating corresponding index files:

Generating a second identifier for the index file according to the generation time of the index file;

sorting among the index files corresponding to the first type files according to the second identifiers of the index files corresponding to the first type files;

And sorting among the index files corresponding to the second class files according to the second identifiers of the index files corresponding to the second class files.
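As a non-limiting illustration of claim 13, identifiers may be assigned in order of generation time and then used to sort index files within each class. The field names `generated_at`, `source_class`, and `id` are assumptions made for this sketch.

```python
def assign_identifiers(index_files):
    """Give each index file an integer identifier that grows with generation time."""
    by_time = sorted(index_files, key=lambda f: f["generated_at"])
    for identifier, index_file in enumerate(by_time, start=1):
        index_file["id"] = identifier


def sort_within_classes(index_files):
    # Each class is sorted separately by identifier; the classes are not mixed.
    first = sorted(
        (f for f in index_files if f["source_class"] == "first"),
        key=lambda f: f["id"],
    )
    second = sorted(
        (f for f in index_files if f["source_class"] == "second"),
        key=lambda f: f["id"],
    )
    return first, second


index_files = [
    {"name": "idx_b", "source_class": "first", "generated_at": 30.0},
    {"name": "idx_c", "source_class": "second", "generated_at": 20.0},
    {"name": "idx_a", "source_class": "first", "generated_at": 10.0},
]
assign_identifiers(index_files)
first, second = sort_within_classes(index_files)
print([f["name"] for f in first], [f["name"] for f in second])
```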

14. The method according to any one of claims 1 to 6, wherein the first type of file and the second type of file are stored in a non-relational database, the non-relational database stores the first type of file and the second type of file in a partition mode through a log-structured merging tree, data of the same main key exists in the first type of file and/or the second type of file, and data of the same main key exists in an index file corresponding to the first type of file and/or an index file corresponding to the second type of file.

15. A method of data access processing, the method comprising:

Receiving a data access request;

Determining, according to the sequence of index files, response data corresponding to the data access request from at least two index files, wherein the at least two index files comprise index files corresponding to a first type of files and index files corresponding to a second type of files, the second type of files comprise data obtained by updating at least one item of data in the first type of files, the index files are constructed in parallel, the sequence of the index files is determined by the first type of files or the second type of files corresponding to the index files, the sequence among the index files of the first type of files is consistent with the sequence of the first type of files, and the sequence among the index files of the second type of files is consistent with the sequence of the second type of files.
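As a non-limiting illustration of claim 15, a lookup may walk the ordered index files and return the first match. The sketch assumes that an index file later in the sequence holds more recent data, so the walk starts from the end; the function name `lookup` and the dictionary layout are likewise assumptions.

```python
def lookup(key, ordered_index_files):
    # Assumption: later position in the sequence means more recent data,
    # so the most recent index file containing the key supplies the response.
    for index_file in reversed(ordered_index_files):
        if key in index_file["entries"]:
            return index_file["entries"][key]
    return None


ordered_index_files = [
    {"source_class": "first", "entries": {"k1": "v1", "k2": "v2"}},
    {"source_class": "second", "entries": {"k1": "v1-updated"}},
]
print(lookup("k1", ordered_index_files))
print(lookup("k2", ordered_index_files))
```

Here the key `k1` appears in two index files, and the sequence alone decides that the updated value is returned.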

16. An index construction method, the method comprising:

distinguishing stock files from delta files;

Constructing indexes for the stock files and the increment files in parallel, and respectively generating corresponding index files;

determining the sequence of the index files according to the stock files or the increment files corresponding to the index files, wherein the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data;

The sequence among the index files of the stock files is consistent with the sequence of the stock files, and the sequence among the index files of the increment files is consistent with the sequence of the increment files.

17. An index construction method, the method comprising:

Performing snapshot processing on the stock file to obtain second snapshot information and generating third snapshot information of an initial state, wherein the third snapshot information of the initial state is determined by whether an index file of the stock file exists or not;

Recording an index file corresponding to the stock file into the third snapshot information according to the second snapshot information;

Determining the sequence of the index files according to the third snapshot information, wherein the sequence is used for determining target index files from at least two index files comprising target data, the target data in the target index files comprise response data, the sequence among the index files of the stock files is consistent with the sequence of the stock files, and the sequence among the index files of the increment files is consistent with the sequence of the increment files.
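As a non-limiting illustration of claim 17, the third snapshot information may be modeled as a list of index file identifiers. This is one possible reading only: the function names, the list representation, and the choice to place recorded index files before unrecorded ones are assumptions made for this sketch.

```python
def initial_index_snapshot(existing_stock_index_ids):
    # Initial state: empty when no index file of a stock file exists yet,
    # otherwise the identifiers of the index files that already exist.
    return list(existing_stock_index_ids)


def record_stock_index(index_snapshot, stock_snapshot, index_id, source_file_id):
    # Only index files built from snapshotted stock files are recorded.
    if source_file_id in stock_snapshot and index_id not in index_snapshot:
        index_snapshot.append(index_id)


def order_index_files(all_index_ids, index_snapshot):
    # Assumption: recorded (stock) index files precede unrecorded (increment) ones.
    recorded = [i for i in index_snapshot if i in all_index_ids]
    unrecorded = [i for i in all_index_ids if i not in index_snapshot]
    return recorded + unrecorded


stock_snapshot = frozenset({"f1", "f2"})   # snapshot of the stock files
index_snapshot = initial_index_snapshot([])  # no stock index file exists yet
built = [("idx3", "f3"), ("idx1", "f1"), ("idx2", "f2")]  # completion order
for index_id, source_file_id in built:
    record_stock_index(index_snapshot, stock_snapshot, index_id, source_file_id)
ordered = order_index_files([i for i, _ in built], index_snapshot)
print(ordered)
```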

18. An index building apparatus, the apparatus comprising:

The first file distinguishing module is used for distinguishing a first type file and a second type file, wherein the second type file comprises data after updating at least one item of data in the first type file;

The first index construction file module is used for constructing indexes for the first type files and the second type files in parallel and respectively generating corresponding index files;

The first index file sequence determining module is used for determining the sequence of the index files according to the first type files or the second type files corresponding to the index files, wherein the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data;

19. A data access processing apparatus, the apparatus comprising:

the data access request receiving module is used for receiving the data access request;

The response data determining module is used for determining response data corresponding to the data access request from at least two index files according to the sequence of the index files, wherein the at least two index files comprise index files corresponding to a first type of files and index files corresponding to a second type of files, the second type of files comprise data obtained after updating at least one item of data in the first type of files, the index files are constructed in parallel, the sequence of the index files is determined by the first type of files or the second type of files corresponding to the index files, the sequence among the index files of the first type of files is consistent with the sequence of the first type of files, and the sequence among the index files of the second type of files is consistent with the sequence of the second type of files.

20. An index building apparatus, the apparatus comprising:

The second file distinguishing module is used for distinguishing stock files from increment files;

The second index construction module is used for constructing indexes for the stock files and the increment files in parallel and respectively generating corresponding index files;

The second index file sequence determining module is used for determining the sequence of the index files according to stock files or increment files corresponding to the index files, wherein the sequence is used for determining target index files from at least two index files comprising target data, and the target data in the target index files comprise response data;

21. An index building apparatus, the apparatus comprising:

the snapshot information generation module is used for carrying out snapshot processing on the stock files to obtain second snapshot information and generating third snapshot information of an initial state, wherein the third snapshot information of the initial state is determined by whether index files of the stock files exist or not;

the third index construction module is used for constructing indexes for the stock files and the increment files in parallel and respectively generating corresponding index files;

The stock index recording module is used for recording an index file corresponding to the stock file into the third snapshot information according to the second snapshot information;

The third index file sequence determining module is used for determining the sequence of the index files according to the third snapshot information, wherein the sequence is used for determining target index files from at least two index files comprising target data, the target data in the target index files comprise response data, the sequence among the index files of the stock files is consistent with the sequence of the stock files, and the sequence among the index files of the increment files is consistent with the sequence of the increment files.

22. An electronic device, comprising: a processor; and

A memory having executable code stored thereon that, when executed, causes the processor to perform the method of any of claims 1-17.

23. One or more machine readable media having executable code stored thereon that, when executed, causes a processor to perform the method of any of claims 1-17.