CN115495483B - Data batch processing method, device, equipment and computer readable storage medium - Google Patents
- Publication number
- CN115495483B (application CN202211152486.1A)
- Authority
- CN
- China
- Prior art keywords
- array
- search
- variable
- updating
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data batch processing method, device, equipment and computer readable storage medium. The method comprises: determining a search variable whose matching return value is null to be an error variable, and analyzing it to obtain index position information of the error variable; receiving an update variable corresponding to the error variable, and obtaining an update return value by matching on the update variable; constructing an update key-value pair with the update return value as the search key and the update variable as the target value, and updating the original search array according to the update key-value pair and the index position information to obtain an updated search array; and executing a search according to the updated search array to obtain a library result array. The method handles the special case in which the return value is empty, and keeps the library result array generated after correction highly consistent with the original search array.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a computer readable storage medium for batch processing of data.
Background
With the development of information technology and growing business-office demands, attention has increasingly turned to the batch processing capability for data, in addition to the pursuit of faster response in data processing. In the prior art, batch data processing requires a great deal of time for manually summarizing and organizing data. For example, when enterprise information data is analyzed, an enterprise name is generally used as a keyword to search a data platform and obtain the information associated with that enterprise name. In this process, on the one hand, manual input errors introduce typos into enterprise names, so that the corresponding results cannot be retrieved; on the other hand, even if the user performs a second search or match to correct the error in the first, the result of the second search or match cannot be associated with the first, so that the originally designed field or form order is disturbed and manual adjustment is required.
In addition, the technical solutions provided in the prior art focus on a one-time search or match against the database and on improving the handling of cases where the return value is not empty, for example by generating a clearer data form or by designing special search logic to improve search efficiency. None of them considers the case where the return value is empty, which leaves a gap in practical applications and causes users considerable trouble.
Disclosure of Invention
The invention aims to provide a data batch processing method that solves two technical problems of the prior art: it cannot handle the case where a matching return value is empty, and it cannot generate a strictly corresponding result form in a batch search scenario.
One of the purposes of the present invention is to provide a data batch processing device.
One of the objects of the present invention is to provide an electronic device.
It is an object of the present invention to provide a computer-readable storage medium.
In order to achieve one of the above objects, an embodiment of the present invention provides a data batch processing method, including: determining a search variable with a null matching return value as an error variable, and analyzing to obtain index position information of the error variable; receiving an update variable corresponding to the error variable, and obtaining an update return value according to the update variable matching; using the updated return value as a search key, using the updated variable as a target value to construct an updated key value pair, and updating an original search array according to the updated key value pair and the index position information to obtain an updated search array; and executing search according to the updated search array to obtain a library result array.
As a further improvement of an embodiment of the present invention, the method further includes: receiving the original search array, and executing matching according to the original search array to obtain a plurality of matching return values corresponding to a plurality of search variables in the original search array.
As a further improvement of an embodiment of the present invention, the matching return value is recorded in a preset classified index form, and the classified index form is stored in a remote dictionary service (Redis) framework of the server middle platform.
As a further improvement of an embodiment of the present invention, the method specifically includes: executing a sequential queue query in the original search array according to the error variable to obtain the index position information.
As a further improvement of an embodiment of the present invention, the method further includes: creating an update input box corresponding to the error variable, and outputting, based on a preset classified retrieval form, associated entry information pointing to the error variable.
As a further improvement of an embodiment of the present invention, the method specifically includes: storing the update key-value pair at the position in the original search array indicated by the index position information, to obtain the updated search array.
As a further improvement of an embodiment of the present invention, the method specifically includes: calling a preset array item modification function, using the index position information as the index value in the array item modification function and the update variable as the insertion value, and executing the update to obtain the updated search array.
As a further improvement of an embodiment of the present invention, the method specifically includes: searching a preset search database using each search key in the updated search array as an index value, to obtain an expansion information value corresponding to each key-value pair in the updated search array; and arranging the search variables with their corresponding expansion information values, and the update variable with its corresponding expansion information value, in the order of the updated search array to obtain the library result array.
As a further improvement of an embodiment of the present invention, the method further includes: receiving an original search array, constructing a remote dictionary service framework at the server middle platform, and storing the original search array in a temporary database built on the remote dictionary service framework; the method further includes: rendering the library result array on a front-end page.
As a further improvement of an embodiment of the present invention, the method further includes: receiving an original search form, traversing and parsing the original search form to generate corresponding JavaScript Object Notation (JSON) data, and taking the JSON data as the original search array; wherein at least part of the original search form is a spreadsheet file.
As a further improvement of an embodiment of the present invention, the method specifically includes: extracting data from the original search form in batches according to a preset data packaging window to obtain a plurality of original data sets; sequentially cleaning, screening and packaging the original data sets to obtain a plurality of original array sequences; and converting the original array sequences into corresponding JSON data, and taking the JSON data as the original search array.
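The batch extraction and conversion described in the preceding paragraph could be sketched as follows in TypeScript. This is a minimal illustration only: the row shape `FormRow`, the helper names, the window size and the sample rows are hypothetical and are not taken from the patent, and reading the spreadsheet file itself is omitted.

```typescript
// Hypothetical shape of one row read from the original search form.
type FormRow = { enterpriseName: string | null };

// Split the rows into batches of `windowSize` (the "data packaging window").
function extractBatches<T>(rows: T[], windowSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += windowSize) {
    batches.push(rows.slice(i, i + windowSize));
  }
  return batches;
}

// Clean (trim whitespace), screen (drop empty cells) and package one batch
// as a plain sequence of strings.
function cleanBatch(batch: FormRow[]): string[] {
  return batch
    .map((row) => (row.enterpriseName === null ? "" : row.enterpriseName.trim()))
    .filter((value) => value.length > 0);
}

// Convert the whole form into a JSON string that serves as the original search array.
function formToSearchArrayJson(rows: FormRow[], windowSize: number): string {
  const sequences = extractBatches(rows, windowSize).map(cleanBatch);
  const merged = sequences.reduce<string[]>((all, seq) => all.concat(seq), []);
  return JSON.stringify(merged);
}

// Sample rows (assumed data for illustration).
const formRows: FormRow[] = [
  { enterpriseName: " Acme Ltd " },
  { enterpriseName: null },
  { enterpriseName: "Globex Co" },
  { enterpriseName: "Initech Inc" },
];
const originalSearchJson = formToSearchArrayJson(formRows, 2);
```

Processing the form window by window keeps each cleaning pass small; the relative order of the surviving rows is unchanged, which matters because later steps rely on positions in the original search array.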
In order to achieve one of the above objects, an embodiment of the present invention provides a data batch processing apparatus, including: the null value index unit is used for determining that a search variable with a null matching return value is an error variable and analyzing to obtain index position information of the error variable; the updating matching unit is used for receiving an updating variable corresponding to the error variable and obtaining an updating return value according to the updating variable matching; the array updating unit is used for constructing an updating key value pair by taking the updating return value as a search key and taking the updating variable as a target value, and updating the original search array according to the updating key value pair and the index position information to obtain an updated search array; and the library searching unit is used for executing searching according to the updated searching array to obtain a library result array.
In order to achieve one of the above objects, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus; the memory is used for storing a computer program; the processor is configured to implement the steps of the data batch processing method according to any one of the above technical schemes when executing the program stored in the memory.
To achieve one of the above objects, an embodiment of the present invention provides a computer readable storage medium having a computer program stored thereon, which when executed by a processor, implements the steps of the data batch processing method according to any one of the above aspects.
Compared with the prior art, the invention analyzes the index position information of the error variable, matches an update return value for the update variable once the update variable is received, and associates the resulting key-value pair with the index position information. The error variable is thereby corrected, and a key-value pair usable for subsequent retrieval is generated, while the position of the error variable in the original search array is preserved. This handles the special case in which the matching return value is empty during the first retrieval or match, and keeps the finally output library result array as consistent as possible with the original search array input by the user.
Drawings
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of a data batch processing apparatus according to an embodiment of the present invention.
FIG. 3 is a schematic diagram showing steps of a data batch processing method according to an embodiment of the present invention.
Fig. 4 is a schematic diagram showing steps of a first example of a data batch processing method according to an embodiment of the present invention.
FIG. 5 is a schematic diagram showing steps of a second example of a data batch processing method according to an embodiment of the present invention.
FIG. 6 is a partial step schematic diagram of a specific example of a second embodiment of a data batch processing method in accordance with an embodiment of the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments shown in the drawings. These embodiments are not intended to limit the invention and structural, methodological, or functional modifications of these embodiments that may be made by one of ordinary skill in the art are included within the scope of the invention.
It should be noted that the term "comprises," "comprising," or any other variation thereof is intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Furthermore, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Processing batch data, in particular searching on a large number of data labels and outputting the returned results as an array or form for the user's reference, has been a development direction of the field since the Internet became widespread. In particular, when enterprise data is analyzed for background checks or due diligence and evaluation, known enterprise names are generally used as the data labels, and batch search processing is performed over a large number of enterprise names of interest, so that detailed information on each enterprise is finally obtained for investors' reference. Besides optimizing interface calls and front-end/back-end coordination, this process also requires optimizing the search mode and the error handling mechanism.
On the one hand, the legibility of the output form or array needs to be improved, and the output content needs to accord with the normal reading habit of the user. On the other hand, it is necessary to assist the user in correction by reflecting special cases such as search failure. The combination of these two aspects, realizing the advantages of output content regularity and special case handling, is a difficulty in the technical development in the field, and is one of the objects of the present invention.
In one embodiment, the present invention provides a computer readable storage medium that solves the above technical problems and related derivative problems. The computer readable storage medium is provided in a computer and stores a computer program; it may be any available medium that the computer can access, or a storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium such as a floppy disk, hard disk or magnetic tape, an optical medium such as a DVD (Digital Video Disc), or a semiconductor medium such as an SSD (Solid State Disk). When the computer program is executed by any processor in a computer, a data batch processing method is implemented that performs: index position analysis of the error variable, return value matching of the update variable, updating of the original search array to generate the updated search array, and execution of the search to generate the library result array.
An embodiment of the present invention further provides an electronic device 100 as shown in fig. 1, where the electronic device 100 includes a processor 11, a communication interface 12, a memory 13, and a communication bus 14. The processor 11, the communication interface 12, and the memory 13 perform communication with each other via the communication bus 14.
The memory 13 is used for storing a computer program; the processor 11 is configured to execute a program stored in the memory 13, which may be the computer program stored on the computer readable storage medium described above. When executing the program, the processor 11 may implement a data batch processing method, which may specifically include: index position analysis of the error variable, return value matching of the update variable, updating of the original search array to generate the updated search array, and execution of the search to generate the library result array.
Specifically, the communication bus 14 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus 14 may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in fig. 1, but this does not mean that there is only one bus or one type of bus.
The communication interface 12 is used for communication between the electronic device 100 and other devices described above.
The memory 13 may include RAM (Random Access Memory) or NVM (Non-Volatile Memory), such as at least one magnetic disk memory. Alternatively, the memory 13 may be at least one memory device located remotely from the aforementioned processor 11.
The processor 11 may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor) or the like, and may also be a DSP (Digital Signal Processor), an ASIC (Application-Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
An embodiment of the present invention further provides a data batch processing apparatus 200 as shown in fig. 2, where the data batch processing apparatus 200 includes a null index unit 21, an update matching unit 22, an array update unit 23, and a library retrieval unit 24.
The null value indexing unit 21 is configured to determine that a search variable whose matching return value is null is an error variable, and analyze the index position information of the error variable. The update matching unit 22 is configured to receive an update variable corresponding to the error variable, and obtain an update return value according to the update variable matching. The array updating unit 23 is configured to construct an update key value pair with the update return value as a search key and the update variable as a target value, and perform updating on an original search array according to the update key value pair and the index position information, so as to obtain an updated search array. The library searching unit 24 is configured to perform searching according to the updated search array, so as to obtain a library result array.
The null value indexing unit 21, the update matching unit 22, the array updating unit 23 and the library retrieving unit 24 may be physical hardware devices with a tangible form, or may be functionally allocated regions, with abstract locations, within one or more hardware devices. In the former case, the four units may be connected in sequence to constitute at least part of the data batch processing apparatus 200; in the latter case, the four units may establish communication relationships in sequence to jointly realize the functional configuration of the data batch processing apparatus 200.
The four units described above may also be used to implement other functions involved in the present invention. For example, in one embodiment, the null value indexing unit 21 may also be used to perform a queue query in the original search array. In one embodiment, the update matching unit 22 may also be used to create an update input box and/or to output associated entry information based on the classified retrieval form. In one embodiment, the array updating unit 23 may be further configured to store the update key-value pair in the original search array to obtain the updated search array, and/or to call a preset array item modification function. In one embodiment, the library retrieval unit 24 may also be configured to perform a search in a search database and obtain an expansion information value, and/or to generate a library result array based on the search variable, the update variable, and the expansion information value.
The data batch processing apparatus provided by the invention may also comprise other units for implementing other functions. For example, in one embodiment, the data batch processing apparatus 200 further includes a service building unit for building a remote dictionary service framework and storing the original search array in a temporary database. In one embodiment, the data batch processing apparatus 200 further includes a page rendering unit for calling the library result array stored in the temporary database and rendering it on the front-end page. In one embodiment, the data batch processing apparatus 200 further includes a form parsing unit, configured to traverse and parse the original search form and generate corresponding JavaScript Object Notation (JSON) data, and/or to perform at least one of dividing, cleaning, screening and packaging on the data in the original search form, and/or to convert the original array sequences to obtain the original search array.
As further shown in fig. 3, an embodiment of the present invention provides a data batch processing method, where a program or an instruction corresponding to the method may be loaded on the computer readable storage medium and/or the electronic device 100 and/or the data batch processing apparatus 200, so as to achieve a technical effect of data batch processing. The method specifically comprises the following steps.
Step 31, determining a search variable whose matching return value is empty to be an error variable, and analyzing it to obtain the index position information of the error variable.
Step 32, receiving an update variable corresponding to the error variable, and obtaining an update return value by matching on the update variable.
Step 33, constructing an update key-value pair with the update return value as the search key and the update variable as the target value, and updating the original search array according to the update key-value pair and the index position information to obtain an updated search array.
Step 34, executing a search according to the updated search array to obtain a library result array.
Therefore, the updating variable can be subjected to secondary matching, and the original retrieval array can be updated according to the predetermined index position information, so that the sequence and arrangement mode of the original retrieval array can be reserved to the greatest extent, the defect that errors exist in the original retrieval array is overcome, and finally, an accurate and consistent array output result is obtained for reference of a user.
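Steps 31 to 34 can be illustrated end to end with a small TypeScript sketch. Everything named here is an assumption made for illustration: `matchTable` stands in for whatever source supplies matching return values, `searchDatabase` stands in for the search database, and the standard `Array.prototype.splice` is used as one possible array item modification function. The sample names and details are invented.

```typescript
// Hypothetical stand-ins: a name-to-ID matching table and an ID-to-details database.
type Dictionary = { [key: string]: string };
const matchTable: Dictionary = {
  "Acme Ltd": "E001",
  "Globex Co": "E002",
  "Initech Inc": "E003",
};
const searchDatabase: Dictionary = {
  E001: "sample details for E001",
  E002: "sample details for E002",
  E003: "sample details for E003",
};

// A key-value pair: the return value is the search key, the variable is the target value.
type KeyValuePair = { searchKey: string | null; target: string };

// Returns null when the key is absent, modelling a null matching return value.
function lookup(table: Dictionary, key: string): string | null {
  return Object.prototype.hasOwnProperty.call(table, key) ? (table[key] as string) : null;
}

function toPair(variable: string): KeyValuePair {
  return { searchKey: lookup(matchTable, variable), target: variable };
}

// Step 31: variables with a null return value are error variables; record their positions.
function findErrorPositions(pairs: KeyValuePair[]): number[] {
  const positions: number[] = [];
  pairs.forEach((pair, index) => {
    if (pair.searchKey === null) {
      positions.push(index);
    }
  });
  return positions;
}

// Steps 32 and 33: match the update variable, then replace the item at the recorded position.
function applyUpdate(pairs: KeyValuePair[], position: number, updateVariable: string): KeyValuePair[] {
  const updated = pairs.slice(); // copy, so the original array is left untouched
  updated.splice(position, 1, toPair(updateVariable));
  return updated;
}

// Step 34: search by key, keeping the order of the updated search array.
function searchLibrary(pairs: KeyValuePair[]): { name: string; details: string | null }[] {
  return pairs.map((pair) => ({
    name: pair.target,
    details: pair.searchKey === null ? null : lookup(searchDatabase, pair.searchKey),
  }));
}

const originalSearchArray = ["Acme Ltd", "Globx Co", "Initech Inc"]; // second name has a typo
const matchedPairs = originalSearchArray.map(toPair);
const errorPositions = findErrorPositions(matchedPairs);
let updatedSearchArray = matchedPairs;
errorPositions.forEach((position) => {
  // In practice the update variable comes from the user; it is hard-coded here.
  updatedSearchArray = applyUpdate(updatedSearchArray, position, "Globex Co");
});
const libraryResultArray = searchLibrary(updatedSearchArray);
```

Because the corrected pair is written back at the recorded position, the rows of `libraryResultArray` line up one to one with the rows of `originalSearchArray`.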
The matching return value may refer to an expansion information value obtained by searching based on a search variable, or may refer to a search return value obtained by matching based on a search variable. The expansion information value and the search return value differ in that they point to data on different levels corresponding to the search variable. In one embodiment, the extended information value may be interpreted as information data recorded in a search database that is desired by a user based on a search variable. In one embodiment, the search return value may be interpreted as data corresponding to a search variable and used by the system to perform an extended information value search. The search return value is not necessarily described in the search database.
Take matching and retrieving of enterprise information as an example. The search variable may be known information used by the user to perform the search, and may include business name information, high-level manager information, or the like. The retrieval return value may be information that the system or platform uses to retrieve, but may be unknown to the user side, may include an enterprise code corresponding to the enterprise name information, an enterprise ID, may include a high-level manager information, etc. The expansion information value may be specific information of an enterprise that the user desires to retrieve based on the retrieval variable, may include at least one of registration status information, legal representative information, registered capital information, established date information, approval date information, province information of the belonging, city information of the belonging, county information of the belonging, and the like, and may include at least one of penmanship information, penmanship time information, investment business information, investment proportion information, and the like, which correspond to the information of the advanced manager.
For different definitions of the match return value formation, a different step may be included before the step of determining that the search variable for which the match return value is empty is the wrong variable in step 31. In one embodiment, if the matching return value is defined as the expansion information value, step 31 may further include a step of first retrieval, that is, step 31 may be preceded by: and receiving the original search array, and executing search according to the original search array to obtain a plurality of matching return values corresponding to a plurality of search variables in the original search array. And specifically may include: matching the corresponding retrieval return value by taking the retrieval variable as a reference; and executing search according to the search return value to obtain an expansion information value corresponding to the search variable as the matching return value.
In another embodiment, the matching return value is defined as the search return value, and step 31 does not include the step of first searching, but includes the step of first matching, that is, may include, before step 31: and receiving the original search array, and executing matching according to the original search array to obtain a plurality of matching return values corresponding to a plurality of search variables in the original search array. And specifically may include: and matching the corresponding retrieval return value by taking the retrieval variable as a reference, and taking the retrieval return value as the matching return value.
The second embodiment is preferably adopted, which is based on the premise that if a retrieval return value such as an enterprise ID can be retrieved in a specific scene, a corresponding expansion information value of enterprise information data such as a legal representative can be necessarily retrieved. Therefore, based on the second embodiment, preliminary screening can be performed, and step redundancy caused by interface calling or database searching of error variables can be avoided. Of course, the present invention does not exclude that the technical solution provided by the first embodiment is adopted in other application scenarios, and the corresponding purpose can be achieved.
Preferably, the matching return value, in particular, the matching return value configured as the search return value may be recorded in a preset classification index table. Therefore, after the original search array is received, matching can be directly carried out in the classified index form according to the search variable, and a corresponding matching return value is obtained. In particular, when the search variable is the business name information and the matching return value is the business ID, the sort index table may be a data table in which a data pair such as "business name information-business ID" is described. In terms of superordinate, the classified index form may be interpreted as a data form that records such data pairs as "pre-stored variables-pre-stored return values".
Further, the categorized index form may be stored at a server farm located between the front-end pages and the back-end database. Therefore, conversion between the enterprise name and the enterprise ID is conveniently executed, and the enterprise ID is utilized to search the follow-up expansion information value. The technical scheme has outstanding effects when dealing with the special situation of a renamed enterprise.
In particular, the server middle station may be built with a remote dictionary service (Remote Dictionary Server, redis) or its framework, so that the classification index form may be stored in the remote dictionary service or its framework, so that the server and the processing process have advantages and functions of high performance, strong data type compatibility, operation atomicity, data persistence, data backup, and the like. Based on the configuration, the data server middle station can be also suitable for expanded application scenes such as session cache, full page cache, queues, ranking list/counter, publishing/subscribing and the like.
The classification index form at least contains two aspects of enterprise ID and enterprise name information, and classification logic of the classification index form can be performed according to registration attribution. Based on this, preferably, after querying the enterprise ID with the enterprise name information as a standard, the attribution information corresponding to different enterprises can also be obtained, so that different search databases are invoked for searching according to the attribution information.
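As a rough illustration of a classified index form that carries attribution information, the following TypeScript sketch matches an enterprise name to its ID and then selects a search database by attribution. The in-memory arrays and objects are stand-ins chosen for this example (the patent stores the form in a remote dictionary service), and all field names, attribution values and details are hypothetical.

```typescript
// One row of a hypothetical classified index form.
type IndexRow = { enterpriseName: string; enterpriseId: string; attribution: string };

const classifiedIndexForm: IndexRow[] = [
  { enterpriseName: "Acme Ltd", enterpriseId: "E001", attribution: "Shanghai" },
  { enterpriseName: "Globex Co", enterpriseId: "E002", attribution: "Jiangsu" },
];

// Hypothetical search databases, one per registration attribution, keyed by enterprise ID.
type DetailsTable = { [enterpriseId: string]: string };
const databasesByAttribution: { [attribution: string]: DetailsTable } = {
  Shanghai: { E001: "sample details for E001" },
  Jiangsu: { E002: "sample details for E002" },
};

// First matching: enterprise name -> index row, or null when nothing matches.
function matchEnterprise(enterpriseName: string): IndexRow | null {
  const hits = classifiedIndexForm.filter((row) => row.enterpriseName === enterpriseName);
  return hits.length > 0 ? (hits[0] as IndexRow) : null;
}

// Route the search to the database selected by the attribution information.
function searchByName(enterpriseName: string): string | null {
  const row = matchEnterprise(enterpriseName);
  if (row === null) {
    return null; // null matching return value: the name is an error variable
  }
  const database = databasesByAttribution[row.attribution];
  if (database === undefined) {
    return null;
  }
  const details = database[row.enterpriseId];
  return details === undefined ? null : details;
}
```

Under these assumptions, `searchByName("Acme Ltd")` reads from the Shanghai table, while a misspelled name returns `null` before any search database is touched, which is the preliminary screening the second embodiment aims for.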
In addition, any of the steps provided in the above two embodiments may be placed between step 41 and step 31 of the example corresponding to fig. 5 below. The order of these steps can be adjusted as desired by those skilled in the art.
For step 31, "analyzing to obtain the index position information of the error variable" may further preferably be: executing a sequential queue query in the original search array according to the error variable to obtain the index position information. Index position information obtained in this way preserves the order of the original search array from its first search variable up to the current error variable, which makes it easier to update the original search array consistently once the update variable has been captured. Of course, the invention does not exclude other queue query and polling schemes. Preferably, the sequential queue query can be implemented with the forEach function, which is well suited to array queries and can traverse the array content, the array index, and the array as a whole.
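A sequential query of this kind can be sketched with `Array.prototype.forEach`, whose callback receives the current element, its index, and the array itself. The sample entries and the function name `findErrorPositions` are hypothetical; only the use of forEach comes from the description.

```javascript
// Hypothetical matched entries; a null matchReturnValue marks an error variable.
const matchedEntries = [
  { searchVariable: "Company A", matchReturnValue: "ID-001" },
  { searchVariable: "Compny B", matchReturnValue: null },
  { searchVariable: "Company C", matchReturnValue: "ID-003" },
  { searchVariable: "Cmpany D", matchReturnValue: null },
];

// Walk the array in its original order and record the index of each error variable.
function findErrorPositions(entries) {
  const positions = [];
  entries.forEach((entry, index) => {
    if (entry.matchReturnValue === null) {
      positions.push({ errorVariable: entry.searchVariable, index });
    }
  });
  return positions;
}

const errorPositions = findErrorPositions(matchedEntries);
console.log(errorPositions);
```

Because forEach visits elements in ascending index order, the recorded positions follow the order of the original search array.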
Between steps 31 and 32, specifically between "determining the search variable whose matching return value is null as an error variable" and "receiving an update variable corresponding to the error variable", the invention also provides a step of capturing the update variable, which may preferably include: creating an update input box corresponding to the error variable, and searching a preset search database and outputting associated entry information pointing to the error variable.
On the one hand, as can be seen from the above, the step of capturing the update variable need not strictly follow step 31; it may be performed in parallel with, or before, "analyzing to obtain the index position information of the error variable" in step 31, and the invention does not limit this.
On the other hand, with these steps the error variables can be collected and output together, prompting the user to recognize and update them and making the update operation more convenient.
First, the update input box may be a prompt box rendered directly on the front-end page and containing an error variable field, so that the user can correct the error variable, that is, input the update variable, in the update input box by relying on their own judgment and on the prompt given by the associated entry information.
Second, the classified search form may be understood in the same way as the classified index form provided above for looking up the enterprise ID. That is, search variables that may be associated with the error variable and that can be found in the search database, in other words "enterprise name information having a definite enterprise ID", are output to the user side as the associated entry information.
Furthermore, the process of matching the error variable in the classified search form to obtain the associated entry information may be implemented with any natural language processing (NLP) algorithm available in the prior art. Preferably, in addition to ranking the enterprise name information (i.e., the pre-stored variables) in the classified search form with the number of repeated words as the weight, corresponding weights may also be set for the fields that represent different contents. For example, enterprise name information generally has the structure "region-name subject-business type"; accordingly, after the error variable is split in this way, the name subject field may be given a larger weight and the region field and the business type field smaller weights. The associated entry information that the user most likely needs can thus be matched. Preferably, the associated entry information is configured as the first five items of enterprise name information after sorting by weight in descending order.
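The field-weighted ranking can be illustrated with the sketch below. It is not an NLP algorithm and is not the ranking used by the invention: a simple shared-character count stands in for the "number of repeated words", the candidates are assumed to be already split into region, name subject, and business type, and the weights 1/3/1, the sample names, and all function names are hypothetical.

```javascript
// Hypothetical field weights: the name subject counts more than region or type.
const FIELD_WEIGHTS = { region: 1, subject: 3, type: 1 };

// Count the characters of `a` that also occur in `b` (case-insensitive).
function sharedCharCount(a, b) {
  const chars = new Set(b.toLowerCase());
  let count = 0;
  for (const ch of a.toLowerCase()) {
    if (chars.has(ch)) count += 1;
  }
  return count;
}

// Weighted similarity between the split error variable and one candidate.
function score(errorFields, candidate) {
  return Object.keys(FIELD_WEIGHTS).reduce(
    (total, field) =>
      total + FIELD_WEIGHTS[field] * sharedCharCount(errorFields[field], candidate[field]),
    0
  );
}

// Sort candidates from high weight to low weight and keep the first five.
function associatedEntries(errorFields, candidates, limit = 5) {
  return candidates
    .map((candidate) => ({ name: candidate.name, weight: score(errorFields, candidate) }))
    .sort((x, y) => y.weight - x.weight)
    .slice(0, limit);
}

// Hypothetical pre-stored variables, already split into fields.
const candidates = [
  { name: "Beijing Alpha Trade", region: "Beijing", subject: "Alpha", type: "Trade" },
  { name: "Suzhou Omega Tech", region: "Suzhou", subject: "Omega", type: "Tech" },
  { name: "Xian Kiwi Foods", region: "Xian", subject: "Kiwi", type: "Foods" },
  { name: "Suzhou Alpha Tech", region: "Suzhou", subject: "Alpha", type: "Tech" },
  { name: "Wuhan Sigma Foods", region: "Wuhan", subject: "Sigma", type: "Foods" },
  { name: "Nanjing Delta Tech", region: "Nanjing", subject: "Delta", type: "Tech" },
];

// Error variable "Suzhou Alfa Tech", split the same way.
const topFive = associatedEntries(
  { region: "Suzhou", subject: "Alfa", type: "Tech" },
  candidates
);
console.log(topFive.map((entry) => entry.name));
```

In this sample the candidate sharing both the region and most of the name subject ranks first, and the candidate with nothing in common falls outside the five returned items.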
In addition, "obtaining an update return value according to the update variable matching" may be implemented by referring to the foregoing scheme of matching a search variable to obtain a matching return value, which is not repeated here. It is worth emphasizing, however, that the update variable is preferably matched in the classified index form (specifically, against its pre-stored variables) to obtain the update return value (i.e., the pre-stored return value corresponding to the update variable), which speeds up the overall processing. Of course, where associated entry information is provided, a click on an item of associated entry information on the front-end page can also be received directly, and the selected item set as the update variable; the classified index form is then searched for the pre-stored return value whose pre-stored variable is that update variable, and this pre-stored return value is taken as the update return value.
For step 33, the update key value pair takes the form "search key-target value", and the updated search array includes a plurality of key value pairs of the forms "matching return value-search variable" and "update return value-update variable". In a specific embodiment, the former key value pair may be interpreted as "enterprise ID-enterprise name information", and the latter as "enterprise ID obtained by re-matching after the update-correct enterprise name information obtained by updating the originally wrong enterprise name".
Since step 34 performs the search based on the updated search array, in order to keep the library result array highly consistent with the original search array, "updating the original search array according to the update key value pair and the index position information" in step 33 may preferably include: storing the update key value pair at the position in the original search array pointed to by the index position information, to obtain the updated search array. The updated search array, the original search array, and the library result array thereby share the same data arrangement.
Preferably, the above process can be implemented by calling a preset function, which simplifies the overall processing steps and improves processing efficiency. Specifically, the step may include: calling a preset array item modification function, taking the index position information as the index value in the array item modification function, and executing the update with the update variable as the insertion value, to obtain the updated search array.
In a specific example provided by the invention, the array item modification function may be the splice function. In that case, the index position information (the index value) corresponds to the first parameter, index, of the splice function, and the update variable (the insertion value) corresponds to the third and subsequent parameters (item1, item2, item3, ...). When no deletion is involved, the second parameter, howmany, of the splice function may be set to 0 in this embodiment.
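The effect of the second parameter can be seen in the following sketch of `Array.prototype.splice`, which modifies an array in place. The sample array and variable names are hypothetical. Both settings are shown for comparison: with howmany set to 1 the update variable replaces the error variable in its slot, while with howmany set to 0 nothing is removed and the update variable is inserted in front of the element at that index.

```javascript
// Hypothetical original search array; index 1 holds the error variable.
const originalSearchArray = ["Company A", "Compny B", "Company C"];
const indexPosition = 1; // index position information of the error variable
const updateVariable = "Company B"; // corrected name supplied by the user

// howmany = 1: remove the error variable and put the update variable in its slot,
// so the array keeps its length and order.
const replaced = [...originalSearchArray];
replaced.splice(indexPosition, 1, updateVariable);

// howmany = 0: remove nothing; the update variable is inserted before the
// element at indexPosition and the error variable stays in the array.
const inserted = [...originalSearchArray];
inserted.splice(indexPosition, 0, updateVariable);

console.log(replaced);
console.log(inserted);
```

Which setting is appropriate depends on whether the error variable is meant to remain in the array after the update; the copies made with the spread operator leave the original array untouched in this sketch.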
Further, in a first embodiment based on the foregoing embodiment, as shown in fig. 4, in order to make the library result array easier to retrieve, the method may specifically include the following steps.
Step 31, determining the search variable whose matching return value is null as the error variable, and analyzing to obtain the index position information of the error variable.
Step 32, receiving an update variable corresponding to the error variable, and obtaining an update return value by matching the update variable.
Step 33, constructing an update key value pair with the update return value as the search key and the update variable as the target value, and updating the original search array according to the update key value pair and the index position information to obtain an updated search array.
Step 34, performing the search according to the updated search array to obtain a library result array. Step 34 specifically includes:
Step 341, performing the search in a preset search database with the search keys in the updated search array as index values, to obtain an expansion information value corresponding to each key value pair in the updated search array.
Step 342, arranging each search variable with its corresponding expansion information value, and each update variable with its corresponding expansion information value, in the order of the updated search array, to obtain the library result array.
In this way, the expansion information value corresponding to a search variable or an update variable can be determined from the more definite matching return value or update return value in the updated search array, so that the information the user needs is retrieved from the search database. Because the updated search array retains the data order of the original search array, step 342 makes the data order of the output library result array coincide with that of the original search array input by the user, which makes the search results easier to read.
The search keys in the updated search array may include not only update return values but also matching return values, and the key value pairs in the updated search array may include not only pairs of the form "update return value-update variable" but also pairs of the form "matching return value-search variable". The library result array may therefore contain all "enterprise name information-enterprise specific information" entries, in the forms "update variable-expansion information value" and "search variable-expansion information value", at the same time. Arranging and outputting both kinds of data together improves the user experience and ensures that any priority implied by the order of the enterprise names is not lost.
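Steps 31 to 34 can be put together in the following end-to-end sketch. It is an illustration under stated assumptions, not the implementation of the invention: two in-memory `Map` objects stand in for the classified index form and the search database, the corrections are passed in as a plain object instead of being captured from a front-end input box, and every name, ID, and field is hypothetical.

```javascript
// Hypothetical stand-ins for the classified index form and the search database.
const indexForm = new Map([
  ["Company A", "ID-001"],
  ["Company B", "ID-002"],
  ["Company C", "ID-003"],
]);
const searchDatabase = new Map([
  ["ID-001", { status: "active" }],
  ["ID-002", { status: "active" }],
  ["ID-003", { status: "revoked" }],
]);

// Return the mapped value, or null when the key is absent.
function lookup(map, key) {
  return map.has(key) ? map.get(key) : null;
}

function batchSearch(originalSearchArray, corrections) {
  // Build "matching return value - search variable" key value pairs.
  const searchArray = originalSearchArray.map((name) => ({
    searchKey: lookup(indexForm, name),
    targetValue: name,
  }));

  // Step 31: a null search key marks an error variable at this index.
  // Steps 32-33: match the update variable and overwrite the pair in place.
  searchArray.forEach((pair, index) => {
    if (pair.searchKey !== null) return;
    const updateVariable = corrections[pair.targetValue];
    if (updateVariable === undefined) return; // no correction supplied
    searchArray.splice(index, 1, {
      searchKey: lookup(indexForm, updateVariable),
      targetValue: updateVariable,
    });
  });

  // Step 34: look up each search key, keeping the order of the array.
  return searchArray.map((pair) => ({
    name: pair.targetValue,
    expansionInfo: pair.searchKey === null ? null : lookup(searchDatabase, pair.searchKey),
  }));
}

const libraryResultArray = batchSearch(
  ["Company A", "Compny B", "Company C"],
  { "Compny B": "Company B" }
);
console.log(libraryResultArray);
```

The corrected entry appears at index 1 of the result, the same position that the misspelled name occupied in the input, which is the order-preserving behavior the description emphasizes.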
The invention further provides a second embodiment based on the above embodiment, which improves data processing efficiency by storing the original search array at the server center and avoids repeated, needless interface calls. It will be appreciated that this second embodiment may be implemented as a solution independent of the first embodiment described above, or may be combined with the first embodiment. As shown in fig. 5, the second embodiment specifically includes the following steps.
Step 41, receiving the original search array, constructing a remote dictionary service framework at the server center, and storing the original search array into a temporary database constructed based on the remote dictionary service framework.
Step 31, determining the search variable whose matching return value is null as the error variable, and analyzing to obtain the index position information of the error variable.
Step 32, receiving an update variable corresponding to the error variable, and obtaining an update return value by matching the update variable.
Step 33, constructing an update key value pair with the update return value as the search key and the update variable as the target value, and updating the original search array according to the update key value pair and the index position information to obtain an updated search array.
Step 34, performing the search according to the updated search array to obtain a library result array.
Thus, by building the temporary database at the server center, the original search array can be conveniently retrieved and processed in subsequent steps. In particular, when searching based on the original search array, updating error variables, and updating the original search array, large amounts of data can be processed uniformly, improving the efficiency of the whole process.
Notably, on the one hand, other steps may be provided between step 41 and step 31 to further optimize the technical solution provided by the invention. On the other hand, when the second embodiment is combined with at least part of the first embodiment, both the original search array and the classified index form can be stored temporarily in the remote dictionary service or its framework built at the server center, so that the storage service between the front end and the back end is fully used to keep responses timely and redundant interface calls are avoided. The data stored at the server center can be added to, deleted, and modified by an administrator, which also makes the overall data architecture more flexible to adjust.
Preferably, in a specific example of this second embodiment, step 34 may be followed by a step 42: rendering the library result array on the front-end page. This avoids the unnecessary waste of resources that would result from also storing the frequently updated library result array at the server center. It also improves the response speed of the whole retrieval process and spares the user an overly long wait.
In another specific example based on the second embodiment, step 41 may be preceded by a step 40 for processing and generating the original search array. Step 40 may specifically include: receiving an original search form, traversing and parsing the original search form to generate the corresponding JSON (JavaScript Object Notation) data, and taking that JSON data as the original search array. Configuring the original search array in JSON form improves the efficiency of subsequent data processing and gives the original search array wider compatibility. At the same time, combining it with steps 31 to 34 compensates, to some extent, for the fact that JSON itself has no error handling step.
Preferably, at least part of the original search form is in the form of a spreadsheet file, which broadens the applicability of the technical solution. Specifically, the spreadsheet file may be an Excel document, for the user's convenience. Of course, the invention does not exclude generating the corresponding JSON from plain text data to serve as the original search array, or generating the original search array directly from plain text data.
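Step 40 can be pictured with the sketch below. Reading an Excel file requires a spreadsheet library and is outside its scope, so the sketch assumes that the rows have already been extracted into an array of arrays with a header row; that input shape, the sample cells, and the function name `formToSearchArrayJson` are hypothetical.

```javascript
// Hypothetical rows already extracted from a spreadsheet; the first row is a header.
const originalSearchForm = [
  ["Enterprise name"],
  ["Company A"],
  ["  Company B "],
  [""],
];

// Traverse the form, skip the header row, and collect the non-empty cells.
function formToSearchArrayJson(rows) {
  const names = [];
  rows.slice(1).forEach((row) => {
    row.forEach((cell) => {
      const text = String(cell).trim();
      if (text !== "") names.push(text);
    });
  });
  // Serialize the collected names as JSON to serve as the original search array.
  return JSON.stringify(names);
}

const originalSearchArrayJson = formToSearchArrayJson(originalSearchForm);
console.log(originalSearchArrayJson);
```

`JSON.stringify` keeps the array order, so the order of the rows in the form carries over to the original search array.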
Of course, step 40 is not limited to being placed before step 41; in a technical solution that does not include step 41, step 40 may be placed at any position before step 31 and still achieve the desired technical effect.
Preferably, in a specific example, step 40 may further include the steps shown in fig. 6, which split the original search form and reduce the volume of data to be processed, so that high-concurrency interface calls do not slow down the overall query. Step 40 specifically includes the following steps.
Step 401, extracting data from the original search form in batches according to a preset data packaging window to obtain a plurality of original data sets.
Step 402, performing cleaning, screening and packaging in sequence on the original data sets to obtain a plurality of original arrays.
Step 403, performing translation on the original arrays to generate the corresponding JSON, and taking the JSON as the original search array.
When the original search form is configured at least partly as an Excel document, the data extracted from the original search form can cover at most 5000 cells, each cell supporting up to 100 characters. Based on the steps provided by the invention, the 5000 cells can be split into groups of at most 200 cells; that is, the data packaging window is defined as the number of cells of the original search form covered in a single pass. This optimizes query speed, reduces the volume of data in each request, and reduces the load on the interface for each request.
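Steps 401 to 403 with a 200-cell data packaging window can be sketched as follows. The window size and the 100-character limit come from the description; how cleaning and screening are performed is not specified there, so trimming whitespace and dropping empty or over-long cells are assumptions of this sketch, as are all function names and the generated sample cells.

```javascript
const DATA_PACKAGING_WINDOW = 200; // cells covered per batch
const MAX_CELL_CHARS = 100; // per-cell character limit

// Step 401: extract the cells in batches no larger than the window.
function extractBatches(cells, windowSize) {
  const batches = [];
  for (let start = 0; start < cells.length; start += windowSize) {
    batches.push(cells.slice(start, start + windowSize));
  }
  return batches;
}

// Step 402 (assumed behavior): clean by trimming, screen out empty or
// over-long cells, and package the survivors as one array.
function cleanBatch(batch) {
  return batch
    .map((cell) => String(cell).trim())
    .filter((cell) => cell.length > 0 && cell.length <= MAX_CELL_CHARS);
}

// Step 403: translate each packaged array into JSON.
function toJsonBatches(cells, windowSize = DATA_PACKAGING_WINDOW) {
  return extractBatches(cells, windowSize).map((batch) =>
    JSON.stringify(cleanBatch(batch))
  );
}

// 5000 hypothetical cells are split into batches of 200.
const cells = Array.from({ length: 5000 }, (_, i) => `Company ${i + 1}`);
const jsonBatches = toJsonBatches(cells);
console.log(jsonBatches.length);
```

Each JSON batch can then be sent in its own request, so that no single interface call carries the full 5000 cells.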
The embodiment provided by the invention exploits the portability of JSON across different programming languages, so that an original search array that can be processed by the platform on which the subsequent steps run is obtained by simple translation. Of course, the invention also covers a technical solution in which the current programming language is determined, for example by the server center, and the original arrays are translated for that specific language, yielding an original search array that may take other forms.
In addition, by implementing the technical solution provided by the invention, one-click batch checking of enterprises, reports, beneficiaries, and groups can be achieved, and, as chosen by those skilled in the art, functions such as exporting and downloading custom information dimensions and one-click monitoring and following of enterprises of interest can also be provided. The enterprise name information described above may alternatively be implemented as the unified social credit code of the enterprise.
In summary, the invention analyzes the index position information of the error variable and, after receiving the corresponding update variable, matches the update variable to obtain the update return value, and associates the resulting key value pair with the index position information. The error variable is thereby corrected, and a key value pair capable of supporting the subsequent search is generated, while the position of the error variable in the original search array is retained. This effectively handles the special case in which the matching return value is null during the initial search or matching, and keeps the finally output library result array as consistent as possible with the original search array input by the user.
It should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution. The specification is written this way for clarity only; those skilled in the art should take the specification as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.
The detailed descriptions listed above are only specific descriptions of feasible embodiments of the invention and are not intended to limit its scope of protection; all equivalent embodiments or modifications that do not depart from the spirit of the invention shall fall within its scope of protection.
Claims (13)
1. A data batch processing method, comprising:
receiving an original search array, and executing matching according to the original search array to obtain a plurality of matching return values corresponding to a plurality of search variables in the original search array; the matching return value is the search return value obtained by matching with the search variable as the reference;
determining a search variable whose matching return value is null as an error variable, and analyzing to obtain index position information of the error variable;
receiving an update variable corresponding to the error variable, and obtaining an update return value by matching the update variable;
constructing an update key value pair with the update return value as a search key and the update variable as a target value, and updating the original search array according to the update key value pair and the index position information to obtain an updated search array; the updated search array comprises a key value pair formed by a matching return value and a search variable, and a key value pair formed by an update return value and an update variable;
performing a search according to the updated search array to obtain a library result array;
wherein "performing a search according to the updated search array to obtain a library result array" comprises:
performing the search in a preset search database with the search keys in the updated search array as index values to obtain an expansion information value corresponding to each key value pair in the updated search array; the search keys comprise the matching return value and the update return value;
arranging each search variable with its corresponding expansion information value, and each update variable with its corresponding expansion information value, in the order of the updated search array, to obtain the library result array.
2. The data batch processing method according to claim 1, wherein the matching return value is recorded in a preset classified index form stored in a remote dictionary service framework at the server center.
3. The data batch processing method according to claim 1, wherein the method specifically comprises:
executing a sequential queue query in the original search array according to the error variable to obtain the index position information.
4. The method of batch processing of data according to claim 1, further comprising:
creating an update input box corresponding to the error variable, and, based on a preset classified search form, retrieving and outputting associated entry information pointing to the error variable.
5. The data batch processing method according to claim 1, wherein the method specifically comprises:
storing the update key value pair at the position in the original search array pointed to by the index position information to obtain the updated search array.
6. The method for batch processing of data according to claim 5, wherein the method specifically comprises:
calling a preset array item modification function, taking the index position information as an index value in the array item modification function, and executing the update with the update variable as an insertion value to obtain the updated search array.
7. The data batch processing method according to claim 1, wherein the method specifically comprises:
arranging each search variable with its corresponding expansion information value, and each update variable with its corresponding expansion information value, in the order of the updated search array, to obtain the library result array.
8. The method of batch processing of data according to claim 1, further comprising:
receiving an original search array, constructing a remote dictionary service framework at a server center, and storing the original search array into a temporary database constructed based on the remote dictionary service framework;
the method further comprises:
rendering the library result array on a front-end page.
9. The method of batch processing of data according to claim 1, further comprising:
receiving an original search form, traversing and parsing the original search form to generate corresponding JSON (JavaScript Object Notation) data, and taking the JSON data as the original search array;
wherein at least part of the original search form is in the form of a spreadsheet file.
10. The data batch processing method according to claim 9, characterized in that the method specifically comprises:
extracting data from the original search form in batches according to a preset data packaging window to obtain a plurality of original data sets;
performing cleaning, screening and packaging in sequence on the original data sets to obtain a plurality of original arrays;
performing translation on the original arrays to generate corresponding JSON data, and taking the JSON data as the original search array.
11. A data batch processing apparatus, comprising:
a null value index unit, configured to receive an original search array, execute matching according to the original search array to obtain a plurality of matching return values corresponding to a plurality of search variables in the original search array, determine a search variable whose matching return value is null as an error variable, and analyze to obtain index position information of the error variable; the matching return value is the search return value obtained by matching with the search variable as the reference;
an update matching unit, configured to receive an update variable corresponding to the error variable and obtain an update return value by matching the update variable;
an array updating unit, configured to construct an update key value pair with the update return value as a search key and the update variable as a target value, and update the original search array according to the update key value pair and the index position information to obtain an updated search array; the updated search array comprises a key value pair formed by a matching return value and a search variable, and a key value pair formed by an update return value and an update variable;
a library search unit, configured to perform a search according to the updated search array to obtain a library result array, by performing the search in a preset search database with the search keys in the updated search array as index values to obtain an expansion information value corresponding to each key value pair in the updated search array, and arranging each search variable with its corresponding expansion information value, and each update variable with its corresponding expansion information value, in the order of the updated search array, to obtain the library result array; the search keys comprise the matching return value and the update return value.
12. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are in communication with each other through the communication bus;
The memory is used for storing a computer program;
the processor is configured to implement the steps of the data batch processing method according to any one of claims 1 to 10 when executing a program stored in a memory.
13. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the data batch processing method as claimed in any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211152486.1A CN115495483B (en) | 2022-09-21 | 2022-09-21 | Data batch processing method, device, equipment and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211152486.1A CN115495483B (en) | 2022-09-21 | 2022-09-21 | Data batch processing method, device, equipment and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115495483A CN115495483A (en) | 2022-12-20 |
CN115495483B true CN115495483B (en) | 2024-08-20 |
Family
ID=84471232
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211152486.1A Active CN115495483B (en) | 2022-09-21 | 2022-09-21 | Data batch processing method, device, equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115495483B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109902143A (en) * | 2019-03-04 | 2019-06-18 | 南京邮电大学 | A Multi-Keyword Extended Retrieval Method Based on Ciphertext |
CN112035730A (en) * | 2020-11-05 | 2020-12-04 | 北京智源人工智能研究院 | A semantic retrieval method, device and electronic device |
CN112966478A (en) * | 2021-03-30 | 2021-06-15 | 建信金融科技有限责任公司 | Format conversion method and device for table data and storage medium |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6144958A (en) * | 1998-07-15 | 2000-11-07 | Amazon.Com, Inc. | System and method for correcting spelling errors in search queries |
US8972431B2 (en) * | 2010-05-06 | 2015-03-03 | Salesforce.Com, Inc. | Synonym supported searches |
US11170425B2 (en) * | 2014-03-27 | 2021-11-09 | Bce Inc. | Methods of augmenting search engines for eCommerce information retrieval |
CN104268157A (en) * | 2014-09-03 | 2015-01-07 | 乐视网信息技术(北京)股份有限公司 | Device and method for error correction in data search |
CN107609098B (en) * | 2017-09-11 | 2019-02-01 | 北京金堤科技有限公司 | Searching method and device |
CN110119410A (en) * | 2018-01-10 | 2019-08-13 | 北大方正集团有限公司 | Processing method and processing device, computer equipment and the storage medium of reference book data |
CN110069610B (en) * | 2019-03-16 | 2024-03-19 | 平安科技(深圳)有限公司 | Solr-based retrieval method, solr-based retrieval device, solr-based retrieval equipment and storage medium |
CN112699214A (en) * | 2020-12-24 | 2021-04-23 | 成都六人行信息科技有限公司 | Keyword matching analysis direct system and method |
CN114461672A (en) * | 2022-01-18 | 2022-05-10 | 上海复深蓝软件股份有限公司 | Data retrieval method, device, computer equipment and storage medium |
CN114647658A (en) * | 2022-03-30 | 2022-06-21 | 新华三信息技术有限公司 | Data retrieval method, device, equipment and machine-readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN115495483A (en) | 2022-12-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Country or region after: China; Address after: No. 8 Huizhi Street, Suzhou Industrial Park, Suzhou Area, China (Jiangsu) Pilot Free Trade Zone, Suzhou City, Jiangsu Province, 215000; Applicant after: Qichacha Technology Co.,Ltd.; Address before: Room 503, 5th floor, C1 Building, 88 Dongchang Road, Suzhou Industrial Park, Jiangsu Province, 215000; Applicant before: Qicha Technology Co.,Ltd.; Country or region before: China |
GR01 | Patent grant | ||