
CN107169085A - A kind of data search system - Google Patents

A data search system

Info

Publication number
CN107169085A
Authority
CN
China
Prior art keywords
search
data
module
rule
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710332855.8A
Other languages
Chinese (zh)
Other versions
CN107169085B (en
Inventor
杜源
李鸽子
景蔚亮
陈小刚
陈邦明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Xinchu Integrated Circuit Co Ltd
Original Assignee
Shanghai Xinchu Integrated Circuit Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Xinchu Integrated Circuit Co Ltd
Priority to CN201710332855.8A
Publication of CN107169085A
Application granted
Publication of CN107169085B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data search system in the field of data retrieval. The system includes: a first setting unit, for setting a search period during which the processor and main memory of the server are shut down and the server is offline; a second setting unit, for setting the search rule on which a data search is based; and a first data search unit, which further includes a rule acquisition module that obtains the search rule relied on by the current data search, and a rule search module, connected to the rule acquisition module, that searches the storage network in the server according to the search rule within the search period and outputs the corresponding search result. The first data search unit is located in the controller inside a storage device of the server. The benefit of this technical scheme is that data searches make effective use of the server's idle periods, which not only saves the data center a substantial amount of power, reducing the service provider's operating cost, but also improves service quality.

Description

A data search system
Technical field
The present invention relates to the field of data retrieval, and more particularly to a data search system.
Background
With the arrival of the cloud era, big data technology is increasingly used in daily life. The term "big data" is commonly used to describe and define the massive volumes of data generated in the age of information explosion, together with the associated technological development and innovation. As data volumes grow rapidly, how to query or search a huge data set for the valuable data a user needs has become a problem of general significance.
In the prior art, a user typically sends a data query request to the network from a personal computer or other user terminal. After receiving the query request, a server in the data center queries or searches the storage network for the data the user needs. For a data center server, a large share of its work is querying or searching for user data in a massive storage network. At present, the processor in a data center server can directly process only data held in main memory. In other words, data held in the storage devices of the storage network must first be copied into the server's main memory before the processor can query or search it according to the query request and finally return the result to the user terminal. However, the server's main memory capacity is limited. Given the huge and ever-growing volume of data in the storage devices of the storage network, the server has to load the data into main memory in several batches for the processor to handle. Data moves from the storage network into main memory much more slowly than the processor can process it, so this transfer inevitably becomes the bottleneck on processing speed. In addition, the amount of data imported from the storage network into server memory is clearly far greater than the amount returned from the server to the user terminal, and server main memory usually consists of dynamic random access memory (DRAM), which must be refreshed continuously to retain data, so a considerable amount of power is wasted as well.
Moreover, as the Internet becomes more widespread, more and more user terminals access the network, bringing with them a very large number of query requests. When many query requests arrive at the same time, limited network bandwidth causes network congestion, which greatly increases the time it takes for useful information to reach the user and degrades the user experience. At the same time, the many query requests produce a large volume of search results, which are usually kept in the server's main memory so that identical query requests can be answered immediately; this puts great pressure on the server's memory capacity. The traditional approach is to clear the search history and results out of memory in turn after a certain time, but then a later identical or similar query request requires the same query or search operation to be run against the storage network again. Furthermore, because data queries on a traditional data center server require both the processor and main memory, a large number of user query requests keeps the processor and main memory working without pause. All of these factors increase the power cost of the data center.
Summary of the invention
In view of the above problems in the prior art, a technical scheme for a data search system is now provided. It aims to make effective use of the server's idle periods to perform data searches, which not only saves the data center a substantial amount of power, reducing the service provider's operating cost, but also improves service quality.
The technical scheme specifically includes:
A data search system, applied in a server of a data center, including:
A first setting unit, for setting the search period in which data searches are performed; during the search period, the processor and main memory of the server are shut down and the server is offline;
A second setting unit, which lets the user set the search rule on which a data search is based; the search rule includes a search hint, composed of at least one keyword and/or key phrase, on which the data search relies;
A first data search unit, connected to the first setting unit and the second setting unit respectively, the first data search unit further including:
A rule acquisition module, which obtains the search rule relied on by the current data search;
A rule search module, connected to the rule acquisition module, for searching the storage network in the server according to the search rule within the search period and outputting the corresponding search result;
The first data search unit is located in the controller inside a storage device of the server.
Preferably, the data search system further includes:
A statistics unit, connected to the first setting unit, for compiling statistics on the normal working cycle of the server to obtain the idle period of the server;
The first setting unit sets the search period according to the idle period.
Preferably, the data search system further includes:
An input unit, connected to the first setting unit, through which the user inputs a setting instruction for setting the search period;
The first setting unit sets the search period according to the setting instruction.
Preferably, the data search system further includes:
A first storage unit, connected to the first data search unit, for generating a corresponding result document from the search result and saving it.
Preferably, in the data search system, the first data search unit includes:
A result comparison module, connected to the rule search module, for comparing the result document obtained by the current data search with the result document obtained by the previous data search and outputting the corresponding comparison result;
A first notification module, connected to the result comparison module, for sending a notification message to the user when the comparison result indicates that the result of the current data search contains updates.
Preferably, in the data search system, the first data search unit includes:
A result comparison module, connected to the rule search module, for comparing the result document obtained by the current data search with the result document obtained by the previous data search and outputting the corresponding comparison result;
A search reconstruction module, connected to the result comparison module and the rule search module respectively, for recombining the search hint in the search rule according to a preset rule when the comparison result indicates that the result of the current data search contains no updates, so as to form a reconstructed search hint;
A second notification module, connected to the search reconstruction module and the rule search module respectively, for providing the user with the result document formed from the search result obtained with the reconstructed search rule.
Preferably, in the data search system, a second storage unit is provided, connected to the first storage unit and the first data search unit respectively. The second storage unit stores the search rule relied on by each data search together with the result document extracted from the first storage unit, and the correspondence between the search rule and the result document is established in the second storage unit;
The data search system further includes a second data search unit, connected to the second storage unit;
The second data search unit includes:
A request acquisition module, for obtaining an externally input query request that includes the search rule;
A query module, connected to the request acquisition module, for performing a data search in the storage network according to the query request;
A rule judgment module, connected to the request acquisition module and the query module respectively, for looking up, according to the search rule relied on by the current data search, whether a matching search rule exists in the second storage unit, and outputting the corresponding judgment result;
The query module acts on the judgment result as follows:
When a matching search rule exists in the second storage unit, it directly extracts the result document corresponding to that search rule and outputs it as the search result of the current data search;
When no matching search rule exists in the second storage unit, it performs a data search with the search rule relied on by the current data search and outputs the corresponding search result.
Preferably, in the data search system, the second data search unit further includes:
A setting module, connected to the rule judgment module, for enabling or disabling the rule judgment module according to an externally input instruction.
Preferably, in the data search system, the data stored in the storage network of the server is held in multiple different user folders;
The data search system further includes:
A third data search unit, connected to the first setting unit, for performing a data search across the different user folders in the server while the server is offline, so as to deduplicate identical data held in different user folders;
The third data search unit is located in the controller inside a storage device of the server;
The third data search unit further includes:
A first search module, for performing a data search across the different user folders in the server while the server is offline, so as to find identical data in different user folders;
A data deduplication module, connected to the first search module, for retaining, according to the search result of the first search module, the data in one of the user folders that hold identical data and deleting the identical data in all the remaining user folders;
A link generation module, connected to the data deduplication module, for generating a corresponding access link under a user folder when the data deduplication module deletes data in that user folder;
After deduplication, among the multiple different user folders that held identical data, the only user folder whose copy was not deleted serves as the target folder, and the identical data concerned serves as the target data;
The access link points to the storage address of the target data in the target folder.
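The deduplication and access-link behavior described above can be sketched as follows. This is only an illustration under stated assumptions: the patent does not say how identical data is detected, so comparing SHA-256 digests is an assumption, and the nested dictionaries, the function name `deduplicate` and the link representation are hypothetical stand-ins for the storage network.

```python
import hashlib


def deduplicate(folders):
    """Keep one copy of identical data across user folders.

    folders maps folder name -> {file name -> bytes} and is modified
    in place. Returns the access links as a dict mapping the deleted
    (folder, file) to the retained (target folder, target file).
    Detecting identical data by SHA-256 digest is an assumption.
    """
    first_seen = {}  # digest -> (folder, file) of the retained copy
    links = {}
    for folder in sorted(folders):
        # sorted() copies the names, so deleting while looping is safe
        for name in sorted(folders[folder]):
            digest = hashlib.sha256(folders[folder][name]).hexdigest()
            target = first_seen.get(digest)
            if target is None:
                first_seen[digest] = (folder, name)
            elif target[0] != folder:
                # Identical data in a different user folder: delete it
                # and leave an access link to the retained copy.
                links[(folder, name)] = target
                del folders[folder][name]
    return links


folders = {
    "alice": {"a.txt": b"same", "b.txt": b"only-alice"},
    "bob": {"copy.txt": b"same"},
}
links = deduplicate(folders)
```

Here the copy in `alice` is retained as the target data, and `links` records that `bob/copy.txt` now refers to `alice/a.txt`.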
Preferably, in the data search system, the data stored in the storage network of the server is held in multiple different user folders;
The data search system further includes:
A fourth data search unit, connected to the first setting unit, for performing a data search across the different user folders in the server while the server is offline, so as to establish a correspondence between similar data held in different user folders;
The fourth data search unit is located in the controller inside a storage device of the server;
The fourth data search unit further includes:
A second search module, for performing a data search across the different user folders in the server while the server is offline, so as to find similar data in different user folders;
A marking module, connected to the second search module, for marking similar data in different user folders according to the search result of the second search module, so as to establish the correspondence between the similar data.
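One possible way to mark similar data across user folders is sketched below. The patent does not define how similarity is measured; using the `difflib.SequenceMatcher` ratio with a threshold of 0.8 is purely an assumption for illustration, as are the function name `mark_similar` and the dictionary layout.

```python
from difflib import SequenceMatcher
from itertools import combinations


def mark_similar(folders, threshold=0.8):
    """Return pairs of similar items that live in different folders.

    folders maps folder name -> {file name -> text}. Two items are
    treated as similar when their SequenceMatcher ratio reaches the
    threshold; both the measure and the threshold are assumptions.
    """
    items = [
        (folder, name, text)
        for folder in sorted(folders)
        for name, text in sorted(folders[folder].items())
    ]
    marks = []
    for (fa, na, ta), (fb, nb, tb) in combinations(items, 2):
        if fa != fb and SequenceMatcher(None, ta, tb).ratio() >= threshold:
            marks.append(((fa, na), (fb, nb)))
    return marks


folders = {
    "alice": {"r.txt": "quarterly report 2017"},
    "bob": {"r2.txt": "quarterly report 2016", "p.txt": "holiday photos"},
}
marks = mark_similar(folders)
```

Each returned pair plays the role of the correspondence that the marking module establishes between similar data.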
The benefit of the above technical scheme is that it provides a data search system that makes effective use of the server's idle periods to perform data searches, which not only saves the data center a substantial amount of power, reducing the service provider's operating cost, but also improves service quality.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall structure of a data search system in a preferred embodiment of the present invention;
Fig. 2 is a schematic diagram of the structure of the first data search unit in a preferred embodiment of the present invention;
Fig. 3 is a schematic diagram of the structure of the second data search unit in a preferred embodiment of the present invention;
Fig. 4 is a schematic diagram of the structure of the third data search unit in a preferred embodiment of the present invention;
Fig. 5 is a schematic diagram of the structure of the fourth data search unit in a preferred embodiment of the present invention;
Fig. 6 is a schematic diagram of the distribution of server peak periods and idle periods in a preferred embodiment of the present invention;
Fig. 7 is a schematic diagram of data deduplication implemented with the data search system in a preferred embodiment of the present invention.
Detailed description of the embodiments
The technical schemes in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the invention.
It should be noted that, where no conflict arises, the embodiments of the present invention and the features of those embodiments can be combined with one another.
The invention is further described below with reference to the accompanying drawings and specific embodiments, which are not to be taken as limiting the invention.
In view of the above problems in the prior art, a data search system is now provided. The data search system is applied in a server of a data center, and its structure, as shown in Fig. 1, includes:
A first setting unit 1, for setting the search period in which data searches are performed; during the search period, the processor and main memory of the server are shut down and the server is offline;
A second setting unit 2, which lets the user set the search rule on which a data search is based; the search rule includes a search hint, composed of at least one keyword and/or key phrase, on which the data search relies;
A first data search unit 3, connected to the first setting unit 1 and the second setting unit 2 respectively.
As shown in Fig. 2, the first data search unit further includes:
A rule acquisition module 31, which obtains the search rule relied on by the current data search;
A rule search module 32, connected to the rule acquisition module 31, for searching the storage network in the server according to the search rule within the search period and outputting the corresponding search result;
The first data search unit is located in the controller inside a storage device of the server.
Specifically, in this embodiment, the first data search unit is located in the controller inside a storage device of the server (such as an HDD or SSD). That is, the data in the storage network is searched directly by the controller inside the storage device, rather than by the processor and main memory as in the prior art.
The data search can therefore be carried out during periods when the network is not congested (that is, the server's idle periods). In these periods the server is generally offline. During these periods (typically at night), the processor (CPU) and main memory in the server can be shut down, and the controller in the storage device is used to search the data in the storage network.
Although the controller in the storage device has lower performance than the server's processor and processes data more slowly, the server is idle and offline at this time, so the controller has plenty of time to carry out the search, and its performance disadvantage has little effect on the user experience.
In this embodiment, the search period can be set with the first setting unit 1; the search period is an idle period of the server (for example, midnight to 6 a.m.). How the search period is set is described in detail below.
In this embodiment, the search rule can be set with the second setting unit 2. The search rule may include a search hint, composed of at least one keyword and/or key phrase, on which the data search relies. Specifically, the search hint may be a combination of multiple keywords and/or key phrases. The search rule may further include the period in which the user wants the search to run, which must of course fall within the preset search period.
For example, a user can, according to their own needs and interests, set the combination of keywords and/or key phrases to be queried automatically and the query frequency (for example daily, weekly or monthly), and can set the start time and end time of the query as needed. All of this information is included in the search rule. Once the search rule has been set and submitted, the first data search unit 3 automatically starts searching for related topic information, web pages and similar content during the search period according to the search rule, and finally outputs the search result.
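The contents of a search rule as described above (search hint, query frequency, start and end time) can be pictured with a small data structure. This is a minimal sketch: the class name `SearchRule`, its field names and the helper `within_period` are hypothetical, and the check assumes a search period that does not cross midnight.

```python
from dataclasses import dataclass
from datetime import time
from typing import Tuple


@dataclass(frozen=True)
class SearchRule:
    """Hypothetical container for the information in a search rule."""
    hints: Tuple[str, ...]      # keywords and/or key phrases
    frequency: str = "daily"    # e.g. "daily", "weekly", "monthly"
    start: time = time(0, 0)    # requested start of the search
    end: time = time(6, 0)      # requested end of the search


def within_period(rule, period_start, period_end):
    """True if the rule's window lies inside the preset search period.

    Assumes the search period does not cross midnight.
    """
    return period_start <= rule.start < rule.end <= period_end


rule = SearchRule(hints=("air conditioner", "smart"), frequency="weekly")
late = SearchRule(hints=("air conditioner",), start=time(5, 0), end=time(7, 0))
```

With a search period of midnight to 6 a.m., `rule` fits inside the period while `late` does not, which mirrors the requirement that the user's requested period fall within the preset search period.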
In a preferred embodiment of the present invention, still as shown in Fig. 1, the data search system further includes:
A statistics unit 4, connected to the first setting unit 1, for compiling statistics on the normal working cycle of the server to obtain the idle period of the server;
The first setting unit 1 sets the search period according to the idle period.
Specifically, in this embodiment, the statistics unit 4 collects data and compiles statistics during the server's normal working cycle. It can collect the running state of the server's processor and main memory and the overall network bandwidth usage of the server, and derive the server's idle period from these statistics. During the idle period the server receives few user query requests and network bandwidth usage is low (that is, the network is not congested), so the server's processor and main memory can be shut down during the idle period without affecting the server's normal operation. Different servers may have different idle periods, so the statistics unit 4 compiles statistics separately for each server to obtain its idle period, which the first setting unit 1 can then set as the search period. The contrast between a server's peak periods and idle periods is shown in general terms in Fig. 6.
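How an idle period might be derived from collected statistics can be sketched as follows. The patent does not specify the statistical method; picking the longest run of hours whose load stays below a threshold is an assumption, and the function name `idle_period`, the hourly sampling and the threshold value are hypothetical.

```python
def idle_period(hourly_load, threshold):
    """Return (start_hour, end_hour) of the longest low-load run.

    hourly_load holds one load figure per hour (for example combined
    processor and bandwidth utilisation); end_hour is exclusive.
    Returns None when no hour is below the threshold. Runs that wrap
    around midnight are not merged in this simplified sketch.
    """
    best = None
    run_start = None
    # A sentinel equal to the threshold closes a run that reaches the end.
    for hour, load in enumerate(list(hourly_load) + [threshold]):
        if load < threshold:
            if run_start is None:
                run_start = hour
        elif run_start is not None:
            if best is None or hour - run_start > best[1] - best[0]:
                best = (run_start, hour)
            run_start = None
    return best


# Hypothetical day: quiet from 0:00 to 6:00, busy afterwards.
load = [5] * 6 + [50] * 16 + [20] * 2
period = idle_period(load, threshold=10)
```

For this sample day the function yields hours 0 to 6, matching the midnight to 6 a.m. example given for the search period.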
In another preferred embodiment of the present invention, still as shown in Fig. 1, the data search system further includes:
An input unit 5, connected to the first setting unit 1, through which the user inputs a setting instruction for setting the search period;
The first setting unit 1 sets the search period according to the setting instruction.
Specifically, in this embodiment, the input unit 5 lets the user set the search period manually.
The statistics unit 4 and the input unit 5 are both shown in Fig. 1. In different embodiments of the present invention, either one of the statistics unit 4 and the input unit 5 can be used.
In a preferred embodiment of the present invention, still as shown in Fig. 1, the data search system further includes:
A first storage unit 6, connected to the first data search unit 3, for generating a corresponding result document from the search result and saving it.
Specifically, in this embodiment, the search result obtained by the first data search unit 3 is formed into a corresponding document and saved in the first storage unit 6. The first storage unit 6 is a storage device (HDD or SSD) in the server. In other words, the data search process of the present invention requires no participation from the processor or main memory and can be completed by the storage device and the controller inside it alone.
In a preferred embodiment of the present invention, as shown in Fig. 2, the first data search unit 3 further includes:
A result comparison module 33, connected to the rule search module 32, for comparing the result document obtained by the current data search with the result document obtained by the previous data search and outputting the corresponding comparison result;
A first notification module 34, connected to the result comparison module 33, for sending a notification message to the user when the comparison result indicates that the result of the current data search contains updates.
Specifically, in this embodiment, after each data search ends and the result document for that search has been generated from the search result, the result comparison module 33 compares the result document of the current data search with that of the previous data search to determine whether the search result for the same topic or web page contains new information:
If there is new information, it is organized into a new document and saved in the first storage unit 6, and the document is then sent to the user as a notification message, which can be pushed by e-mail, SMS or instant messaging software, for example. The push must wait until the search period has ended (that is, until the server's processor and main memory have resumed work).
If there is no new information, the server does not need to produce a new document or notify the user. Of course, the server and the user should agree in advance that when the server pushes no notification message, the search result contains no new information. Alternatively, the user can configure in advance that the server should still report the search result when there is no new information; that is, the notification behavior can be changed according to the user's settings.
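The comparison and notification logic above can be sketched as follows. It is a simplified illustration: result documents are modeled as lists of entries, and the function names `new_entries` and `notification` and the flag `notify_when_unchanged` are hypothetical. Delivery by e-mail, SMS or instant messaging is outside the sketch.

```python
def new_entries(current, previous):
    """Entries of the current result document absent from the previous one."""
    seen = set(previous)
    return [entry for entry in current if entry not in seen]


def notification(current, previous, notify_when_unchanged=False):
    """Decide what, if anything, to push to the user.

    Returns a dict describing the message, or None when nothing is
    sent. notify_when_unchanged models the user's advance setting to
    be informed even when there is no new information.
    """
    fresh = new_entries(current, previous)
    if fresh:
        return {"updated": True, "entries": fresh}
    if notify_when_unchanged:
        return {"updated": False, "entries": []}
    return None


previous_doc = ["page-1", "page-2"]
current_doc = ["page-1", "page-2", "page-3"]
message = notification(current_doc, previous_doc)
```

In the scheme described, the returned message would be held until the search period ends and then pushed to the user.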
With the above arrangement, the data search system of this technical scheme not only spares the user the trouble of repeatedly entering information to run the corresponding query, but also reduces the power consumption and cost of the data center server.
In a preferred embodiment of the present invention, the result comparison module 33 is connected to the rule search module 32 and compares the result document obtained by the current data search with the result document obtained by the previous data search, outputting the corresponding comparison result.
Still as shown in Fig. 1, the first data search unit 3 further includes:
A search reconstruction module 35, connected to the result comparison module 33 and the rule search module 32 respectively, for recombining the search hint in the search rule according to a preset rule when the comparison result indicates that the result of the current data search contains no updates, so as to form a reconstructed search hint;
A second notification module 36, connected to the search reconstruction module 35 and the rule search module 32 respectively, for providing the user with the result document formed from the search result obtained with the reconstructed search rule.
Specifically, in this embodiment, when the current data search yields no new information relative to the previous one, a fuzzy search mechanism is additionally provided. Under this mechanism, the search reconstruction module 35 recombines the search hint in the search rule to form a reconstructed search hint, and the search is run again according to the search rule containing the reconstructed search hint. Recombining the search hint means rearranging and recombining the keywords and/or key phrases in it to form a new search hint. The new search hint is related to the previous one to some degree but is not identical to it, so restarting the data search with the new search hint widens the search scope.
In this embodiment, the reconstruction of the search hint may proceed as follows: a random number generator produces a set of random numbers, which is then used to recombine the keywords and/or key phrases in the search hint, or to delete individual keywords from it, forming the reconstructed search hint.
For example, suppose the content the user wants to search for (that is, the search hint composed of keywords) is "air conditioner, smart, variable speed, brand 1, brand 2". If a search produces no updated result compared with the previous search, "air conditioner" is encoded as 1, "smart" as 2, "variable speed" as 3, "brand 1" as 4 and "brand 2" as 5, and a random number generator produces N (N < 5) random numbers, each in the range 1 to 5. If, for instance, the random numbers produced are 1, 3, 4 and 5, the search hint reconstructed from this set of random numbers is "air conditioner, variable speed, brand 1, brand 2".
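The worked example above can be expressed as a short sketch. The names `select_by_codes` and `reconstruct_hints` are hypothetical, and drawing the N codes without repetition and keeping them in ascending order are assumptions; the text only states that N random numbers in the range 1 to 5 are produced.

```python
import random


def select_by_codes(hints, codes):
    """Build a search hint from 1-based keyword codes."""
    return [hints[code - 1] for code in codes]


def reconstruct_hints(hints, rng=None):
    """Form a reconstructed search hint from a random subset of keywords.

    Draws N distinct codes with N smaller than the number of keywords,
    so the result always differs from the original hint. Sampling
    without repetition is an assumption of this sketch.
    """
    if rng is None:
        rng = random.Random()
    if len(hints) < 2:
        return list(hints)
    n = rng.randint(1, len(hints) - 1)
    codes = sorted(rng.sample(range(1, len(hints) + 1), n))
    return select_by_codes(hints, codes)


hints = ["air conditioner", "smart", "variable speed", "brand 1", "brand 2"]
example = select_by_codes(hints, [1, 3, 4, 5])
rebuilt = reconstruct_hints(hints, random.Random(0))
```

`example` reproduces the case from the text, where the codes 1, 3, 4 and 5 drop the keyword "smart"; `rebuilt` is some other randomly chosen subset of the original keywords.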
In this embodiment, after the new search has been run, a new document is generated from the search result and saved in the first storage unit 6, and the document is then sent to the user as a notification message. The delivery method and timing follow the description of the first notification module 34 above and are not repeated here.
In a preferred embodiment of the present invention, still as shown in Fig. 1, the data search system is provided with a second storage unit 7 connected to the first storage unit 6 and the first data search unit 3 respectively. The second storage unit 7 stores the search rule relied on by each data search and the result document extracted from the first storage unit 6, and the correspondence between search rule and result document is established in the second storage unit 7;
The data search system further includes a second data search unit 8, connected to the second storage unit 7;
As shown in Fig. 3, the second data search unit 8 includes:
A request acquisition module 81, for obtaining an externally input query request that includes the search rule;
A query module 82, connected to the request acquisition module 81, for performing a data search in the storage network according to the query request;
A rule judgment module 83, connected to the request acquisition module 81 and the query module 82 respectively, for looking up, according to the search rule relied on by the current data search, whether a matching search rule exists in the second storage unit, and outputting the corresponding judgment result;
The query module 82 acts on the judgment result as follows:
When a matching search rule exists in the second storage unit 7, it directly extracts the result document corresponding to that search rule and outputs it as the search result of the current data search;
When no matching search rule exists in the second storage unit 7, it performs a data search with the search rule relied on by the current data search and outputs the corresponding search result.
Specifically, in one embodiment of the present invention, the above second data search unit 8 may be arranged in the processor of the server, which means that its data search operations are performed outside the above search period (i.e. while the processor and internal memory are working normally). In another embodiment of the present invention, the second data search unit 8 may instead be arranged in the controller of the memory, which means that its data search operations are performed within the above search period (i.e. while the processor and internal memory are shut down).
In the present embodiment, the above second memory cell 7 may be the internal memory of the server, a nonvolatile memory, or another memory in the storage network. The result document obtained after each data search performed by the first data search unit 3 within the search period is not only saved in the first memory cell 6 but is also saved in the second memory cell 7 when the processor and internal memory of the server resume normal operation. In addition, the second memory cell 7 is connected to the first data search unit 3: when the processor and internal memory start working normally, the first data search unit 3 transfers the search rule relied upon by each search to the second memory cell 7 for storage, and the correspondence between search rules and result documents is established in the second memory cell 7.
Specifically, after the request acquisition module 81 in the second data search unit 8 receives a query request, the rule judgment module 83 parses the corresponding search rule (i.e. the combination of key characters/words) from the query request and, according to this search rule, checks whether a matching search rule exists in the second memory cell 7:
If one exists, the search rule relied upon by the current data search has been searched by someone before, and the corresponding result document is extracted directly and pushed to the user;
If none exists, the search rule relied upon by the current data search has not been used by anyone before, and a normal data search is started directly according to that search rule.
In the present embodiment, search rules are said to match when they are identical or similar; more precisely, when the key characters/words in the search rules are identical or similar. For example, two search rules match when the key characters/words they contain are identical (regardless of the order of the key characters/words), or when the key characters/words they contain differ only by the addition or removal of a few characters/words.
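The lookup and matching behaviour described above can be sketched in Python as follows. This is a hedged illustration, not the patented implementation: the names `RuleCache`, `remember`, `rules_match` and `max_changes` are hypothetical, the list of entries merely stands in for the second memory cell, and "a few" added or removed key words is assumed here to mean at most one.

```python
def normalize(rule):
    """Treat a search rule as an order-independent set of key words."""
    return frozenset(word.strip().lower() for word in rule)


def rules_match(rule_a, rule_b, max_changes=1):
    """Rules match if their key words are identical (in any order) or
    differ by at most max_changes added or removed key words.
    The threshold of 1 is an assumption."""
    return len(normalize(rule_a) ^ normalize(rule_b)) <= max_changes


class RuleCache:
    """Stand-in for the second memory cell plus the rule judgment step."""

    def __init__(self, search_fn):
        self.search_fn = search_fn  # performs a normal data search
        self.entries = []           # (search rule, result document) pairs

    def remember(self, rule, result_document):
        """Store a rule together with its result document."""
        self.entries.append((normalize(rule), result_document))

    def query(self, rule):
        # Reuse a stored result document when a matching rule exists.
        for saved_rule, result_document in self.entries:
            if rules_match(rule, saved_rule):
                return result_document
        # Otherwise fall back to a normal data search.
        return self.search_fn(rule)
```

With a rule such as `["air-conditioning", "intelligence"]` remembered, a query for `["intelligence", "air-conditioning"]` would return the stored document without invoking `search_fn`, which is the saving in repeated searches that the description aims at.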
In the technical solution of the present invention, the above search method spares the server of the data center from repeatedly searching the storage network for related content, thereby saving power.
In a preferred embodiment of the present invention, still as shown in Fig. 3, the above second data search unit 8 further includes:
A setting module 84, connected to the rule judgment module 83 and used to turn the rule judgment module 83 on or off according to an externally input instruction.
Specifically, through the above setting module 84 the user can decide whether a matching operation should be performed before a data search:
If the user inputs an instruction that causes the setting module 84 to turn on the rule judgment module 83, the user wishes the matching operation to be performed before the data search, and the second data search unit 8 matches the search rule and performs the corresponding data search processing according to the matching process described above.
If the user inputs an instruction that causes the setting module 84 to turn off the rule judgment module 83, the user wishes the data search to be performed directly, and the second data search unit 8 performs the data search directly according to the search rule included in the query request input by the user.
In a preferred embodiment of the present invention, the data stored in the storage network of the above server are contained in multiple different user folders;
Then, still as shown in Fig. 1, the above data search system further includes:
A third data search unit 9, connected to the first setup unit 1 and used to perform a data search in the different user folders in the server when the server is in the off-line state, so as to deduplicate identical data in different user folders.
The above third data search unit 9 is arranged in the controller inside the memory of the server, i.e. the third data search unit 9 likewise realizes its function within the search period by means of the controller in the memory.
Further, as shown in Fig. 4, the above third data search unit 9 includes:
A first search module 91, used to perform a data search in the different user folders in the server when the server is in the off-line state, so as to find identical data in different user folders;
A data deduplication module 92, connected to the first search module 91 and used, according to the search result of the first search module 91, to retain the data in one of the multiple user folders that hold identical data and to delete the identical data in all the remaining user folders;
A link generation module 93, connected to the data deduplication module 92 and used to generate a corresponding access link under a user folder when the data deduplication module 92 deletes data in that user folder;
After the data deduplication processing, among the multiple different user folders that held identical data, only one user folder still contains the identical data; this user folder serves as the destination folder, and the identical data retained in it serves as the target data;
The above access link points to the storage address of the target data in the destination folder.
Specifically, in the present embodiment, when the server is in an idle period (and therefore in the off-line state), the controller in the memory automatically performs a comprehensive search of the data stored in the storage network in order to deduplicate the data in the storage network. This specifically includes:
First, a first search module 91 searches the whole storage space to determine whether identical data exist in different user folders in the storage space, and outputs the search result.
Then, according to the above search result, a data deduplication module 92 retains the identical data in only one of the different user folders that hold identical data and deletes the identical data in all the remaining user folders; the user folder whose identical data is retained is regarded as the destination folder, and the retained identical data is regarded as the target data.
Finally, a link generation module 93 generates an access link at the corresponding storage location in each user folder whose identical data was deleted. The access link points to the storage address at which the target data is stored in the destination folder, so the user can access the target data directly through the access link. After the data deduplication processing, only one user folder in the whole storage network stores the target data, and the remaining user folders no longer store data identical to the target data in the destination folder.
In a preferred embodiment of the present invention, the general flow of the above data deduplication processing is shown in Fig. 7. In Fig. 7, a user folder named "User 1" stores file A, file B, file C and other files, and another user folder named "User 2" stores file A, file Y, file Z and other files. The data search finds that both user folders store the same data, "file A". File A in User 1 is therefore retained (User 1 is regarded as the destination folder and file A as the target data), file A in User 2 is deleted, and a new access link is generated at the storage location in User 2 where file A was originally stored; the access link links directly to the storage address of file A in User 1.
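The deduplication flow of Fig. 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: user folders are modelled as in-memory dictionaries, identical data is detected by comparing SHA-256 digests (the description does not say how identical data is recognized), the first folder encountered is kept as the destination folder, and the names `AccessLink`, `deduplicate` and `read` as well as the folder names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessLink:
    """Points to the target data kept in the destination folder."""
    folder: str
    name: str


def deduplicate(folders):
    """folders: {folder name: {file name: bytes or AccessLink}}.

    Keeps the first copy of each piece of data and replaces identical
    copies in other folders with access links (modifies in place).
    """
    first_seen = {}  # content digest -> (destination folder, file name)
    for folder_name, files in folders.items():
        for file_name, data in list(files.items()):
            if isinstance(data, AccessLink):
                continue  # already deduplicated
            digest = hashlib.sha256(data).hexdigest()
            if digest not in first_seen:
                first_seen[digest] = (folder_name, file_name)
                continue
            target_folder, target_name = first_seen[digest]
            if target_folder != folder_name:
                # Delete the duplicate and leave an access link behind.
                files[file_name] = AccessLink(target_folder, target_name)
    return folders


def read(folders, folder, name):
    """Read a file, following an access link if one is stored."""
    data = folders[folder][name]
    if isinstance(data, AccessLink):
        return folders[data.folder][data.name]
    return data


folders = {
    "folder_1": {"file A": b"same bytes", "file B": b"b"},
    "folder_2": {"file A": b"same bytes", "file Y": b"y"},
}
deduplicate(folders)
```

After the call, `folder_2` holds an access link in place of file A, and `read(folders, "folder_2", "file A")` still returns the data stored once in `folder_1`.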
In a preferred embodiment of the present invention, the data stored in the storage network of the above server are contained in multiple different user folders;
Then, still as shown in Fig. 1, the above data search system further includes:
A fourth data search unit 10, connected to the first setup unit 1 and used to perform a data search in the different user folders in the server when the server is in the off-line state, so as to establish correspondences between close data in different user folders;
The above fourth data search unit 10 is arranged in the controller inside the memory of the server, i.e. the fourth data search unit 10 likewise realizes its function within the search period by means of the controller in the memory.
Further, as shown in Fig. 5, the above fourth data search unit 10 includes:
A second search module 101, used to perform a data search in the different user folders in the server when the server is in the off-line state, so as to find close data in different user folders. Whether data are close can be judged against a preset similarity. For example, a standard data similarity of 40% or 50% is set in advance, and two pieces of data are considered close when the similarity between them exceeds the standard data similarity. In the present embodiment, each piece of data or file has its own attributes, such as author, content relevance, field and included keywords, so the above data similarity can be determined by judging how similar the attributes of the data are. For example, if the data have four classes of attributes and two classes of attributes are identical between two pieces of data, the data similarity between the two pieces of data can be considered to be 50%.
A mark module 102, connected to the second search module 101 and used, according to the search result of the second search module 101, to mark close data in different user folders so as to establish the correspondences between close data.
Specifically, in the present embodiment, when the server is in an idle period (and therefore in the off-line state), the controller in the memory automatically performs a comprehensive search of the data stored in the storage network in order to build an association database between the different data in the storage network.
First, a second search module 101 searches all the data in the storage network to find close data in the storage network, and outputs the corresponding search result.
Then a mark module 102 marks the close data to establish relations between them, i.e. a relational database of the data stored in the storage network is built through data search and marking. Once the relational database has been built, more similar information can be offered for the user to consult and choose from when the user subsequently queries information.
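The attribute-based similarity judgment and the marking step can be sketched in Python as follows. The sketch is illustrative only: the four attribute names, the function names and the folder and file names are hypothetical, the 40% standard is one of the two example values given in the description, and only data in different user folders are compared.

```python
from itertools import combinations

# Hypothetical names for the four attribute classes.
ATTRIBUTES = ("author", "content", "field", "keywords")


def similarity(a, b, attributes=ATTRIBUTES):
    """Fraction of attribute classes that are identical in a and b."""
    same = sum(1 for key in attributes
               if a.get(key) is not None and a.get(key) == b.get(key))
    return same / len(attributes)


def mark_close_data(items, standard=0.4):
    """items: {(folder, name): attribute dict}.

    Returns the pairs from different folders whose similarity exceeds
    the standard data similarity.
    """
    relations = []
    for (id_a, a), (id_b, b) in combinations(items.items(), 2):
        if id_a[0] == id_b[0]:
            continue  # only relate data in different user folders
        if similarity(a, b) > standard:
            relations.append((id_a, id_b))
    return relations


items = {
    ("folder_1", "report"): {"author": "x", "content": "c1",
                             "field": "storage", "keywords": "k1"},
    ("folder_2", "notes"): {"author": "x", "content": "c2",
                            "field": "storage", "keywords": "k2"},
    ("folder_2", "other"): {"author": "y", "content": "c3",
                            "field": "network", "keywords": "k3"},
}
relations = mark_close_data(items)
```

Here "report" and "notes" share two of four attribute classes, giving a similarity of 50%, which exceeds the 40% standard, so the pair is recorded as related; the recorded pairs play the role of the relational database consulted in later queries.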
The foregoing describes only preferred embodiments of the present invention and does not thereby limit the embodiments or the scope of protection of the present invention. Those skilled in the art should appreciate that all schemes obtained by equivalent substitution or obvious variation based on the description and drawings of the present invention fall within the scope of protection of the present invention.

Claims (10)

1. A data search system, applied in a server of a data center, characterized in that it includes:
A first setup unit, used to set a search period in which data searches are performed, wherein within the search period the processor and internal memory of the server are in a shut-down state and the server is in an off-line state;
A second setup unit, provided for a user to set the search rule on which a data search is based, the search rule including search hints that are relied upon by the data search and composed of at least one keyword and/or key phrase;
A first data search unit, connected respectively to the first setup unit and the second setup unit, the first data search unit further including:
A rule acquisition module, which obtains the search rule relied upon by the current data search;
A rule search module, connected to the rule acquisition module and used, within the search period, to search the storage network in the server according to the search rule and to output the corresponding search result;
The first data search unit is arranged in a controller inside the memory of the server.
2. The data search system as claimed in claim 1, characterized in that it further includes:
A statistic unit, connected to the first setup unit and used to compile statistics on the normal working cycle of the server so as to obtain the idle period of the server;
The first setup unit sets the search period according to the idle period.
3. The data search system as claimed in claim 1, characterized in that it further includes:
An input unit, connected to the first setup unit and provided for a user to input a setting instruction for setting the search period;
The first setup unit sets the search period according to the setting instruction.
4. The data search system as claimed in claim 1, characterized in that it further includes:
A first memory cell, connected to the first data search unit and used to generate and save a corresponding result document according to the search result.
5. The data search system as claimed in claim 4, characterized in that the first data search unit includes:
A result comparison module, connected to the rule search module and used to compare the result document obtained by the current data search with the result document obtained by the previous data search and to output the corresponding comparison result;
A first notification module, connected to the result comparison module and used to send a notification message to the user when the comparison result indicates that the result of the current data search has been updated.
6. The data search system as claimed in claim 4, characterized in that the first data search unit includes:
A result comparison module, connected to the rule search module and used to compare the result document obtained by the current data search with the result document obtained by the previous data search and to output the corresponding comparison result;
A search reconstruction module, connected respectively to the result comparison module and the rule search module and used, when the comparison result indicates that the result of the current data search has not been updated, to recombine the search hints in the search rule according to a preset rule so as to form reconstructed search hints;
A second notification module, connected respectively to the search reconstruction module and the rule search module and used to provide the user with the result document formed from the search result obtained by searching according to the reconstructed search rule, for the user to view.
7. The data search system as claimed in claim 4, characterized in that a second memory cell is provided, connected respectively to the first memory cell and the first data search unit, the second memory cell being used to store the search rule relied upon by each data search and the result document extracted from the first memory cell, and the correspondence between the search rule and the result document being established in the second memory cell;
The data search system further includes a second data search unit, the second data search unit being connected to the second memory cell;
The second data search unit further includes:
A request acquisition module, used to obtain an externally input query request that includes the search rule;
A query module, connected to the request acquisition module and used to perform a data search in the storage network according to the query request;
A rule judgment module, connected respectively to the request acquisition module and the query module and used to look up, according to the search rule relied upon by the current data search, whether a matching search rule exists in the second memory cell, and to output the corresponding judgment result;
The query module is used, according to the judgment result:
When a matching search rule exists in the second memory cell, to extract the result document corresponding to the search rule directly and output it as the search result of the current data search;
When no matching search rule exists in the second memory cell, to perform a data search using the search rule relied upon by the current data search and to output the corresponding search result.
8. The data search system as claimed in claim 7, characterized in that the second data search unit further includes:
A setting module, connected to the rule judgment module and used to turn the rule judgment module on or off according to an externally input instruction.
9. The data search system as claimed in claim 1, characterized in that the data stored in the storage network of the server are contained in multiple different user folders;
The data search system further includes:
A third data search unit, connected to the first setup unit and used to perform a data search in the different user folders in the server when the server is in the off-line state, so as to deduplicate identical data in the different user folders;
The third data search unit is arranged in a controller inside the memory of the server;
The third data search unit further includes:
A first search module, used to perform a data search in the different user folders in the server when the server is in the off-line state, so as to find identical data in the different user folders;
A data deduplication module, connected to the first search module and used, according to the search result of the first search module, to retain the data in one of the multiple user folders that hold identical data and to delete the identical data in all the remaining user folders;
A link generation module, connected to the data deduplication module and used to generate a corresponding access link under a user folder when the data deduplication module deletes data in that user folder;
After the data deduplication processing, among the multiple different user folders that held identical data, only one user folder still contains the identical data; this user folder serves as the destination folder, and the identical data retained in it serves as the target data;
The access link points to the storage address of the target data in the destination folder.
10. The data search system as claimed in claim 1, characterized in that the data stored in the storage network of the server are contained in multiple different user folders;
The data search system further includes:
A fourth data search unit, connected to the first setup unit and used to perform a data search in the different user folders in the server when the server is in the off-line state, so as to establish correspondences between close data in the different user folders;
The fourth data search unit is arranged in a controller inside the memory of the server;
The fourth data search unit further includes:
A second search module, used to perform a data search in the different user folders in the server when the server is in the off-line state, so as to find close data in the different user folders;
A mark module, connected to the second search module and used, according to the search result of the second search module, to mark close data in the different user folders so as to establish the correspondences between close data.
CN201710332855.8A 2017-05-12 2017-05-12 Data search system Active CN107169085B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710332855.8A CN107169085B (en) 2017-05-12 2017-05-12 Data search system

Publications (2)

Publication Number Publication Date
CN107169085A true CN107169085A (en) 2017-09-15
CN107169085B CN107169085B (en) 2020-12-01

Family

ID=59815916

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710332855.8A Active CN107169085B (en) 2017-05-12 2017-05-12 Data search system

Country Status (1)

Country Link
CN (1) CN107169085B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100057802A1 (en) * 2001-11-30 2010-03-04 Micron Technology, Inc. Method and system for updating a search engine
CN102081626A (en) * 2009-11-30 2011-06-01 中国移动通信集团北京有限公司 Data inquiring method and data inquiring server
US20110145181A1 (en) * 2006-12-08 2011-06-16 Ashish Pandya 100gbps security and search architecture using programmable intelligent search memory (prism) that comprises one or more bit interval counters
CN105320569A (en) * 2015-11-04 2016-02-10 浪潮(北京)电子信息产业有限公司 Method and system of improving database server performance

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107743101A (en) * 2017-09-26 2018-02-27 杭州迪普科技股份有限公司 The retransmission method and device of a kind of data
CN107743101B (en) * 2017-09-26 2020-10-09 杭州迪普科技股份有限公司 Data forwarding method and device
CN111506818A (en) * 2020-04-22 2020-08-07 中国民航信息网络股份有限公司 Flight data processing method and device

Also Published As

Publication number Publication date
CN107169085B (en) 2020-12-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant