CN107169085A - A kind of data search system - Google Patents
- Publication number: CN107169085A
- Application number: CN201710332855.8A
- Authority: CN (China)
- Prior art keywords: search, data, module, rule, server
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data search system and belongs to the field of data retrieval technology. The system includes: a first setting unit for setting a search period, during which the processor and memory of the server are switched off and the server is offline; a second setting unit for setting the search rule on which a data search is based; and a first data search unit, which further includes a rule acquisition module that obtains the search rule for the current data search, and a rule search module, connected to the rule acquisition module, that searches the storage network in the server according to the search rule within the search period and outputs the corresponding search result. The first data search unit is arranged in a controller inside a storage device of the server. The beneficial effect of this technical solution is that data searches are carried out during the server's idle periods, which not only saves the data center a substantial amount of power and thereby lowers the service provider's operating cost, but also improves service quality.
Description
Technical field
The present invention relates to the field of data retrieval technology, and in particular to a data search system.
Background
With the arrival of the cloud era, big data technology is used more and more in daily life. The term "big data" is commonly used to describe and define the massive volumes of data generated in the age of information explosion, together with the associated technological development and innovation. As data volumes grow rapidly, how to query or search a huge data set for the valuable data a user needs has become a problem of general significance.
In the prior art, a user typically sends a data query request to the network from a personal computer or other user terminal. After receiving the query request, a server in the data center queries or searches the storage network for the data the user needs. For such a server, a large share of its workload consists of querying or searching massive storage networks for user data. At present, the processor in a data center server can directly process only data held in memory. In other words, data stored in the storage devices of the storage network must first be transferred into the server's memory before the processor can query or search it according to the request and finally return the result to the user terminal. However, server memory capacity is relatively limited. Given the huge and ever-growing volume of data held in the storage devices of the storage network, the server can only load the data into memory in several batches for the processor to handle. Moreover, data moves from the storage network into memory slowly relative to the speed at which the processor handles it, so this transfer inevitably becomes the bottleneck on processing speed. In addition, the volume of data imported from the storage network into server memory is clearly far larger than the volume returned from the server to the user terminal, and server memory usually consists of dynamic random access memory (DRAM), which must be refreshed continuously to retain data, so a considerable share of power is wasted as well.
Furthermore, as the Internet becomes more widespread, more and more user terminals connect to the network, bringing with them an enormous number of query requests. When many query requests arrive at the same time, limited network bandwidth causes congestion, so that useful information takes much longer to reach the user, which degrades the user experience. At the same time, numerous query requests produce a huge volume of search results. These results are usually retained in server memory so that identical query requests can be answered immediately, which places a heavy demand on memory capacity. The traditional approach is to purge the search history and results from memory in turn after a certain time, but then, when an identical or similar query request is handled later, the same query or search operation must be run again on the storage network. Moreover, because the server of a traditional data center needs both the processor and the memory to take part in a data query, a large number of user query requests keeps the processor and memory working without pause. All of the above increases the power cost of the data center.
Summary of the invention
In view of the above problems in the prior art, a technical solution for a data search system is provided. It aims to carry out data searches during the server's idle periods, which not only saves the data center a substantial amount of power and thereby lowers the service provider's operating cost, but also improves service quality.
The technical solution specifically includes:
A data search system, applied in a server of a data center, including:
a first setting unit for setting the search period in which data searches are performed, wherein during the search period the processor and memory of the server are switched off and the server is offline;
a second setting unit for allowing a user to set the search rule on which a data search is based, the search rule including a search hint composed of at least one keyword and/or key phrase;
a first data search unit, connected to the first setting unit and to the second setting unit, the first data search unit further including:
a rule acquisition module that obtains the search rule on which the current data search relies;
a rule search module, connected to the rule acquisition module, for searching the storage network in the server according to the search rule within the search period and outputting the corresponding search result;
wherein the first data search unit is arranged in a controller inside a storage device of the server.
Preferably, the data search system further includes:
a statistics unit, connected to the first setting unit, for collecting statistics over the normal working cycle of the server to obtain the server's idle period;
wherein the first setting unit sets the search period according to the idle period.
Preferably, the data search system further includes:
an input unit, connected to the first setting unit, through which a user enters a setting instruction for setting the search period;
wherein the first setting unit sets the search period according to the setting instruction.
Preferably, the data search system further includes:
a first storage unit, connected to the first data search unit, for generating a corresponding result document from the search result and saving it.
Preferably, in the data search system, the first data search unit includes:
a result comparison module, connected to the rule search module, for comparing the result document obtained by the current data search with the result document obtained by the previous data search and outputting the corresponding comparison result;
a first notification module, connected to the result comparison module, for sending a notification message to the user when the comparison result indicates that the result of the current data search has been updated.
Preferably, in the data search system, the first data search unit includes:
a result comparison module, connected to the rule search module, for comparing the result document obtained by the current data search with the result document obtained by the previous data search and outputting the corresponding comparison result;
a search reconstruction module, connected to the result comparison module and to the rule search module, for recombining the search hint in the search rule according to a preset rule when the comparison result indicates that the result of the current data search has not been updated, so as to form a reconstructed search hint;
a second notification module, connected to the search reconstruction module and to the rule search module, for providing the user with the result document formed from the search result obtained by searching according to the reconstructed search rule.
Preferably, in the data search system, a second storage unit is provided, connected to the first storage unit and to the first data search unit. The second storage unit saves the search rule on which each data search relied and the result documents extracted from the first storage unit, and the correspondence between search rules and result documents is established in the second storage unit;
the data search system further includes a second data search unit connected to the second storage unit;
the second data search unit further includes:
a request acquisition module for obtaining an externally input query request that includes a search rule;
a query module, connected to the request acquisition module, for performing a data search in the storage network according to the query request;
a rule judgment module, connected to the request acquisition module and to the query module, for looking up, according to the search rule on which the current data search relies, whether a matching search rule exists in the second storage unit, and outputting the corresponding judgment result;
wherein the query module is configured, according to the judgment result:
when a matching search rule exists in the second storage unit, to directly extract the result document corresponding to that search rule and output it as the search result of the current data search;
when no matching search rule exists in the second storage unit, to perform a data search using the search rule on which the current data search relies and output the corresponding search result.
Preferably, in the data search system, the second data search unit further includes:
a setting module, connected to the rule judgment module, for enabling or disabling the rule judgment module according to an externally input instruction.
Preferably, in the data search system, the data saved in the storage network of the server is held in a plurality of different user folders;
the data search system further includes:
a third data search unit, connected to the first setting unit, for searching the different user folders in the server while the server is offline, so as to deduplicate identical data held in different user folders;
the third data search unit is arranged in the controller inside the storage device of the server;
the third data search unit further includes:
a first search module for searching the different user folders in the server while the server is offline, to find identical data in different user folders;
a data deduplication module, connected to the first search module, for retaining, according to the search result of the first search module, the data in one of the user folders that hold identical data and deleting the identical data from all the remaining user folders;
a link generation module, connected to the data deduplication module, for generating a corresponding access link under a user folder when the data deduplication module deletes data from that user folder;
after deduplication, among the user folders that held identical data, the one user folder whose copy was not deleted serves as the target folder, and the identical data that was deleted is treated as the target data;
the access link points to the storage address of the target data in the target folder.
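The deduplication step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes that "identical data" means byte-for-byte equal content detected with a SHA-256 digest, models user folders as nested dictionaries, and uses the hypothetical name `deduplicate`; none of these details come from the source.

```python
import hashlib


def deduplicate(folders):
    """Remove identical data held in different user folders.

    folders maps folder name -> {file name: bytes} and is modified in place.
    Returns access links: {(folder, file): (target_folder, target_file)}.
    """
    kept = {}    # content digest -> (folder, file) of the retained copy
    links = {}   # deleted copy -> location of the retained copy
    for folder in sorted(folders):
        for name in sorted(folders[folder]):   # sorted() copies the keys, so deletion is safe
            digest = hashlib.sha256(folders[folder][name]).hexdigest()
            owner = kept.get(digest)
            if owner is not None and owner[0] != folder:
                # Same content already kept in another user folder:
                # delete this copy and leave an access link behind.
                links[(folder, name)] = owner
                del folders[folder][name]
            elif owner is None:
                kept[digest] = (folder, name)
    return links


folders = {
    "alice": {"a.txt": b"hello", "b.txt": b"x"},
    "bob": {"c.txt": b"hello"},
}
links = deduplicate(folders)
```

In this sketch the first folder in sorted order keeps its copy and acts as the target folder; the source does not say how the retained copy is chosen, so that ordering is an assumption.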
Preferably, in the data search system, the data saved in the storage network of the server is held in a plurality of different user folders;
the data search system further includes:
a fourth data search unit, connected to the first setting unit, for searching the different user folders in the server while the server is offline, so as to establish a correspondence between similar data held in different user folders;
the fourth data search unit is arranged in the controller inside the storage device of the server;
the fourth data search unit further includes:
a second search module for searching the different user folders in the server while the server is offline, to find similar data in different user folders;
a marking module, connected to the second search module, for marking, according to the search result of the second search module, the similar data in different user folders so as to establish the correspondence between the similar data.
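The marking of similar data can be sketched as follows. The source does not define how similarity is measured, so this example assumes text content compared with Python's `difflib.SequenceMatcher` and a hypothetical threshold of 0.8; the function name `mark_similar` is likewise an illustration only.

```python
from difflib import SequenceMatcher
from itertools import combinations


def mark_similar(folders, threshold=0.8):
    """Return pairs of similar items that live in different user folders.

    folders maps folder name -> {file name: text}. Each returned pair
    ((folder_a, file_a), (folder_b, file_b)) records one correspondence.
    """
    items = [(f, n, t) for f in sorted(folders) for n, t in sorted(folders[f].items())]
    marks = []
    for (fa, na, ta), (fb, nb, tb) in combinations(items, 2):
        # Only data in different user folders is compared.
        if fa != fb and SequenceMatcher(None, ta, tb).ratio() >= threshold:
            marks.append(((fa, na), (fb, nb)))
    return marks


folders = {
    "alice": {"n.txt": "quarterly sales report 2016"},
    "bob": {"m.txt": "quarterly sales report 2017", "z.txt": "holiday photos"},
}
marks = mark_similar(folders)
```

A pairwise comparison like this grows quadratically with the number of items, which is tolerable only because the search runs in the idle period; a real system would likely need an index or fingerprinting scheme.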
The beneficial effect of the above technical solution is that it provides a data search system that carries out data searches during the server's idle periods, which not only saves the data center a substantial amount of power and thereby lowers the service provider's operating cost, but also improves service quality.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall structure of a data search system in a preferred embodiment of the present invention;
Fig. 2 is a schematic diagram of the detailed structure of the first data search unit in a preferred embodiment of the present invention;
Fig. 3 is a schematic diagram of the detailed structure of the second data search unit in a preferred embodiment of the present invention;
Fig. 4 is a schematic diagram of the detailed structure of the third data search unit in a preferred embodiment of the present invention;
Fig. 5 is a schematic diagram of the detailed structure of the fourth data search unit in a preferred embodiment of the present invention;
Fig. 6 is a schematic diagram of the distribution of the server's peak periods and idle periods in a preferred embodiment of the present invention;
Fig. 7 is a schematic diagram of data deduplication performed with the data search system in a preferred embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that, where no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with one another.
The present invention is further described below with reference to the drawings and specific embodiments, which are not intended to limit the invention.
In view of the above problems in the prior art, a data search system is provided. The data search system is applied in a server of a data center, and its structure, as shown in Fig. 1, includes:
a first setting unit 1 for setting the search period in which data searches are performed, wherein during the search period the processor and memory of the server are switched off and the server is offline;
a second setting unit 2 for allowing a user to set the search rule on which a data search is based, the search rule including a search hint composed of at least one keyword and/or key phrase;
a first data search unit 3, connected to the first setting unit 1 and to the second setting unit 2.
As shown in Fig. 2, the first data search unit further includes:
a rule acquisition module 31 that obtains the search rule on which the current data search relies;
a rule search module 32, connected to the rule acquisition module 31, for searching the storage network in the server according to the search rule within the search period and outputting the corresponding search result;
The first data search unit is arranged in a controller inside a storage device of the server.
Specifically, in this embodiment, the first data search unit is arranged in the controller inside a storage device of the server (such as an HDD or SSD). That is, the controller inside the storage device searches the data in the storage network directly, instead of the processor and memory performing the search as in the prior art.
Accordingly, the data search can be carried out during periods when the network is not congested (that is, the server's idle periods). In these periods the server is usually offline. During these periods (typically at night), the processor (CPU) and memory in the server can therefore be switched off, and the controller in the storage device is used to search the data in the storage network.
Although the controller in the storage device performs data searches less capably than the server's processor, and processes data more slowly, the server is idle and offline at this time, so the controller has ample time to carry out the search. Its lower performance therefore has little effect on the user experience.
In this embodiment, the first setting unit 1 can be used to set the search period, which is an idle period of the server (for example, from 0:00 to 6:00 in the early morning). The specific ways of setting the search period are described in detail below.
In this embodiment, the second setting unit 2 can be used to set the search rule, which may include a search hint composed of at least one keyword and/or key phrase. Specifically, the search hint may be a combination of multiple keywords and/or key phrases. The search rule may further include the period in which the user wishes the search to run, which must of course fall within the preset search period.
For example, a user can, according to their own needs and interests, set the combination of keywords and/or key phrases to be queried automatically, together with the query frequency (for example daily, weekly, or monthly), and can also set the start and end times of the query as needed. All of this information is included in the search rule. Once the search rule has been set and submitted, the first data search unit 3 automatically searches related topic information, web pages, and similar content according to the search rule during the search period, and finally outputs the search result.
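As a rough illustration of the information a search rule carries, the following Python sketch models the rule as a small record and checks that the user's requested window falls inside the preset search period. The class name `SearchRule`, its field names, and the helper `fits_search_period` are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from datetime import time
from typing import Tuple


@dataclass(frozen=True)
class SearchRule:
    """Hypothetical container for the fields the text says a search rule holds."""
    keywords: Tuple[str, ...]   # the search hint
    frequency: str              # "daily", "weekly" or "monthly"
    start: time                 # requested start time of the query
    end: time                   # requested end time of the query


def fits_search_period(rule: SearchRule, period_start: time, period_end: time) -> bool:
    """Return True if the rule's window lies inside the search period.

    Assumes both windows fall within one calendar day (no wrap past midnight).
    """
    return period_start <= rule.start < rule.end <= period_end


rule = SearchRule(("air conditioner", "smart"), "daily", time(1, 0), time(3, 0))
ok = fits_search_period(rule, time(0, 0), time(6, 0))   # search period 0:00 to 6:00
```

A search period that spans midnight would need a different comparison; the sketch leaves that case out for brevity.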
In a preferred embodiment of the present invention, still as shown in Fig. 1, the data search system further includes:
a statistics unit 4, connected to the first setting unit 1, for collecting statistics over the normal working cycle of the server to obtain the server's idle period;
wherein the first setting unit 1 sets the search period according to the idle period.
Specifically, in this embodiment, the statistics unit 4 collects data and compiles statistics during the server's normal working cycle. In particular, it can record the running state of the server's processor and memory and the occupancy of the server's overall network bandwidth, and from these derive the server's idle period. During the idle period the server receives few user query requests and little network bandwidth is occupied (that is, the network is not congested), so the processor and memory of the server can be switched off without affecting the server's normal operation. Different servers may have different idle periods, so the statistics unit 4 compiles statistics for each server separately to obtain its idle period, which the first setting unit 1 can then set as the search period. The contrast between the server's peak periods and idle periods is shown in general terms in Fig. 6.
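One simple way to derive an idle period from such statistics is sketched below. It assumes the statistics have been reduced to an average request count for each hour of the day and that "idle" means at or below a chosen threshold; the function name `longest_idle_window`, the threshold, and the sample numbers are all hypothetical.

```python
def longest_idle_window(hourly_requests, threshold):
    """Return (start_hour, end_hour_exclusive) of the longest run of idle hours.

    hourly_requests holds one average request count per hour (index 0 = 0:00).
    An hour counts as idle when its count is at or below threshold.
    Runs that wrap past midnight are not merged in this sketch.
    """
    best = (0, 0)
    start = None
    # A sentinel above the threshold closes a run that reaches the last hour.
    for hour, count in enumerate(list(hourly_requests) + [threshold + 1]):
        if count <= threshold:
            if start is None:
                start = hour
        elif start is not None:
            if hour - start > best[1] - best[0]:
                best = (start, hour)
            start = None
    return best


load = [5, 3, 2, 2, 4, 6] + [50] * 18          # quiet from 0:00 to 6:00, busy afterwards
window = longest_idle_window(load, threshold=10)
```

The returned window could then be handed to the first setting unit as the search period. Bandwidth occupancy or processor utilization could be used in place of request counts in the same way.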
In another preferred embodiment of the present invention, still as shown in Fig. 1, the data search system further includes:
an input unit 5, connected to the first setting unit 1, through which a user enters a setting instruction for setting the search period;
wherein the first setting unit 1 sets the search period according to the setting instruction.
Specifically, in this embodiment, the input unit 5 allows the user to set the search period manually.
Both the statistics unit 4 and the input unit 5 are shown in Fig. 1. In different embodiments of the present invention, either the statistics unit 4 or the input unit 5 may be used.
In a preferred embodiment of the present invention, still as shown in Fig. 1, the data search system further includes:
a first storage unit 6, connected to the first data search unit 3, for generating a corresponding result document from the search result and saving it.
Specifically, in this embodiment, the search result obtained by the first data search unit 3 is formed into a corresponding document and saved in the first storage unit 6. The first storage unit 6 is the storage device (HDD or SSD) in the server. In other words, the data search of the present invention does not require the processor or memory to take part; it can be completed by the storage device and the controller inside it alone.
In a preferred embodiment of the present invention, as shown in Fig. 2, the first data search unit 3 further includes:
a result comparison module 33, connected to the rule search module 32, for comparing the result document obtained by the current data search with the result document obtained by the previous data search and outputting the corresponding comparison result;
a first notification module 34, connected to the result comparison module 33, for sending a notification message to the user when the comparison result indicates that the result of the current data search has been updated.
Specifically, in this embodiment, after each data search finishes and the result document for that search has been generated from the search result, the result comparison module 33 compares the result document of the current data search with that of the previous data search to determine whether the search results for the same topic or web page contain new information:
If there is new information, it is collated into a new document and saved in the first storage unit 6, and the document is then sent to the user as a notification message, for example by e-mail, text message, or instant messaging software. The push operation must wait until the search period has ended (that is, until the processor and memory of the server have resumed work) before it is performed.
If there is no new information, the server does not need to produce a new document or notify the user. Of course, the server and the user should agree in advance that the absence of a pushed notification message means the search result contains no new information. Alternatively, the user can configure the server in advance to send a notification of the search result even when there is no new information. That is, the notification behavior can be changed according to the user's settings.
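The comparison and notification logic just described can be sketched as follows. The sketch assumes a result document can be treated as a list of entries (for example URLs or headlines) and that "new information" means entries absent from the previous document; the names `new_items`, `notification_for`, and `notify_when_unchanged` are hypothetical.

```python
def new_items(current, previous):
    """Return entries of the current result that the previous result lacked."""
    seen = set(previous)
    return [item for item in current if item not in seen]


def notification_for(current, previous, notify_when_unchanged=False):
    """Decide what, if anything, to push to the user once the search period ends.

    Returns a dict describing the notification, or None when nothing is sent.
    """
    fresh = new_items(current, previous)
    if fresh:
        # New information: collate it into a new document and notify the user.
        return {"updated": True, "items": fresh}
    if notify_when_unchanged:
        # The user asked to be told even when nothing changed.
        return {"updated": False, "items": []}
    return None   # by prior agreement, silence means no new information


message = notification_for(["a", "b", "c"], ["a", "b"])
```

The actual delivery channel (e-mail, text message, or instant messaging) is outside the scope of this sketch.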
With the above arrangement, the data search system of the present technical solution not only spares the user the trouble of repeatedly entering information to run the corresponding query, but also reduces the power consumption and cost of the data center server.
In a preferred embodiment of the present invention, the result comparison module 33 is connected to the rule search module 32 and compares the result document obtained by the current data search with the result document obtained by the previous data search, outputting the corresponding comparison result.
Still as shown in Fig. 1, the first data search unit 3 further includes:
a search reconstruction module 35, connected to the result comparison module 33 and to the rule search module 32, for recombining the search hint in the search rule according to a preset rule when the comparison result indicates that the result of the current data search has not been updated, so as to form a reconstructed search hint;
a second notification module 36, connected to the search reconstruction module 35 and to the rule search module 32, for providing the user with the result document formed from the search result obtained by searching according to the reconstructed search rule.
Specifically, in this embodiment, when the current data search yields no new information relative to the previous one, a fuzzy search mechanism needs to be provided as well. Under this mechanism, a search reconstruction module 35 recombines the search hint in the search rule to form a reconstructed search hint, and the search is run again according to the search rule containing the reconstructed hint. Recombining the search hint means rearranging and recombining the keywords and/or key phrases in it to form a new search hint. The new search hint is related to the previous one to some degree but is not identical to it, so restarting the data search with the new hint widens the search scope.
In this embodiment, the reconstruction of the search hint may proceed as follows: a random number generator produces a set of random numbers, which are then used to recombine the keywords and/or key phrases in the search hint, or to delete individual keywords from it, forming the reconstructed search hint.
For example, suppose the content the user wants to search for (that is, the search hint composed of keywords) is "air conditioner, smart, variable speed, brand 1, brand 2". If a search returns no updated result relative to the previous search, "air conditioner" is encoded as 1, "smart" as 2, "variable speed" as 3, "brand 1" as 4, and "brand 2" as 5, and the random number generator produces N (N < 5) random numbers, each in the range 1 to 5. If, say, the random numbers produced are 1, 3, 4, and 5, the reconstructed search hint formed from them is "air conditioner, variable speed, brand 1, brand 2".
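The worked example above can be reproduced with a short Python sketch. It assumes the random codes are distinct and kept in ascending order, which matches the example but is not stated by the source; the function names `hint_from_codes` and `reconstruct_hint` are hypothetical.

```python
import random


def hint_from_codes(keywords, codes):
    """Map 1-based codes back to keywords, as in the worked example."""
    return [keywords[code - 1] for code in codes]


def reconstruct_hint(keywords, rng):
    """Build a reconstructed search hint from N < len(keywords) random codes.

    Assumes distinct codes in ascending order; rng is a random.Random instance.
    """
    if len(keywords) < 2:
        raise ValueError("need at least two keywords to drop one")
    n = rng.randint(1, len(keywords) - 1)                   # 1 <= N < len(keywords)
    codes = sorted(rng.sample(range(1, len(keywords) + 1), n))
    return hint_from_codes(keywords, codes)


keywords = ["air conditioner", "smart", "variable speed", "brand 1", "brand 2"]
example = hint_from_codes(keywords, [1, 3, 4, 5])           # the codes from the text
candidate = reconstruct_hint(keywords, random.Random())     # a fresh random subset
```

Because the subset is always strictly smaller than the original keyword list, the reconstructed hint is related to the original but never identical to it, which is the property the text requires.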
In this embodiment, after the search has been run again, a new document is generated from the search result and saved in the first storage unit 6, and the document is then sent to the user as a notification message. The sending method and sending time follow the description of the first notification module 34 above and are not repeated here.
In a preferred embodiment of the present invention, still as shown in Fig. 1, the data search system is provided with a second storage unit 7 connected to the first storage unit 6 and to the first data search unit 3. The second storage unit 7 saves the search rule on which each data search relied and the result documents extracted from the first storage unit 6, and the correspondence between search rules and result documents is established in the second storage unit 7;
The data search system further includes a second data search unit 8, which is connected to the second storage unit 7;
As shown in Fig. 3, the second data search unit 8 further includes:
a request acquisition module 81 for obtaining an externally input query request that includes a search rule;
a query module 82, connected to the request acquisition module 81, for performing a data search in the storage network according to the query request;
a rule judgment module 83, connected to the request acquisition module 81 and to the query module 82, for looking up, according to the search rule on which the current data search relies, whether a matching search rule exists in the second storage unit, and outputting the corresponding judgment result;
The query module 82 is then configured, according to the judgment result:
when a matching search rule exists in the second storage unit 7, to directly extract the result document corresponding to that search rule and output it as the search result of the current data search;
when no matching search rule exists in the second storage unit 7, to perform a data search using the search rule on which the current data search relies and output the corresponding search result.
Specifically, in one embodiment of the present invention, the second data search unit 8 may be arranged in the processor of the server, which means that the data search operations of the second data search unit 8 are performed outside the above search period (i.e., while the processor and main memory are working normally). In another embodiment of the present invention, the second data search unit 8 may instead be arranged in the controller of the storage device, which means that its data search operations are performed within the above search period (i.e., while the processor and main memory are shut down).
In this embodiment, the second storage unit 7 may be the main memory of the server, a non-volatile memory, or another storage device in the storage network. The result document obtained after each data search performed by the first data search unit 3 within the search period is saved in the first storage unit 6 and, in addition, is saved to the second storage unit 7 once the processor and main memory of the server resume normal operation. The second storage unit 7 is also connected to the first data search unit 3: when the processor and main memory resume normal operation, the first data search unit 3 transfers the search rule relied on by each search to the second storage unit 7 for storage, and the correspondence between search rules and result documents is established in the second storage unit 7.
Specifically, after the request acquisition module 81 in the second data search unit 8 receives a query request, the rule judgment module 83 parses the corresponding search rule (i.e., a combination of keywords and/or key phrases) from the query request and, according to this search rule, checks whether a matching search rule exists in the second storage unit 7:
If one exists, the search rule relied on by the current data search has already been used in an earlier search, and the corresponding result document is extracted directly and pushed to the user;
If none exists, the search rule relied on by the current data search has not been used before, and a normal data search is performed according to that search rule.
In this embodiment, two search rules are said to match when they are the same or similar; more precisely, when the keywords (key characters or words) in the search rules are the same or similar. For example, the two search rules contain the same keywords (regardless of the order in which the keywords appear), or the keywords contained in the two search rules differ only by the addition or removal of a few keywords.
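One way to express this matching criterion is a set comparison over keywords. The sketch below is an assumption-laden illustration: the function name `rules_match` and the `max_diff` parameter (how many added or removed keywords still count as "a few") are hypothetical, since the text does not fix a number.

```python
from typing import Iterable


def rules_match(rule_a: Iterable[str], rule_b: Iterable[str], max_diff: int = 1) -> bool:
    """Return True if two search rules are the same or similar.

    Keyword order is ignored. `max_diff` is an assumed tolerance: the largest
    number of keywords that may be added or removed for the rules to match.
    """
    a, b = set(rule_a), set(rule_b)
    if a == b:
        # Same keywords, possibly in a different order.
        return True
    # The symmetric difference holds keywords present in exactly one rule.
    # Requiring a shared keyword keeps unrelated one-word rules from matching.
    return bool(a & b) and len(a ^ b) <= max_diff
```

For instance, `rules_match(["server", "energy"], ["energy", "server", "saving"])` holds because only one keyword was added, while swapping one keyword for another counts as two differences and fails under the default tolerance.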
In the technical solution of the present invention, the above search method spares the server of the data center from repeatedly searching the storage network for the same related content, thereby reducing power consumption.
In a preferred embodiment of the present invention, still as shown in Fig. 3, the second data search unit 8 further includes:
A setting module 84, connected to the rule judgment module 83, for turning the rule judgment module 83 on or off according to an externally input instruction.
Specifically, through the setting module 84 the user can decide whether a matching operation is to be performed before a data search:
If the user inputs an instruction that causes the setting module 84 to turn on the rule judgment module 83, the user wishes a matching operation to be performed before the data search, and the second data search unit 8 performs the search rule matching and the corresponding data search processing according to the matching process described above.
If the user inputs an instruction that causes the setting module 84 to turn off the rule judgment module 83, the user wishes the data search to be carried out directly, and the second data search unit 8 performs the data search directly according to the search rule included in the query request input by the user.
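The effect of the setting module 84 can be pictured as a switch in front of the rule lookup. The following is a minimal sketch with hypothetical names (`ToggleableSearch`, `set_matching`, `full_search`); the stored rules and the search function are supplied by the caller and are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: a search rule is modeled as a tuple of keywords.
Rule = Tuple[str, ...]


class ToggleableSearch:
    """Hypothetical sketch: the rule lookup can be switched on or off."""

    def __init__(
        self,
        stored: Dict[Rule, List[str]],
        full_search: Callable[[Rule], List[str]],
    ) -> None:
        self._stored = stored            # rule -> result document
        self._full_search = full_search  # normal data search
        self.matching_enabled = True     # state controlled by the setting module

    def set_matching(self, enabled: bool) -> None:
        """Role of setting module 84: turn the rule judgment step on or off."""
        self.matching_enabled = enabled

    def query(self, rule: Rule) -> List[str]:
        # With matching on, a stored result document is reused when available.
        if self.matching_enabled and rule in self._stored:
            return self._stored[rule]
        # With matching off, or on a miss, search directly with the given rule.
        return self._full_search(rule)
```

Turning matching off trades the power saving for a guaranteed fresh search, which is the choice the description leaves to the user.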
In a preferred embodiment of the present invention, the data stored in the storage network of the server is contained in a plurality of different user folders;
Then, still as shown in Fig. 1, the above data search system further includes:
A third data search unit 9, connected to the first setup unit 1, for performing a data search across the different user folders in the server while the server is in the offline state, so as to deduplicate identical data held in different user folders.
The third data search unit 9 is arranged in the controller of the storage device inside the server; that is, the third data search unit 9 likewise performs its function within the search period through the controller in the storage device.
Further, as shown in Fig. 4, the third data search unit 9 includes:
A first search module 91, for performing a data search across the different user folders in the server while the server is in the offline state, so as to find identical data held in different user folders;
A data deduplication module 92, connected to the first search module 91, for retaining, according to the search result of the first search module 91, the data in one of the user folders that hold identical data, and deleting the identical data in all the remaining user folders;
A link generation module 93, connected to the data deduplication module 92, for generating a corresponding access link in a user folder when the data deduplication module 92 deletes data from that user folder;
After the data deduplication processing, among the user folders that held identical data, only one user folder still holds the identical data; that user folder serves as the target folder, and the identical data retained in it serves as the target data;
The access link points to the storage address of the target data in the target folder.
Specifically, in this embodiment, when the server is in an idle period (and is therefore in the offline state), the controller in the storage device automatically performs a comprehensive search of the data stored in the storage network in order to deduplicate the storage network. This specifically includes:
First, the first search module 91 searches the entire storage space to determine whether identical data exists in different user folders, and outputs the search result.
Then, according to the search result, the data deduplication module 92 retains the identical data in only one of the user folders that hold it and deletes the identical data in all the remaining user folders; the user folder in which the identical data is retained is regarded as the target folder, and the retained identical data is regarded as the target data.
Finally, the link generation module 93 generates an access link at the corresponding storage location in each user folder from which identical data was deleted. The access link points to the storage address at which the target data is stored in the target folder, so that a user can access the target data directly through the access link. After the data deduplication processing, the target data is stored in only one user folder in the entire storage network, and no other user folder stores data identical to the target data in the target folder.
In a preferred embodiment of the present invention, the general flow of the above data deduplication processing is shown in Fig. 7. In Fig. 7, a user folder named "User 1" holds file A, file B, file C and other files, and another user folder named "User 2" holds file A, file Y, file Z and other files. The data search finds that both user folders hold the identical data "file A". File A in User 1 is therefore retained (User 1 is regarded as the target folder and file A as the target data), file A in User 2 is deleted, and a new access link is generated at the storage location in User 2 where file A was originally stored. This access link points directly to the storage address of file A in User 1.
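The three deduplication steps (search, delete, link) can be sketched on an in-memory model of the Fig. 7 example. Everything named here is an assumption for illustration: `AccessLink`, `deduplicate`, and the dictionary layout are hypothetical, and comparing SHA-256 digests is one possible way to detect identical data, which the text does not specify.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class AccessLink:
    """Points at the retained copy (target data in the target folder)."""
    target_folder: str
    target_name: str


# A folder entry is either file content or a link left behind by deduplication.
Entry = Union[bytes, AccessLink]


def deduplicate(folders: Dict[str, Dict[str, Entry]]) -> None:
    """Keep the first copy of identical data; replace copies in other folders with links."""
    # Maps a content digest to the (folder, file) that holds the retained copy.
    first_seen: Dict[str, Tuple[str, str]] = {}
    for folder_name, files in folders.items():
        for file_name, entry in list(files.items()):
            if isinstance(entry, AccessLink):
                continue  # already a link, nothing to compare
            # Assumption: equal SHA-256 digests are treated as identical data.
            digest = hashlib.sha256(entry).hexdigest()
            if digest not in first_seen:
                first_seen[digest] = (folder_name, file_name)  # target data
            elif first_seen[digest][0] != folder_name:
                # Identical data in a different user folder: delete and link.
                files[file_name] = AccessLink(*first_seen[digest])


folders: Dict[str, Dict[str, Entry]] = {
    "user_1": {"file_A": b"alpha", "file_B": b"beta", "file_C": b"gamma"},
    "user_2": {"file_A": b"alpha", "file_Y": b"psi", "file_Z": b"zeta"},
}
deduplicate(folders)
```

After the call, `folders["user_2"]["file_A"]` is an `AccessLink` to `file_A` in `user_1`, while every other entry is unchanged. The sketch deliberately leaves duplicates inside a single folder alone, because the text only describes deduplication across different user folders.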
In a preferred embodiment of the present invention, the data stored in the storage network of the server is contained in a plurality of different user folders;
Then, still as shown in Fig. 1, the above data search system further includes:
A fourth data search unit 10, connected to the first setup unit 1, for performing a data search across the different user folders in the server while the server is in the offline state, so as to establish a correspondence between similar data in different user folders;
The fourth data search unit 10 is arranged in the controller of the storage device inside the server; that is, the fourth data search unit 10 likewise performs its function within the search period through the controller in the storage device.
Further, as shown in Fig. 5, the fourth data search unit 10 includes:
A second search module 101, for performing a data search across the different user folders in the server while the server is in the offline state, so as to find similar data in different user folders. Whether data is similar can be judged against a preset similarity. For example, a standard data similarity of 40% or 50% is set in advance, and two pieces of data are considered similar when the similarity between them exceeds this standard data similarity. In this embodiment, because each piece of data or file has its own attributes, such as its author, the relevance of its content, the field it belongs to, and the keywords it contains, the data similarity can be determined from the degree to which the data attributes are alike. For example, if the data has four classes of attributes and two pieces of data are identical in two of those classes, the data similarity between them may be taken to be 50%.
A marking module 102, connected to the second search module 101, for marking the similar data in different user folders according to the search result of the second search module 101, so as to establish a correspondence between the similar data.
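The attribute-based similarity in the example above (two of four attribute classes identical gives 50%) can be written out as a short function. The function names and the attribute keys used in the usage example are hypothetical; the strict "exceeds" comparison follows the wording of the text.

```python
from typing import Mapping


def attribute_similarity(a: Mapping[str, object], b: Mapping[str, object]) -> float:
    """Fraction of attribute classes on which two pieces of data agree."""
    classes = set(a) | set(b)
    if not classes:
        return 0.0
    # An attribute class counts as identical only if both sides have it
    # and the values are equal.
    same = sum(1 for k in classes if k in a and k in b and a[k] == b[k])
    return same / len(classes)


def is_similar(a: Mapping[str, object], b: Mapping[str, object], threshold: float = 0.4) -> bool:
    """Data is similar when its similarity exceeds the preset standard similarity."""
    return attribute_similarity(a, b) > threshold
```

With a 40% standard, two files that share author and field but differ in topic and keywords (similarity 0.5) are marked as similar; with a 50% standard they are not, because the similarity must exceed the standard rather than merely equal it.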
Specifically, in this embodiment, when the server is in an idle period (and is therefore in the offline state), the controller in the storage device automatically performs a comprehensive search of the data stored in the storage network in order to build a database of associations between the different pieces of data in the storage network.
First, the second search module 101 searches all the data in the storage network to find similar data in the storage network, and outputs the corresponding search result.
Then the marking module 102 marks the similar data to establish the association between them; that is, a relational database of the data stored in the storage network is built through data search and marking. Once the relational database has been built, more similar information can be offered for the user to review and choose from when the user subsequently queries information.
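The marking step and the later lookup of related data can be sketched as a small symmetric index. The names `RelationIndex`, `mark`, and `related_to`, and the use of a (user folder, file name) pair as the data identifier, are assumptions for illustration; the text does not describe how the relational database is stored.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple

# Assumption: a piece of data is identified by (user folder, file name).
DataId = Tuple[str, str]


class RelationIndex:
    """Hypothetical sketch of the relations built by marking module 102."""

    def __init__(self) -> None:
        self._related: DefaultDict[DataId, Set[DataId]] = defaultdict(set)

    def mark(self, a: DataId, b: DataId) -> None:
        """Record that two pieces of data were found to be similar."""
        # The correspondence is stored in both directions.
        self._related[a].add(b)
        self._related[b].add(a)

    def related_to(self, item: DataId) -> List[DataId]:
        """Return the data marked as similar to `item`, in sorted order."""
        # .get avoids creating an empty entry for unknown items.
        return sorted(self._related.get(item, set()))
```

A later query that lands on one file can then call `related_to` to offer the user the similar files found during the idle-period scan.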
The foregoing describes only preferred embodiments of the present invention and does not thereby limit the embodiments or the scope of protection of the present invention. Those skilled in the art should appreciate that all solutions obtained through equivalent substitutions and obvious variations made using the description and drawings of the present invention fall within the scope of protection of the present invention.
Claims (10)
1. A data search system, applied in a server of a data center, characterized by comprising:
a first setup unit, for setting a search period in which data searches are performed, wherein within the search period the processor and main memory of the server are shut down and the server is in an offline state;
a second setup unit, provided for a user to set the search rule on which a data search is based, the search rule comprising a search hint, relied on by the data search, composed of at least one keyword and/or key phrase;
a first data search unit, connected to the first setup unit and the second setup unit respectively, the first data search unit further comprising:
a rule acquisition module, for obtaining the search rule relied on by the current data search;
a rule search module, connected to the rule acquisition module, for searching the storage network in the server according to the search rule within the search period and outputting a corresponding search result;
wherein the first data search unit is arranged in the controller of the storage device inside the server.
2. The data search system according to claim 1, characterized by further comprising:
a statistics unit, connected to the first setup unit, for compiling statistics on the normal working cycle of the server so as to obtain the idle period of the server;
wherein the first setup unit sets the search period according to the idle period.
3. The data search system according to claim 1, characterized by further comprising:
an input unit, connected to the first setup unit, provided for the user to input a setting instruction for setting the search period;
wherein the first setup unit sets the search period according to the setting instruction.
4. The data search system according to claim 1, characterized by further comprising:
a first storage unit, connected to the first data search unit, for generating a corresponding result document according to the search result and storing it.
5. The data search system according to claim 4, characterized in that the first data search unit comprises:
a result comparison module, connected to the rule search module, for comparing the result document obtained by the current data search with the result document obtained by the previous data search and outputting a corresponding comparison result;
a first notification module, connected to the result comparison module, for sending a notification message to the user when the comparison result indicates that the result of the current data search has been updated.
6. The data search system according to claim 4, characterized in that the first data search unit comprises:
a result comparison module, connected to the rule search module, for comparing the result document obtained by the current data search with the result document obtained by the previous data search and outputting a corresponding comparison result;
a search reconstruction module, connected to the result comparison module and the rule search module respectively, for recombining the search hint in the search rule according to a preset rule when the comparison result indicates that the result of the current data search has not been updated, so as to form a reconstructed search hint;
a second notification module, connected to the search reconstruction module and the rule search module respectively, for providing to the user, for viewing, the result document formed from the search result obtained by searching according to the reconstructed search rule.
7. The data search system according to claim 4, characterized in that a second storage unit is provided, connected to the first storage unit and the first data search unit respectively, the second storage unit being used to store the search rule relied on by each data search and the result document extracted from the first storage unit, and the correspondence between the search rule and the result document being established in the second storage unit;
the data search system further comprises a second data search unit, the second data search unit being connected to the second storage unit;
the second data search unit further comprises:
a request acquisition module, for obtaining an externally input query request that includes the search rule;
a query module, connected to the request acquisition module, for performing a data search in the storage network according to the query request;
a rule judgment module, connected to the request acquisition module and the query module respectively, for checking, according to the search rule relied on by the current data search, whether a matching search rule exists in the second storage unit, and outputting a corresponding judgment result;
the query module being used, according to the judgment result:
when a matching search rule exists in the second storage unit, to directly extract the result document corresponding to the search rule and output it as the search result of the current data search;
when no matching search rule exists in the second storage unit, to perform a data search using the search rule relied on by the current data search and output the corresponding search result.
8. The data search system according to claim 7, characterized in that the second data search unit further comprises:
a setting module, connected to the rule judgment module, for turning the rule judgment module on or off according to an externally input instruction.
9. The data search system according to claim 1, characterized in that the data stored in the storage network of the server is contained in a plurality of different user folders;
the data search system further comprises:
a third data search unit, connected to the first setup unit, for performing a data search across the different user folders in the server while the server is in the offline state, so as to deduplicate identical data in the different user folders;
the third data search unit being arranged in the controller of the storage device inside the server;
the third data search unit further comprising:
a first search module, for performing a data search across the different user folders in the server while the server is in the offline state, so as to find identical data in the different user folders;
a data deduplication module, connected to the first search module, for retaining, according to the search result of the first search module, the data in one of the user folders that hold identical data, and deleting the identical data in all the remaining user folders;
a link generation module, connected to the data deduplication module, for generating a corresponding access link in a user folder when the data deduplication module deletes data from that user folder;
wherein, after the data deduplication processing, among the plurality of different user folders that held identical data, only one user folder still holds the identical data and serves as the target folder, and the identical data that was not deleted serves as the target data;
and the access link points to the storage address of the target data in the target folder.
10. The data search system according to claim 1, characterized in that the data stored in the storage network of the server is contained in a plurality of different user folders;
the data search system further comprises:
a fourth data search unit, connected to the first setup unit, for performing a data search across the different user folders in the server while the server is in the offline state, so as to establish a correspondence between similar data in the different user folders;
the fourth data search unit being arranged in the controller of the storage device inside the server;
the fourth data search unit further comprising:
a second search module, for performing a data search across the different user folders in the server while the server is in the offline state, so as to find similar data in the different user folders;
a marking module, connected to the second search module, for marking the similar data in different user folders according to the search result of the second search module, so as to establish the correspondence between the similar data.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710332855.8A CN107169085B (en) | 2017-05-12 | 2017-05-12 | Data search system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN107169085A (en) | 2017-09-15 |
| CN107169085B (en) | 2020-12-01 |
Family
ID=59815916
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201710332855.8A Active CN107169085B (en) | 2017-05-12 | 2017-05-12 | Data search system |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN107169085B (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107743101A (en) * | 2017-09-26 | 2018-02-27 | 杭州迪普科技股份有限公司 | The retransmission method and device of a kind of data |
| CN111506818A (en) * | 2020-04-22 | 2020-08-07 | 中国民航信息网络股份有限公司 | Flight data processing method and device |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100057802A1 (en) * | 2001-11-30 | 2010-03-04 | Micron Technology, Inc. | Method and system for updating a search engine |
| CN102081626A (en) * | 2009-11-30 | 2011-06-01 | 中国移动通信集团北京有限公司 | Data inquiring method and data inquiring server |
| US20110145181A1 (en) * | 2006-12-08 | 2011-06-16 | Ashish Pandya | 100gbps security and search architecture using programmable intelligent search memory (prism) that comprises one or more bit interval counters |
| CN105320569A (en) * | 2015-11-04 | 2016-02-10 | 浪潮(北京)电子信息产业有限公司 | Method and system of improving database server performance |
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100057802A1 (en) * | 2001-11-30 | 2010-03-04 | Micron Technology, Inc. | Method and system for updating a search engine |
| US20110145181A1 (en) * | 2006-12-08 | 2011-06-16 | Ashish Pandya | 100gbps security and search architecture using programmable intelligent search memory (prism) that comprises one or more bit interval counters |
| CN102081626A (en) * | 2009-11-30 | 2011-06-01 | 中国移动通信集团北京有限公司 | Data inquiring method and data inquiring server |
| CN105320569A (en) * | 2015-11-04 | 2016-02-10 | 浪潮(北京)电子信息产业有限公司 | Method and system of improving database server performance |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107743101A (en) * | 2017-09-26 | 2018-02-27 | 杭州迪普科技股份有限公司 | The retransmission method and device of a kind of data |
| CN107743101B (en) * | 2017-09-26 | 2020-10-09 | 杭州迪普科技股份有限公司 | Data forwarding method and device |
| CN111506818A (en) * | 2020-04-22 | 2020-08-07 | 中国民航信息网络股份有限公司 | Flight data processing method and device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN107169085B (en) | 2020-12-01 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| KR101153082B1 (en) | Application programming interface for text mining and search | |
| US7657515B1 (en) | High efficiency document search | |
| US6502091B1 (en) | Apparatus and method for discovering context groups and document categories by mining usage logs | |
| CN108255958A (en) | Data query method, apparatus and storage medium | |
| US20040249808A1 (en) | Query expansion using query logs | |
| US8655902B2 (en) | Identifying superphrases of text strings | |
| CN113836898B (en) | A method for automatically dispatching orders in power system | |
| KR20160020429A (en) | Contextual mobile application advertisements | |
| WO2010123705A2 (en) | System and method for performing longest common prefix strings searches | |
| CN118410152A (en) | Information processing method, question-answering method, and question-answering system | |
| JP2016131045A (en) | Search method, apparatus and server for online trading platform | |
| US7765204B2 (en) | Method of finding candidate sub-queries from longer queries | |
| CN119692469B (en) | Reply text generation method and device, storage medium and program product | |
| CN118503350A (en) | Flow optimization design method and system for improving accuracy of large-model RAG | |
| CN107169085A (en) | A kind of data search system | |
| CN117112930A (en) | Point of interest recall method, device, computer equipment and storage medium | |
| WO2025147358A1 (en) | Semantic memory vector database consolidation and generative model update | |
| CN112883143A (en) | Elasticissearch-based digital exhibition searching method and system | |
| CN118364051A (en) | Information technology consultation system and device | |
| CN102799996A (en) | Network advertisement strategy matching method and system | |
| CN111026706B (en) | Warehouse entry method, device, equipment and medium for power system data | |
| CN109241444B (en) | Content recommendation method, device, equipment and storage medium based on state machine | |
| Wendi | Research on the application of computer digital retrieval technology in the construction of library and information database | |
| KR20020067160A (en) | Method and system for indexing document | |
| US20260037837A1 (en) | Retrieval-augmented generation method, system, device, and medium and question-answering method |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||
| GR01 | Patent grant |