CN106294487B - Self-adapted search method, equipment and system Internet-based - Google Patents
- Publication number
- CN106294487B (application CN201510308566.5A)
- Authority
- CN
- China
- Prior art keywords
- business
- file
- score value
- document
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
This application discloses an Internet-based adaptive search method applied to a search server. The method comprises: receiving and saving the content of the description data of each file in each class of business sent by a service terminal; for each class of business, generating a text relevance configuration file for that business according to its description data; receiving a search request carrying a search keyword from a user terminal; calculating, according to the text relevance configuration files of the classes of business and the saved description-data content of each file, the text relevance score of each file against the search keyword, and taking the calculated text relevance score as the total score of the file; and sorting the files by calculated total score from high to low and sending the user terminal the information of the top-ranked files, up to a first preset number.
Description
Technical field
The present invention relates to the field of Internet technology, and in particular to an Internet-based adaptive search method, device, and system.
Background art
Unstructured text-based search treats all the widely varying classes of business uniformly as text files made up of fields, with every field given the same weight. Such a search method achieves generality: without distinguishing business types, the same search method and ranking method are applied to every type of business, for example searching films, novels, and pictures in the same way. This allows new businesses to be brought online quickly while minimizing the maintenance cost of the search system.
Summary of the invention
Embodiments of the present invention provide an Internet-based adaptive search method, device, and system to improve the search performance of a search server.
The technical solutions of the embodiments of the present invention are implemented as follows:
An Internet-based adaptive search method, applied to a search server, comprises:
receiving and saving the content of the description data of each file in each class of business sent by a service terminal, wherein the description data is set by the service terminal for each class of business and includes the text fields that characterize the attributes of that business and their respective weights;
for each class of business, generating a text relevance configuration file for that business according to its description data, wherein the text relevance configuration file configures a calculation method that computes, from the description data of that business, the text relevance score of any file in that business against any search keyword;
receiving a search request carrying a search keyword from a user terminal, and calculating the text relevance score of each file against the search keyword according to the text relevance configuration files of the classes of business and the saved description-data content of each file, wherein, for each file, the degree to which the content of the file's description data matches the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, yielding the text relevance score of the file against the search keyword, and the calculated text relevance score is taken as the total score of the file; and
sorting the files by calculated total score from high to low, and sending the user terminal the information of the top-ranked files, up to a first preset number.
A search server device comprises:
a receiving module, configured to receive and save the content of the description data of each file in each class of business sent by a service terminal, wherein the description data is set by the service terminal for each class of business and includes the text fields that characterize the attributes of that business and their respective weights;
a configuration file generation module, configured to generate, for each class of business, a text relevance configuration file for that business according to its description data, wherein the text relevance configuration file configures a calculation method that computes, from the description data of that business, the text relevance score of any file in that business against any search keyword;
the receiving module being further configured to receive a search request carrying a search keyword from a user terminal;
a computing module, configured to calculate the text relevance score of each file against the search keyword according to the text relevance configuration files of the classes of business and the saved description-data content of each file, wherein, for each file, the degree to which the content of the file's description data matches the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, yielding the text relevance score of the file against the search keyword, and the calculated text relevance score is taken as the total score of the file; and
a sending module, configured to sort the files by calculated total score from high to low and to send the user terminal the information of the top-ranked files, up to a first preset number.
An Internet-based adaptive search system comprises:
a search server, at least one service terminal, and at least one user terminal;
the service terminal being configured to send the content of the description data of each file of each class of business; and
the user terminal being configured to send a search request carrying a search keyword to the search server and to receive the file information returned by the search server.
With the method, device, and system provided by the embodiments of the present invention, the search server receives and saves the content of the description data of each file in each class of business and then automatically generates a relevance configuration file for each class of business. When the search server receives a search request from a user terminal, it matches the search keyword carried in the request against the description fields of the saved files of each class of business in turn, scores each file according to the matching results and the automatically generated relevance configuration file, ranks the files by score, and presents the user terminal with the information of the top-ranked files, up to a preset number, which completes the search. The method adapts to different types of business and customizes the weights of the files' description fields by business type, which improves the search performance of the search server.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a structural diagram of an Internet-based adaptive search system;
Fig. 2 is a flowchart of the technical solution of the present invention;
Fig. 3 is a structural diagram of the search server device in an embodiment of the present invention;
Fig. 4 is a schematic diagram of the hardware structure of a search server provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To achieve generality, prior-art search methods normalize different types of business into plain text documents, for example treating video files and novel files as the same kind of text document, and search every type of business with the same search method and ranking method. In arriving at the present invention, the inventors found that such a general-purpose search method cannot achieve the best relevance. Different types of business weight the description fields that characterize their attributes differently: in the film and television business the actor field is more important and the screenwriter field less important, whereas in the novel business the author field is more important and the publication year field less important. The differences are significant, yet the prior art treats the different fields of different businesses as equally weighted, so such a search method cannot give users the best search results.
To solve the above problem, the present invention proposes an Internet-based adaptive search method, device, and system. The technical solutions of the embodiments of the present invention are as follows:
Fig. 1 is a networking structure diagram of the technical solution of the present invention. As shown in Fig. 1, the search system includes one search server, service terminal 1, service terminal 2, and user terminal 1. In practice there may be multiple service terminals and multiple user terminals; for ease of description, Fig. 1 shows only two service terminals and one user terminal as an example.
Fig. 2 is a flowchart of the technical solution of the present invention. As shown in Fig. 2, it includes the following steps:
Step 201: The search server receives and saves the content of the description data of each file in each class of business sent by the service terminals. The description data is set by the service terminal for each class of business and includes the text fields that characterize the attributes of that business and their respective weights.
In this step, assume that service terminal 1 corresponds to the novel business and service terminal 2 corresponds to the film and television business.
The search server receives and saves the content of the description data of each file of the novel business sent by service terminal 1, assumed to be as shown in Table 1. The search server likewise receives and saves the content of the description data of each file of the film and television business sent by service terminal 2, assumed to be as shown in Table 2.
Table 1
Table 2
As shown in Table 1, the text fields of the novel business are author, task, and topic. Service terminal 1 sends the content of these text fields for all files of the novel business to the search server, which saves it. The weight of each text field for the files of the novel business is set by service terminal 1. The film and television business shown in Table 2 is handled in the same way and is not described in detail again.
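The data saved in step 201 can be pictured as a per-business schema (text fields and their weights) plus the field contents of each file. The following is a minimal sketch under stated assumptions: the class names `BusinessSchema` and `DescriptionStore`, the in-memory storage, the weight values, and the sample file are all hypothetical, since Table 1 and Table 2 are not reproduced in this text. Only the field names author, task, and topic come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessSchema:
    """Description-data layout for one class of business (hypothetical name)."""
    name: str
    field_weights: dict  # text field name -> weight set by the service terminal


@dataclass
class DescriptionStore:
    """In-memory stand-in for what the search server saves in step 201."""
    schemas: dict = field(default_factory=dict)  # business name -> BusinessSchema
    files: dict = field(default_factory=dict)    # (business, file_id) -> field contents

    def register(self, schema: BusinessSchema) -> None:
        self.schemas[schema.name] = schema

    def save_file(self, business: str, file_id: str, contents: dict) -> None:
        schema = self.schemas[business]  # KeyError if the business was never registered
        unknown = set(contents) - set(schema.field_weights)
        if unknown:
            raise ValueError(f"fields not in the {business} schema: {sorted(unknown)}")
        self.files[(business, file_id)] = dict(contents)


store = DescriptionStore()
# Field names follow the text; the weights and file contents are invented.
store.register(BusinessSchema("novel", {"author": 3.0, "task": 1.0, "topic": 2.0}))
store.save_file("novel", "n1", {"author": "Author A", "topic": "Title A"})
```

Keeping the weights on the schema rather than on each file reflects the statement that the service terminal sets them per class of business.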
Step 202: For each class of business, the search server generates a text relevance configuration file for that business according to its description data. The text relevance configuration file configures a calculation method that computes, from the description data of that business, the text relevance score of any file in that business against any search keyword.
The calculation method configured in the text relevance configuration file in this step specifically includes N sub-score formulas that characterize the text relevance between any file in the business and any search keyword, denoted f_1, f_2, ..., f_N. The N sub-score formulas score, each from a different angle, the degree to which the text fields of any file in the business match any search keyword. In addition, a proportion q is configured for each sub-score formula. The calculation formula for the text relevance score of any file in the business against any search keyword is then: [formula not reproduced in this text], where N is the number of text relevance sub-score formulas, M is the number of text fields in the description data of any file in the business, q_j is the proportion of the j-th text relevance sub-score formula f_j in the business, and p_i is the weight of the i-th text field of any file in the business.
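The formula at this point was an image in the source page and is missing from the text. A form consistent with the symbols defined here, and with the remark in the novel example that each sub-score is weighted by the product of its formula's factor q and the weight of the text field involved, would be the double sum below. The symbol S for the total score and the notation f_j(d_i, k) for the j-th sub-score of field content d_i against keyword k are assumptions introduced for illustration, and the exact expression in the granted patent may differ.

```latex
S(d, k) = \sum_{j=1}^{N} \sum_{i=1}^{M} q_j \, p_i \, f_j(d_i, k)
```

Under this reading, raising the weight p_i of one field raises the contribution of every sub-score on that field, while raising q_j raises the contribution of one matching angle across all fields.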
Take the text relevance configuration file of the novel business as an example. It defines three sub-score formulas for the text relevance between any file in the novel business and any search keyword, namely f_1, f_2, and f_3. f_1 traverses each text field of each novel file and measures its exact match with the search keyword: if the field matches exactly, f_1 = 1; otherwise f_1 = 0. f_2 traverses each text field of each novel file and measures its fuzzy match with the search keyword: if the field matches fuzzily, f_2 = 1; if it is entirely unrelated, f_2 = 0. f_3 measures the tightness of the search keyword, that is, how exactly the word segments obtained by splitting the keyword match each text field of the novel file: it traverses each text field of each novel file, and if a text field exactly matches a word segment of the split keyword, f_3 = 1; otherwise f_3 = 0. These three sub-score formulas score, from three different angles, the degree to which the text fields of any file in the novel business match any search keyword. The weight of each sub-score is the product of the proportion of the sub-score formula that produced it and the weight of the text field that the sub-score concerns.
In addition, a proportion q is configured for each sub-score formula, for example 0.3 for f_1, 0.6 for f_2, and 0.4 for f_3. The calculation formula for the text relevance score of any file in the novel business against any search keyword is then: [formula not reproduced in this text]
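The novel example can be sketched in code as follows. This is an illustration under stated assumptions, not the patent's implementation: the factors 0.3, 0.6, and 0.4 come from the text, but the reading of the total as a sum of q × p × f over all formulas and fields, the use of a case-insensitive substring test as the "fuzzy" match, the use of whitespace splitting in place of real word segmentation, and the field weights and sample file are all hypothetical.

```python
def f1_exact(field_text: str, keyword: str) -> float:
    """f_1: 1 if the field matches the keyword exactly, else 0."""
    return 1.0 if field_text == keyword else 0.0


def f2_fuzzy(field_text: str, keyword: str) -> float:
    """f_2: fuzzy match, approximated here by a case-insensitive substring test."""
    return 1.0 if keyword.lower() in field_text.lower() else 0.0


def f3_segment(field_text: str, keyword: str) -> float:
    """f_3: 1 if the field equals one segment of the split keyword, else 0.

    Whitespace splitting stands in for a real word segmenter (assumption).
    """
    return 1.0 if field_text in keyword.split() else 0.0


# (factor q_j, sub-score formula f_j); the q values are the ones given in the text.
NOVEL_SUB_SCORES = [(0.3, f1_exact), (0.6, f2_fuzzy), (0.4, f3_segment)]


def text_relevance(fields: dict, weights: dict, keyword: str,
                   sub_scores=NOVEL_SUB_SCORES) -> float:
    """Sum q_j * p_i * f_j over every sub-score formula and every text field."""
    total = 0.0
    for q, formula in sub_scores:
        for name, text in fields.items():
            total += q * weights[name] * formula(text, keyword)
    return total


# Hypothetical file and field weights for illustration only.
novel = {"author": "Jin Yong", "topic": "Demi-Gods and Semi-Devils"}
weights = {"author": 2.0, "topic": 1.0}
score = text_relevance(novel, weights, "Jin Yong")
```

With these inputs only the author field contributes: f_1 and f_2 both fire on it, f_3 does not, and the topic field matches nothing.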
The text relevance configuration file of the film and television business is similar to that of the novel business; only the algorithms of the sub-score formulas are designed differently, so it is not described in detail here.
Step 203: The search server receives a search request carrying a search keyword from the user terminal and, according to the text relevance configuration files of the classes of business and the saved description-data content of each file, calculates the text relevance score of each file against the received search keyword. For each file, the degree to which the content of the file's description data matches the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, yielding the text relevance score of the file against the search keyword, and the calculated text relevance score is taken as the total score of the file.
In this step, the search server receives the search request carrying the search keyword from the user terminal, uses the text relevance calculation method in the text relevance configuration file from step 202 to calculate the text relevance score of each file in each class of business against the search keyword, and takes the calculated text relevance score as the total score of the file.
Step 204: The search server sorts the files by calculated total score from high to low and sends the user terminal the information of the top-ranked files, up to a first preset number.
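Step 204 amounts to a top-K selection over the scored files. A minimal sketch, assuming the scores are already available as (file id, total score) pairs; the function name `top_ranked` and the sample data are hypothetical:

```python
import heapq


def top_ranked(scored_files, preset_number):
    """Return the highest-scoring (file_id, total_score) pairs, best first."""
    return heapq.nlargest(preset_number, scored_files, key=lambda item: item[1])


# Invented scores for three files across two classes of business.
scored = [("novel:n1", 1.8), ("film:f7", 3.1), ("novel:n4", 0.6)]
results = top_ranked(scored, 2)
```

`heapq.nlargest` avoids sorting the whole candidate list when the preset number is small relative to the number of files; a plain `sorted(..., reverse=True)[:k]` gives the same result.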
In another embodiment of the present invention, the description data of each class of business that the service terminal sends to the search server further includes an authority field, which is a numerical value measuring the authority of any file in that business.
For each class of business, when the text relevance configuration file of that business is generated, an authority configuration file for calculating the authority score is generated according to the authority field in the description data of that business. When the text relevance score of each file against the search keyword is calculated, the authority of each file is scored according to the authority configuration file of the business to which the file belongs, yielding the authority score of the file;
and/or
the description data of each class of business that the service terminal sends to the search server further includes a time field, which is a numerical value measuring the timeliness of any file in that business.
For each class of business, when the text relevance configuration file of that business is generated, a timeliness configuration file for calculating the timeliness score is generated according to the time field in the description data of that business.
When the text relevance score of each file against the search keyword is calculated, the timeliness of each file is scored according to the timeliness configuration file of the business to which the file belongs, yielding the timeliness score of the file.
The sum of the file's text relevance score and its authority score and/or timeliness score is taken as the total score of the file.
Scoring the authority of a file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file, includes:
calculating the authority score according to the formula y = α × h(x), where α is a constant that keeps the authority score of the file at the same order of magnitude as the text relevance score, h() is a positively correlated function, and x is the content of the authority field of the file.
Scoring the timeliness of a file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file, includes:
calculating the timeliness score according to the formula y = β × g(t), where β is a constant that keeps the timeliness score of the file at the same order of magnitude as the text relevance score, g() is a negatively correlated function, and t is the difference between the current time and the time field of the file.
The constant coefficients α and β ensure that the text relevance score, authority score, and timeliness score of a file in the same business do not become imbalanced. For example, if a file's text relevance score is 2, its authority score is 90, and its timeliness score is 0.01, the three scores differ in order of magnitude, which is an imbalance.
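The two formulas y = α × h(x) and y = β × g(t) and the summed total score can be sketched as below. The text only requires h to be positively correlated and g to be negatively correlated; the specific choices here (a logarithm for h, a half-life decay for g), the parameter names, and the default values are assumptions made for illustration.

```python
import math


def authority_score(x: float, alpha: float = 1.0) -> float:
    """y = alpha * h(x), with h(x) = log(1 + x) as one increasing choice of h."""
    return alpha * math.log1p(x)


def timeliness_score(age_days: float, beta: float = 1.0,
                     half_life_days: float = 30.0) -> float:
    """y = beta * g(t), with g halving every `half_life_days` as one decreasing choice."""
    return beta * 0.5 ** (age_days / half_life_days)


def total_score(text_score: float, authority=None, age_days=None,
                alpha: float = 1.0, beta: float = 1.0) -> float:
    """Text relevance plus whichever optional scores the business supplies (and/or)."""
    total = text_score
    if authority is not None:
        total += authority_score(authority, alpha)
    if age_days is not None:
        total += timeliness_score(age_days, beta)
    return total
```

For example, `total_score(2.0, authority=90, age_days=3)` combines all three parts. Tuning `alpha` and `beta` per business is how the three terms would be kept at a comparable order of magnitude, which is the role the text assigns to α and β.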
The authority field is added because two files can have the same text relevance score while only one of them is clearly the result the user wants. For example, a user wants to search for "Ma Yun", the founder of Alibaba, and the search server finds two people named "Ma Yun": one an ordinary person and one the founder of Alibaba. If only text relevance is considered, the search server sends both results to the user terminal, which gives a poor search result. With an authority field, the authority of the Alibaba founder "Ma Yun" is necessarily higher than that of the ordinary person "Ma Yun", so the results obtained by combining text relevance with authority are closer to what the user is searching for.
The timeliness field is added because different files can have similar or identical text relevance while differing greatly in time. When a user searches for news files, for example, the time of the results clearly matters more. After an earthquake occurs somewhere, a user who wants to follow the latest developments searches for "earthquake", and there are many files related to "earthquake". If the search server takes the timeliness score of each file into account when returning results to the user, search performance can be greatly improved.
In another embodiment of the present invention, the search server receives a customized text relevance sub-score formula for a specific type of business from the service terminal and uses it to overwrite the default text relevance sub-score formula in the text relevance configuration file of that business. This provides service terminals with a score-calculation customization service: a sub-score formula that a service terminal tailors to the characteristics of its own business can match users' search requests more accurately.
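Overwriting the default sub-score formulas with customized ones can be sketched as replacing a list of (factor, function) pairs held by the configuration of one class of business. The class `RelevanceConfig`, its methods, the `prefix_match` formula, and the sample film data are all hypothetical; the text does not specify how a customized formula is transmitted or represented.

```python
def exact_match(field_text: str, keyword: str) -> float:
    """Default sub-score: 1 if the field equals the keyword, else 0."""
    return 1.0 if field_text == keyword else 0.0


class RelevanceConfig:
    """Text relevance configuration for one class of business (illustrative)."""

    def __init__(self, field_weights: dict):
        self.field_weights = dict(field_weights)
        self.sub_scores = [(1.0, exact_match)]  # default (q, f) pairs

    def override_sub_scores(self, custom) -> None:
        """Replace the default formulas with ones customized by the service terminal."""
        custom = list(custom)
        if not custom:
            raise ValueError("at least one sub-score formula is required")
        self.sub_scores = custom

    def score(self, fields: dict, keyword: str) -> float:
        return sum(
            q * self.field_weights[name] * formula(text, keyword)
            for q, formula in self.sub_scores
            for name, text in fields.items()
        )


def prefix_match(field_text: str, keyword: str) -> float:
    """A customized sub-score: 1 if the field starts with the keyword, ignoring case."""
    return 1.0 if field_text.lower().startswith(keyword.lower()) else 0.0


config = RelevanceConfig({"actor": 2.0, "title": 1.0})
film = {"actor": "Actor A", "title": "Film A"}
before = config.score(film, "film")   # default formula: no exact match on any field
config.override_sub_scores([(0.5, prefix_match)])
after = config.score(film, "film")    # customized formula: the title prefix matches
```

Because the formulas are held per business, overriding them for one business leaves the scoring of every other business unchanged.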
The Internet-based adaptive search method proposed by the embodiments of the present invention has been described above. The search server provided by the embodiments of the present invention is described below with reference to the drawings.
Fig. 3 is a structural diagram of a search server device provided by an embodiment of the present invention. As shown in Fig. 3, the search server device includes:
a receiving module 301, configured to receive and save the content of the description data of each file in each class of business sent by a service terminal, wherein the description data is set by the service terminal for each class of business and includes the text fields that characterize the attributes of that business and their respective weights;
a configuration file generation module 302, configured to generate, for each class of business, a text relevance configuration file for that business according to its description data, wherein the text relevance configuration file configures a calculation method that computes, from the description data of that business, the text relevance score of any file in that business against any search keyword;
the receiving module 301 being further configured to receive a search request carrying a search keyword from a user terminal;
a computing module 303, configured to calculate the text relevance score of each file against the search keyword according to the text relevance configuration files of the classes of business and the saved description-data content of each file, wherein, for each file, the degree to which the content of the file's description data matches the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, yielding the text relevance score of the file against the search keyword, and the calculated text relevance score is taken as the total score of the file; and
a sending module 304, configured to sort the files by calculated total score from high to low and to send the user terminal the information of the top-ranked files, up to a first preset number.
The calculation method configured in the text relevance configuration file that the configuration file generation module 302 generates for each class of business specifically includes N sub-score formulas that characterize the text relevance between any file in the business and any search keyword, denoted f_1, f_2, ..., f_N. The N sub-score formulas score, each from a different angle, the degree to which the text fields of any file in the business match any search keyword. In addition, a proportion q is configured for each sub-score formula. The calculation formula for the text relevance score of any file in the business against any search keyword is then: [formula not reproduced in this text], where N is the number of text relevance sub-score formulas, M is the number of text fields in the description data of any file in the business, q_j is the proportion of the j-th text relevance sub-score formula f_j in the business, and p_i is the weight of the i-th text field of any file in the business.
The description data of each class of business received by the receiving module 301 further includes an authority field, which is a numerical value measuring the authority of any file in that business.
The configuration file generation module 302 is further configured, when generating the text relevance configuration file of each class of business, to generate an authority configuration file for calculating the authority score according to the authority field in the description data of that business.
The computing module 303 is further configured, when calculating the text relevance score of each file against the search keyword, to score the authority of each file according to the authority configuration file of the business to which the file belongs, yielding the authority score of the file;
and/or
the description data of each class of business received by the receiving module 301 further includes a time field, which is a numerical value measuring the timeliness of any file in that business.
The configuration file generation module 302 is further configured, when generating the text relevance configuration file of each class of business, to generate a timeliness configuration file for calculating the timeliness score according to the time field in the description data of that business.
The computing module 303 is further configured, when calculating the text relevance score of each file against the search keyword, to score the timeliness of each file according to the timeliness configuration file of the business to which the file belongs, yielding the timeliness score of the file.
The sum of the file's text relevance score and its authority score and/or timeliness score is taken as the total score of the file.
When the computing module 303 scores the authority of a file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file, this includes:
calculating the authority score according to the formula y = α × h(x), where α is a constant that keeps the authority score of the file at the same order of magnitude as the text relevance score, h() is a positively correlated function, and x is the content of the authority field of the file.
When the computing module 303 scores the timeliness of a file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file, this includes:
calculating the timeliness score according to the formula y = β × g(t), where β is a constant that keeps the timeliness score of the file at the same order of magnitude as the text relevance score, g() is a negatively correlated function, and t is the difference between the current time and the time field of the file.
The receiving module 301 is further configured to:
receive a customized text relevance sub-score formula for a specific type of business from the service terminal, and overwrite the text relevance sub-score formula in the text relevance configuration file of that business with the customized text relevance sub-score formula.
Fig. 4 is a schematic diagram of the hardware structure of a search server that implements adaptive search according to an embodiment of the present invention. The search server may include a processor 410, a memory 420, a port 430, and a bus 440. The processor 410 and the memory 420 are interconnected by the bus 440. The processor 410 can send and receive data through the port 430.
The processor 410 is configured to execute the machine-readable instruction modules stored in the memory 420.
The memory 420 stores machine-readable instruction modules executable by the processor 410, including a reading module 421 and an adjusting module 422. When the processor 410 executes the instructions of the reading module 421 and the adjusting module 422, the functions of the receiving module 301, the configuration file generation module 302, the computing module 303, and the sending module 304 described above can be implemented.
The apparatus and method embodiment provided by the above embodiment for realizing network security belongs to same design, implements
Process is detailed in embodiment of the method, and which is not described herein again.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, or each module may exist physically on its own, or two or more modules may be integrated into one unit. The integrated unit may be implemented in hardware or in the form of a software functional unit.
In addition, each embodiment of the present invention may be implemented by a data processing program executed by a data processing device such as a computer. Such a data processing program evidently constitutes the present invention. Furthermore, a data processing program is usually stored in a storage medium and is executed either by reading the program directly out of the storage medium or by installing or storing the program on the data processing device. The storage medium may be, for example, a paper storage medium (such as paper tape), a magnetic storage medium (such as a floppy disk, hard disk, or flash memory), an optical storage medium (such as a CD-ROM), or a magneto-optical storage medium (such as an MO).
The present invention therefore also provides a storage medium storing a data processing program, the data processing program being used to execute any embodiment of the above method of the present invention.
Those of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be completed by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (11)
1. An Internet-based adaptive search method, characterized in that it is applied to a search server and comprises:
receiving and saving the content of the description data of each file in each class of business sent by a service terminal, wherein the description data is set by the service terminal for each class of business and includes text fields characterizing the attributes of that class of business and their respective weights;
for each class of business, generating a text relevance configuration file for that class of business according to its description data, wherein the text relevance configuration file is configured with a calculation method for calculating, according to the description data of that class of business, the text relevance score between any file in that class of business and any search keyword;
receiving a search request carrying a search keyword sent by a user terminal, and calculating the text relevance score between each file and the search keyword according to the text relevance configuration files of the classes of business and the saved content of the description data of each file, wherein, for each file, the degree of matching between the content of the description data of the file and the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, so as to obtain the text relevance score between the file and the search keyword, and the calculated text relevance score is determined as the total score of the file; and
sorting the files from high to low by their calculated total scores, and sending to the user terminal the information of the files corresponding to the top-ranked first preset number of total scores.
2. The method according to claim 1, characterized in that
the calculation method configured in the text relevance configuration file of each class of business specifically includes: N sub-score formulas characterizing the text relevance between any file in that class of business and any search keyword, denoted f_1, f_2, …, f_N, the N sub-score formulas being used to score, from N different angles respectively, the degree of matching between the text fields of any file in that class of business and any search keyword; and a proportion q is configured for each sub-score formula, so that the text relevance score between any file in that class of business and any search keyword is calculated by a formula in which N is the number of text relevance sub-score formulas, M is the number of text fields in the description data of any file in that class of business, q_j is the proportion of the j-th text relevance sub-score formula f_j in that class of business, and p_i is the weight of the i-th text field of any file in that class of business.
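For illustration only: the formula itself is not reproduced in this text, so the sketch below assumes one reading that is consistent with the listed symbols, namely a weighted sum over the M text fields (weights p_i) and the N sub-score formulas (proportions q_j). The function name, the two example sub-score functions, and the sample values are all hypothetical.

```python
from typing import Callable, Sequence

# A sub-score formula maps (field text, search keyword) to a score.
SubScore = Callable[[str, str], float]


def text_relevance(fields: Sequence[str], field_weights: Sequence[float],
                   sub_scores: Sequence[SubScore], proportions: Sequence[float],
                   keyword: str) -> float:
    """Assumed form: sum over fields i and formulas j of p_i * q_j * f_j(field_i, keyword)."""
    total = 0.0
    for text, p in zip(fields, field_weights):        # i = 1..M
        for f, q in zip(sub_scores, proportions):      # j = 1..N
            total += p * q * f(text, keyword)
    return total


def contains(text: str, keyword: str) -> float:
    """Example sub-score: 1.0 when the keyword occurs anywhere in the field."""
    return 1.0 if keyword.lower() in text.lower() else 0.0


def exact(text: str, keyword: str) -> float:
    """Example sub-score: 1.0 when the field equals the keyword."""
    return 1.0 if text.lower() == keyword.lower() else 0.0


score = text_relevance(
    fields=["deep learning course", "intro"],  # M = 2 text fields
    field_weights=[2.0, 1.0],                  # p_1, p_2
    sub_scores=[contains, exact],              # f_1, f_2
    proportions=[0.7, 0.3],                    # q_1, q_2
    keyword="intro",
)
```

Under this assumed form, a higher field weight makes a match in that field count more, and a higher proportion makes one scoring angle dominate the others.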
3. The method according to claim 1, characterized in that the method further comprises:
the description data of each class of business further includes an authority field, the authority field being a numerical value measuring the authority of any file in that class of business;
for each class of business, when the text relevance configuration file of that class of business is generated, an authority configuration file for calculating an authority score is generated according to the authority field in the description data of that class of business;
when the text relevance score between each file and the search keyword is calculated, for each file, the authority of the file is scored according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file;
and/or
the description data of each class of business further includes a time field, the time field being a numerical value measuring the timeliness of any file in that class of business;
for each class of business, when the text relevance configuration file of that class of business is generated, a timeliness configuration file for calculating a timeliness score is generated according to the time field in the description data of that class of business;
when the text relevance score between each file and the search keyword is calculated, for each file, the timeliness of the file is scored according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file; and
the sum of the text relevance score of the file and its authority score and/or timeliness score is determined as the total score of the file.
4. The method according to claim 3, characterized in that
scoring the authority of the file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file, comprises:
calculating the authority score according to the formula y = α × h(x), where α is a constant that keeps the authority score of the file on the same order of magnitude as the text relevance score; h(·) is a positively correlated function; and x is the content of the authority field of the file; and
scoring the timeliness of the file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file, comprises:
calculating the timeliness score according to the formula y = β × g(t), where β is a constant that keeps the timeliness score of the file on the same order of magnitude as the text relevance score; g(·) is an inversely correlated function; and t is the difference between the current time and the time field of the file.
5. The method according to claim 2, characterized in that the method further comprises:
receiving a customized text relevance sub-score formula for a specific type of business sent by the service terminal, and overwriting the text relevance sub-score formula in the text relevance configuration file of that specific type of business with the customized text relevance sub-score formula.
6. A search server device, characterized in that the device comprises:
a receiving module, configured to receive and save the content of the description data of each file in each class of business sent by a service terminal, wherein the description data is set by the service terminal for each class of business and includes text fields characterizing the attributes of that class of business and their respective weights;
a configuration file generation module, configured to generate, for each class of business, a text relevance configuration file for that class of business according to its description data, wherein the text relevance configuration file is configured with a calculation method for calculating, according to the description data of that class of business, the text relevance score between any file in that class of business and any search keyword;
the receiving module being further configured to receive a search request carrying a search keyword sent by a user terminal;
a computing module, configured to calculate the text relevance score between each file and the search keyword according to the text relevance configuration files of the classes of business and the saved content of the description data of each file, wherein, for each file, the degree of matching between the content of the description data of the file and the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, so as to obtain the text relevance score between the file and the search keyword, and the calculated text relevance score is determined as the total score of the file; and
a sending module, configured to sort the files from high to low by their calculated total scores, and to send to the user terminal the information of the files corresponding to the top-ranked first preset number of total scores.
7. The device according to claim 6, characterized in that
the calculation method configured in the text relevance configuration file generated by the configuration file generation module for each class of business specifically includes: N sub-score formulas characterizing the text relevance between any file in that class of business and any search keyword, denoted f_1, f_2, …, f_N, the N sub-score formulas being used to score, from N different angles respectively, the degree of matching between the text fields of any file in that class of business and any search keyword; and a proportion q is configured for each sub-score formula, so that the text relevance score between any file in that class of business and any search keyword is calculated by a formula in which N is the number of text relevance sub-score formulas, M is the number of text fields in the description data of any file in that class of business, q_j is the proportion of the j-th text relevance sub-score formula f_j in that class of business, and p_i is the weight of the i-th text field of any file in that class of business.
8. The device according to claim 6, characterized in that
the description data of each class of business received by the receiving module further includes an authority field, the authority field being a numerical value measuring the authority of any file in that class of business;
the configuration file generation module is further configured to generate, for each class of business and when generating the text relevance configuration file of that class of business, an authority configuration file for calculating an authority score according to the authority field in the description data of that class of business;
the computing module is further configured to score, for each file and when calculating the text relevance score between each file and the search keyword, the authority of the file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file;
and/or
the description data of each class of business received by the receiving module further includes a time field, the time field being a numerical value measuring the timeliness of any file in that class of business;
the configuration file generation module is further configured to generate, for each class of business and when generating the text relevance configuration file of that class of business, a timeliness configuration file for calculating a timeliness score according to the time field in the description data of that class of business;
the computing module is further configured to score, for each file and when calculating the text relevance score between each file and the search keyword, the timeliness of the file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file; and
the sum of the text relevance score of the file and its authority score and/or timeliness score is determined as the total score of the file.
9. The device according to claim 8, characterized in that
when the computing module scores the authority of the file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file, it is configured to:
calculate the authority score according to the formula y = α × h(x), where α is a constant that keeps the authority score of the file on the same order of magnitude as the text relevance score; h(·) is a positively correlated function; and x is the content of the authority field of the file; and
when the computing module scores the timeliness of the file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file, it is configured to:
calculate the timeliness score according to the formula y = β × g(t), where β is a constant that keeps the timeliness score of the file on the same order of magnitude as the text relevance score; g(·) is an inversely correlated function; and t is the difference between the current time and the time field of the file.
10. The device according to claim 7, characterized in that the receiving module is further configured to:
receive a customized text relevance sub-score formula for a specific type of business sent by the service terminal, and overwrite the text relevance sub-score formula in the text relevance configuration file of that specific type of business with the customized text relevance sub-score formula.
11. An Internet-based adaptive search system, characterized in that the system comprises:
a search server according to any one of claims 6 to 10;
at least one service terminal, configured to send the content of the description data of each file of each class of business; and
at least one user terminal, configured to send a search request carrying a search keyword to the search server and to receive the information of the files returned by the search server.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510308566.5A CN106294487B (en) | 2015-06-08 | 2015-06-08 | Self-adapted search method, equipment and system Internet-based |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510308566.5A CN106294487B (en) | 2015-06-08 | 2015-06-08 | Self-adapted search method, equipment and system Internet-based |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106294487A CN106294487A (en) | 2017-01-04 |
CN106294487B true CN106294487B (en) | 2019-10-08 |
Family
ID=57659479
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510308566.5A Active CN106294487B (en) | 2015-06-08 | 2015-06-08 | Self-adapted search method, equipment and system Internet-based |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106294487B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110232138B (en) * | 2019-05-20 | 2022-05-20 | 中国银行股份有限公司 | Service guiding method, device and storage medium |
CN113343046B (en) * | 2021-05-20 | 2023-08-25 | 成都美尔贝科技股份有限公司 | Intelligent search ordering system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6240408B1 (en) * | 1998-06-08 | 2001-05-29 | Kcsl, Inc. | Method and system for retrieving relevant documents from a database |
CN101105815A (en) * | 2007-09-06 | 2008-01-16 | 腾讯科技(深圳)有限公司 | Internet music file sequencing method, system and search method and search engine |
CN103186574A (en) * | 2011-12-29 | 2013-07-03 | 北京百度网讯科技有限公司 | Method and device for generating searching result |
CN104572918A (en) * | 2014-12-26 | 2015-04-29 | 清华大学 | Online course searching method |
Also Published As
Publication number | Publication date |
---|---|
CN106294487A (en) | 2017-01-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102073699B (en) | For improving the method for Search Results, device and equipment based on user behavior | |
TWI636416B (en) | Method and system for multi-phase ranking for content personalization | |
US9589050B2 (en) | Semantic context based keyword search techniques | |
US20140108431A1 (en) | Correlated information recommendation | |
CN102855256B (en) | For determining the method, apparatus and equipment of Website Evaluation information | |
US20070233685A1 (en) | Displaying access rights on search results pages | |
WO2018121700A1 (en) | Method and device for recommending application information based on installed application, terminal device, and storage medium | |
EP2862105A1 (en) | Ranking search results based on click through rates | |
CN113673262A (en) | Machine translation between different languages using statistical streaming data | |
KR20180072222A (en) | Apparatus for recommending a book | |
WO2016202214A2 (en) | Method and device for displaying keyword | |
Li et al. | Exploiting rich user information for one-class collaborative filtering | |
CN110766489A (en) | Method for requesting content and providing content and corresponding device | |
CN104123321B (en) | A kind of determining method and device for recommending picture | |
US20150169579A1 (en) | Associating entities based on resource associations | |
Pal | An efficient system using implicit feedback and lifelong learning approach to improve recommendation | |
WO2012115254A1 (en) | Search device, search method, search program, and computer-readable memory medium for recording search program | |
CN106294487B (en) | Self-adapted search method, equipment and system Internet-based | |
US10191988B2 (en) | System and method for returning prioritized content | |
US20150134632A1 (en) | Search method | |
CN110262906B (en) | Interface label recommendation method and device, storage medium and electronic equipment | |
CN109948040A (en) | Storage, recommended method and the system of object information, equipment and storage medium | |
US20170177580A1 (en) | Title standardization ranking algorithm | |
Wang et al. | A personalization-oriented academic literature recommendation method | |
Zhang et al. | Estimating online review helpfulness with probabilistic distribution and confidence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |