
CN106294487B - Self-adapted search method, equipment and system Internet-based - Google Patents

Self-adapted search method, equipment and system Internet-based

Info

Publication number
CN106294487B
Authority
CN
China
Prior art keywords
business
file
score value
document
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510308566.5A
Other languages
Chinese (zh)
Other versions
CN106294487A (en
Inventor
马中团
康战辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201510308566.5A
Publication of CN106294487A
Application granted
Publication of CN106294487B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3344: Query execution using natural language analysis
    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/951: Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses an Internet-based adaptive search method applied to a search server. The method comprises: receiving and saving the content of the description data of each file in each class of business sent by a service terminal; for each class of business, generating a text relevance configuration file for that class according to its description data; receiving a search request carrying a search keyword from a user terminal; calculating the text relevance score between each file and the search keyword according to the text relevance configuration files of the classes of business and the saved content of each file's description data, and taking the calculated text relevance score as the total score of the file; and sorting the files by total score from high to low and sending the information of the files corresponding to the top preset number of total scores to the user terminal.

Description

Internet-based adaptive search method, device and system
Technical field
The present invention relates to the field of Internet technology, and in particular to an Internet-based adaptive search method, device and system.
Background
In search based on unstructured text, widely varying businesses are uniformly treated as text files made up of fields, and every field is given the same weight. This approach is general-purpose: it does not distinguish between business types and applies the same search method and ranking method to all of them, for example searching films, novels and pictures in the same way. New businesses can therefore be brought online quickly, and the maintenance cost of the search system is kept to a minimum.
Summary of the invention
The embodiments of the present invention provide an Internet-based adaptive search method, device and system to improve the search performance of a search server.
The technical solution of the embodiments of the present invention is realized as follows:
An Internet-based adaptive search method, applied to a search server, comprising:
receiving and saving the content of the description data of each file in each class of business sent by a service terminal, where the description data is set by the service terminal for each class of business and includes the text fields that characterize the attributes of that class of business and their respective weights;
for each class of business, generating the text relevance configuration file of that class according to its description data, where the text relevance configuration file configures the method for calculating, from the description data of that class of business, the text relevance score between any file in that class and any search keyword;
receiving a search request carrying a search keyword sent by a user terminal, and calculating the text relevance score between each file and the search keyword according to the text relevance configuration files of the classes of business and the saved content of the description data of each file, where, for each file, the degree of match between the content of the file's description data and the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, so as to obtain the text relevance score between the file and the search keyword, and the calculated text relevance score is taken as the total score of the file; and
sorting the files from high to low by their calculated total scores, and sending the information of the files corresponding to the top preset number of total scores to the user terminal.
A search server device, comprising:
a receiving module, configured to receive and save the content of the description data of each file in each class of business sent by a service terminal, where the description data is set by the service terminal for each class of business and includes the text fields that characterize the attributes of that class of business and their respective weights;
a configuration file generation module, configured to generate, for each class of business, the text relevance configuration file of that class according to its description data, where the text relevance configuration file configures the method for calculating, from the description data of that class of business, the text relevance score between any file in that class and any search keyword;
the receiving module being further configured to receive a search request carrying a search keyword sent by a user terminal;
a computing module, configured to calculate the text relevance score between each file and the search keyword according to the text relevance configuration files of the classes of business and the saved content of the description data of each file, where, for each file, the degree of match between the content of the file's description data and the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, so as to obtain the text relevance score between the file and the search keyword, and the calculated text relevance score is taken as the total score of the file;
a sending module, configured to sort the files from high to low by their calculated total scores and to send the information of the files corresponding to the top preset number of total scores to the user terminal.
An Internet-based adaptive search system, comprising:
a search server, at least one service terminal and at least one user terminal;
the service terminal being configured to send the content of the description data of each file of each class of business;
the user terminal being configured to send a search request carrying a search keyword to the search server and to receive the file information returned by the search server.
With the method, device and system provided by the embodiments of the present invention, the search server receives and saves the content of the description data of each file of each class of business and then automatically generates a relevance configuration file for each class of business. When the search server receives a search request sent by a user terminal, it matches the search keyword carried in the request against the description fields of the saved files of each class of business in turn, scores each file according to the matching result and the automatically generated relevance configuration file, ranks the files by score, and presents the information of the top-ranked preset number of files to the user terminal, completing the search task. The method adapts to different types of business and customizes the weights of the file description fields according to the business type, which improves the search performance of the search server.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a structure diagram of an Internet-based adaptive search system;
Fig. 2 is a flow chart of the technical solution of the present invention;
Fig. 3 is a structure diagram of the search server device in an embodiment of the present invention;
Fig. 4 is a hardware structure diagram of the search server provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To be general-purpose, search methods in the prior art normalize different types of business into plain-text documents, for example treating film and television files and novel files as the same kind of text document, and apply the same search method and ranking method to all types of business. In arriving at the present invention, the inventors found that such a general-purpose search method cannot achieve the best relevance. Different types of business weight the description fields that characterize their attributes differently: in the film and television business the actor description field is more important and the screenwriter description field less so, whereas in the novel business the author description field is more important and the publication year description field less so. The differences are obvious, yet the prior art treats the weights of the different fields of different businesses as identical, so the search system cannot give the user the best search results.
To solve the above problems, the present invention proposes an Internet-based adaptive search method, device and system. The technical solution of the embodiments of the present invention is as follows:
Fig. 1 is a networking structure diagram of the technical solution of the present invention. As shown in Fig. 1, the search system includes one search server, service terminal 1, service terminal 2 and user terminal 1. In practice there may be multiple service terminals and multiple user terminals; for ease of description, Fig. 1 shows only two service terminals and one user terminal as an example. Fig. 2 is a flow chart of the technical solution of the present invention. As shown in Fig. 2, it comprises the following steps:
Step 201: The search server receives and saves the content of the description data of each file in each class of business sent by the service terminals. The description data is set by the service terminal for each class of business and includes the text fields that characterize the attributes of that class of business and their respective weights.
In this step, assume that the business corresponding to service terminal 1 is the novel business and the business corresponding to service terminal 2 is the film and television business.
The search server receives and saves the content of the description data of each file of the novel business sent by service terminal 1, assumed to be as shown in Table 1. The search server also receives and saves the content of the description data of each file of the film and television business sent by service terminal 2, assumed to be as shown in Table 2.
Table 1
Table 2
As shown in Table 1, the text fields of the novel business are author, task and topic. Service terminal 1 sends the content of these text fields for all files of the novel business to the search server to be saved, and the weight that each file in the novel business assigns to each text field is controlled by service terminal 1. The film and television business shown in Table 2 is similar to the novel business and is not described in detail.
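The description data saved in step 201 can be pictured as one record per file, holding the text field contents and the per-field weights set by the service terminal. The following Python sketch is only an illustration under stated assumptions: the class names `FileDescription` and `DescriptionStore`, the in-memory storage and all field values and weights are hypothetical, since Tables 1 and 2 are not reproduced in this text.

```python
from dataclasses import dataclass


@dataclass
class FileDescription:
    """Description data of one file (hypothetical layout, not from the patent)."""
    business: str             # class of business, e.g. "novel" or "video"
    content: dict[str, str]   # text field name -> content of that field
    weights: dict[str, float] # text field name -> weight set by the service terminal


class DescriptionStore:
    """Hypothetical in-memory stand-in for the data saved by the search server."""

    def __init__(self) -> None:
        self._by_business: dict[str, list[FileDescription]] = {}

    def save(self, desc: FileDescription) -> None:
        # Each text field must come with exactly one weight.
        if set(desc.content) != set(desc.weights):
            raise ValueError("every text field needs exactly one weight")
        self._by_business.setdefault(desc.business, []).append(desc)

    def files(self, business: str) -> list[FileDescription]:
        return list(self._by_business.get(business, []))


store = DescriptionStore()
store.save(FileDescription(
    business="novel",
    # Placeholder values: the actual table contents are not in the source text.
    content={"author": "Author A", "task": "Task A", "topic": "Topic A"},
    weights={"author": 0.5, "task": 0.2, "topic": 0.3},
))
```

Keeping the weights next to the content mirrors the statement that the weight of each text field is controlled by the service terminal rather than by the search server.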
Step 202: For each class of business, the search server generates the text relevance configuration file of that class according to its description data. The text relevance configuration file configures the method for calculating, from the description data of that class of business, the text relevance score between any file in that class and any search keyword.
The calculation method configured in the text relevance configuration file in this step specifically includes N sub-score formulas that characterize the text relevance between any file in the class of business and any search keyword, denoted $f_1, f_2, \ldots, f_N$. The N sub-score formulas score, from N different angles, the degree of match between the text fields of any file in the class and any search keyword. A proportion q is also configured for each sub-score formula, so the text relevance score between any file in the class and any search keyword is calculated as:
$$\text{score} = \sum_{j=1}^{N} \sum_{i=1}^{M} q_j \, p_i \, f_j$$
where $f_j$ is evaluated on the i-th text field, N is the number of text relevance sub-score formulas, M is the number of text fields in the description data of any file in the class of business, $q_j$ is the proportion of the j-th text relevance sub-score formula $f_j$ in the class of business, and $p_i$ is the weight of the i-th text field of any file in the class of business.
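The weighted double sum described above can be sketched in a few lines of Python. This is an illustrative reading, not the patent's implementation: the function name `text_relevance`, the two toy sub-score formulas `exact` and `contains`, and all example values are assumptions introduced here.

```python
from typing import Callable, Mapping, Sequence

# A sub-score formula takes (field content, search keyword) and returns a score.
SubScore = Callable[[str, str], float]


def text_relevance(content: Mapping[str, str],
                   weights: Mapping[str, float],
                   sub_scores: Sequence[SubScore],
                   proportions: Sequence[float],
                   keyword: str) -> float:
    """Sum q_j * p_i * f_j over all N sub-score formulas and all M text fields."""
    if len(sub_scores) != len(proportions):
        raise ValueError("one proportion q_j is needed per sub-score formula f_j")
    total = 0.0
    for f_j, q_j in zip(sub_scores, proportions):
        for field_name, field_text in content.items():
            p_i = weights[field_name]
            total += q_j * p_i * f_j(field_text, keyword)
    return total


def exact(field_text: str, keyword: str) -> float:
    # Hypothetical sub-score: 1 if the field equals the keyword, else 0.
    return 1.0 if field_text == keyword else 0.0


def contains(field_text: str, keyword: str) -> float:
    # Hypothetical sub-score: 1 if the keyword occurs inside the field, else 0.
    return 1.0 if keyword in field_text else 0.0


score = text_relevance(
    content={"a": "foo", "b": "foo bar"},
    weights={"a": 0.5, "b": 0.5},
    sub_scores=[exact, contains],
    proportions=[0.3, 0.6],
    keyword="foo",
)
# exact contributes 0.3 * 0.5 for field "a"; contains contributes 0.6 * 0.5 for each field.
print(round(score, 6))
```

Passing the sub-score formulas as plain callables keeps the per-business configuration separate from the summation itself, which is one way to let different classes of business reuse the same scoring loop.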
Take the text relevance configuration file of the novel business as an example. It defines three sub-score formulas for the text relevance between any file in the novel business and any search keyword, namely $f_1$, $f_2$ and $f_3$. $f_1$ traverses each text field of each novel file and measures its exact match with the search keyword: if the match is exact, $f_1 = 1$; otherwise $f_1 = 0$. $f_2$ traverses each text field of each novel file and measures its fuzzy match with the search keyword: if there is a fuzzy match, $f_2 = 1$; if the two are completely unrelated, $f_2 = 0$. $f_3$ measures the compactness of the search keyword, that is, the exact match between the segmented words obtained by splitting the keyword and each text field of the novel file: it traverses each text field of each novel file, and if a text field exactly matches a segmented word of the search keyword, $f_3 = 1$; otherwise $f_3 = 0$. These three sub-score formulas score, from three different angles, the degree of match between the text fields of any file in the novel business and any search keyword. The weight of each sub-score is the product of the proportion of its sub-score formula and the weight of the text field that the sub-score involves.
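In addition, a proportion q is configured for each sub-score formula; for example, the proportion of $f_1$ is 0.3, that of $f_2$ is 0.6 and that of $f_3$ is 0.4. The text relevance score between any file in the novel business and any search keyword is then calculated as:
$$\text{score} = \sum_{i=1}^{M} p_i \,\bigl(0.3\, f_1 + 0.6\, f_2 + 0.4\, f_3\bigr)$$
where $f_1$, $f_2$ and $f_3$ are evaluated on the i-th text field, M is the number of text fields of the novel file and $p_i$ is the weight of the i-th text field.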
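A worked instance of the novel-business scoring may help. The sketch below uses the proportions 0.3, 0.6 and 0.4 from the text, but everything else is an assumption made for illustration: fuzzy matching is approximated by substring containment in either direction, word segmentation by whitespace splitting (real Chinese text would need a proper segmenter), and the field contents and field weights are invented placeholders.

```python
def f1(field_text: str, keyword: str) -> float:
    """Exact match between a text field and the whole keyword."""
    return 1.0 if field_text == keyword else 0.0


def f2(field_text: str, keyword: str) -> float:
    """Fuzzy match, approximated here (assumption) by substring containment."""
    return 1.0 if keyword in field_text or field_text in keyword else 0.0


def f3(field_text: str, keyword: str) -> float:
    """Exact match between a text field and any segmented word of the keyword.
    Whitespace splitting stands in for a real word segmenter (assumption)."""
    return 1.0 if field_text in keyword.split() else 0.0


PROPORTIONS = (0.3, 0.6, 0.4)  # q for f1, f2, f3, as given in the text


def novel_score(content: dict[str, str], weights: dict[str, float], keyword: str) -> float:
    """Sum over text fields of p_i * (0.3*f1 + 0.6*f2 + 0.4*f3)."""
    total = 0.0
    for name, text in content.items():
        q1, q2, q3 = PROPORTIONS
        per_field = q1 * f1(text, keyword) + q2 * f2(text, keyword) + q3 * f3(text, keyword)
        total += weights[name] * per_field
    return total


# Placeholder novel file; field values and weights are hypothetical.
novel = {"author": "alice", "task": "quest", "topic": "dragon saga"}
field_weights = {"author": 0.5, "task": 0.2, "topic": 0.3}

two_word_query = novel_score(novel, field_weights, "alice dragon")
one_word_query = novel_score(novel, field_weights, "alice")
```

For the query "alice dragon" only the author field scores: it fails the exact match but passes the fuzzy and segmented-word matches, giving 0.5 × (0.6 + 0.4). For the query "alice" the author field passes all three formulas, giving 0.5 × 1.3.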
The text relevance configuration file of the film and television business is similar to that of the novel business; only the design of the sub-score formulas differs, so it is not described in detail here.
Step 203: The search server receives a search request carrying a search keyword sent by a user terminal and, according to the text relevance configuration files of the classes of business and the saved content of the description data of each file, calculates the text relevance score between each file and the received search keyword. For each file, the degree of match between the content of the file's description data and the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, so as to obtain the text relevance score between the file and the search keyword, and the calculated text relevance score is taken as the total score of the file.
In this step, the search server receives the search request carrying the search keyword sent by the user terminal, uses the method for calculating the text relevance score in the text relevance configuration files of step 202 to calculate the text relevance score between each file in each class of business and the search keyword, and takes the calculated text relevance score as the total score of the file.
Step 204: The search server sorts the files from high to low by their calculated total scores and sends the information of the files corresponding to the top preset number of total scores to the user terminal.
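Step 204 amounts to a top-k selection over the total scores. A minimal sketch, assuming the total scores are already available as a mapping from a hypothetical file identifier to its score; the function name `top_results` and the sample scores are illustrative only.

```python
import heapq


def top_results(total_scores: dict[str, float], preset_quantity: int) -> list[str]:
    """Return the file identifiers with the highest total scores, best first."""
    ranked = heapq.nlargest(preset_quantity, total_scores.items(), key=lambda kv: kv[1])
    return [file_id for file_id, _ in ranked]


# Hypothetical total scores for three files.
scores = {"file_a": 0.2, "file_b": 0.9, "file_c": 0.5}
best_two = top_results(scores, preset_quantity=2)
```

`heapq.nlargest` avoids sorting the whole collection when the preset number is small relative to the number of files; a full `sorted(..., reverse=True)` followed by slicing would give the same ranking.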
In another embodiment of the present invention, the description data of each class of business that the service terminal sends to the search server further includes an authority field, which is a numerical value measuring the authority of any file in that class of business.
For each class of business, when the text relevance configuration file of that class is generated, an authority configuration file for calculating the authority score is generated according to the authority field in the description data of that class. When the text relevance score between each file and the search keyword is calculated, the authority of each file is scored according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file;
and/or
the description data of each class of business that the service terminal sends to the search server further includes a time field, which is a numerical value measuring the timeliness of any file in that class of business.
For each class of business, when the text relevance configuration file of that class is generated, a timeliness configuration file for calculating the timeliness score is generated according to the time field in the description data of that class.
When the text relevance score between each file and the search keyword is calculated, the timeliness of each file is scored according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file.
The sum of the text relevance score of the file and its authority score and/or timeliness score is taken as the total score of the file.
Scoring the authority of the file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file, comprises:
calculating the authority score according to the formula y = α × h(x), where α is a constant that keeps the authority score of the file at the same order of magnitude as the text relevance score, h(·) is a positively correlated function, and x is the content of the authority field of the file.
Scoring the timeliness of the file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file, comprises:
calculating the timeliness score according to the formula y = β × g(t), where β is a constant that keeps the timeliness score of the file at the same order of magnitude as the text relevance score, g(·) is a negatively correlated function, and t is the difference between the current time and the time field of the file.
The constant coefficients α and β ensure that the text relevance score, authority score and timeliness score of the same business file do not become unbalanced. For example, if the text relevance score of a file is 2, its authority score is 90 and its timeliness score is 0.01, the three scores differ in order of magnitude, which is an unbalanced case.
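The two formulas y = α × h(x) and y = β × g(t) leave the choice of h and g open. The sketch below picks one positively correlated function and one negatively correlated function purely as assumptions: a logarithm for authority and an exponential half-life decay for timeliness. The function names, default constants and the 30-day half-life are hypothetical and would need tuning so that the three scores share an order of magnitude.

```python
import math
from typing import Optional


def authority_score(x: float, alpha: float = 1.0) -> float:
    """y = alpha * h(x), with h chosen here (assumption) as log(1 + x), which grows with x."""
    return alpha * math.log1p(x)


def timeliness_score(age_days: float, beta: float = 1.0, half_life_days: float = 30.0) -> float:
    """y = beta * g(t), with g chosen here (assumption) as a half-life decay in the file's age t."""
    return beta * 0.5 ** (age_days / half_life_days)


def total_score(text_relevance: float,
                authority: Optional[float] = None,
                timeliness: Optional[float] = None) -> float:
    """Sum of the text relevance score and whichever optional scores are configured."""
    return text_relevance + (authority or 0.0) + (timeliness or 0.0)


# A file whose raw authority value is large: alpha scales the score down toward
# the range of the text relevance score. All numbers are illustrative.
combined = total_score(
    text_relevance=2.0,
    authority=authority_score(x=1000.0, alpha=0.3),
    timeliness=timeliness_score(age_days=30.0, beta=2.0),
)
```

With these choices a 30-day-old file keeps half of β as its timeliness score, and α = 0.3 brings an authority value of 1000 down to roughly 2, comparable to a text relevance score of 2.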
The authority field is added because two files may have the same text relevance score while only one of them is clearly the result the user wants. For example, a user wants to search for "Ma Yun", the founder of Alibaba, and the search server finds two people named "Ma Yun": one an ordinary person and one the founder of Alibaba. If only text relevance is considered, the search server sends both results to the user terminal, and the search result is unsatisfactory. With the authority field added, the authority of the Alibaba founder "Ma Yun" is necessarily higher than that of the ordinary "Ma Yun", so the results obtained by combining text relevance with authority are closer to what the user's search request is asking for.
The timeliness field is added because different files may have close or identical text relevance yet differ widely in time. When a user searches for news files, for example, the time of the results clearly matters more: after an earthquake somewhere, a user who wants to follow the latest developments searches for "earthquake", and there are many files related to "earthquake". If the search server can take the timeliness score of each file into account when returning results to the user, search performance improves greatly.
In another embodiment of the present invention, the search server receives a customized text relevance sub-score formula for a specific type of business sent by the service terminal, and uses it to overwrite the default text relevance sub-score formula in the text relevance configuration file of that type of business. This provides the service terminal with a customized score calculation service: a sub-score formula for text relevance that the service terminal customizes according to the characteristics of its own business can match the user's search request more accurately.
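Overwriting a default sub-score formula with a customized one can be sketched as replacing an entry in a per-business table of callables. The class `RelevanceConfig`, the formula key `"f1"` and both formulas below are hypothetical names introduced for illustration; the source does not specify how formulas are transmitted or stored.

```python
from typing import Callable, Dict

SubScore = Callable[[str, str], float]


def default_exact(field_text: str, keyword: str) -> float:
    """Default sub-score formula (assumption): case-sensitive exact match."""
    return 1.0 if field_text == keyword else 0.0


class RelevanceConfig:
    """Hypothetical in-memory form of one business's text relevance configuration."""

    def __init__(self, business: str) -> None:
        self.business = business
        self.sub_scores: Dict[str, SubScore] = {"f1": default_exact}

    def override(self, name: str, custom: SubScore) -> None:
        # Only an existing default formula may be overwritten.
        if name not in self.sub_scores:
            raise KeyError(f"no default sub-score formula named {name!r}")
        self.sub_scores[name] = custom


def case_insensitive_exact(field_text: str, keyword: str) -> float:
    """Customized formula a service terminal might supply (illustrative)."""
    return 1.0 if field_text.casefold() == keyword.casefold() else 0.0


config = RelevanceConfig("novel")
before = config.sub_scores["f1"]("Alice", "alice")
config.override("f1", case_insensitive_exact)
after = config.sub_scores["f1"]("Alice", "alice")
```

Rejecting unknown formula names is a design choice made here so that a customized formula always replaces a default rather than silently adding a new scoring angle.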
The Internet-based adaptive search method proposed by the embodiments of the present invention has been described above. The search server provided by the embodiments of the present invention is described below with reference to the drawings.
Fig. 3 is a structure diagram of a search server device provided by an embodiment of the present invention. As shown in Fig. 3, the search server device includes:
a receiving module 301, configured to receive and save the content of the description data of each file in each class of business sent by a service terminal, where the description data is set by the service terminal for each class of business and includes the text fields that characterize the attributes of that class of business and their respective weights;
a configuration file generation module 302, configured to generate, for each class of business, the text relevance configuration file of that class according to its description data, where the text relevance configuration file configures the method for calculating, from the description data of that class of business, the text relevance score between any file in that class and any search keyword;
the receiving module 301 being further configured to receive a search request carrying a search keyword sent by a user terminal;
a computing module 303, configured to calculate the text relevance score between each file and the search keyword according to the text relevance configuration files of the classes of business and the saved content of the description data of each file, where, for each file, the degree of match between the content of the file's description data and the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, so as to obtain the text relevance score between the file and the search keyword, and the calculated text relevance score is taken as the total score of the file;
a sending module 304, configured to sort the files from high to low by their calculated total scores and to send the information of the files corresponding to the top preset number of total scores to the user terminal.
The calculation method configured in the text relevance configuration file that the configuration file generation module 302 generates for each class of business specifically includes N sub-score formulas that characterize the text relevance between any file in the class of business and any search keyword, denoted $f_1, f_2, \ldots, f_N$. The N sub-score formulas score, from N different angles, the degree of match between the text fields of any file in the class and any search keyword. A proportion q is also configured for each sub-score formula, so the text relevance score between any file in the class and any search keyword is calculated as:
$$\text{score} = \sum_{j=1}^{N} \sum_{i=1}^{M} q_j \, p_i \, f_j$$
where $f_j$ is evaluated on the i-th text field, N is the number of text relevance sub-score formulas, M is the number of text fields in the description data of any file in the class of business, $q_j$ is the proportion of the j-th text relevance sub-score formula $f_j$ in the class of business, and $p_i$ is the weight of the i-th text field of any file in the class of business.
It further include technorati authority field, the authority in the description data for every class business that the receiving module 301 receives Spending field is to measure any authoritative numerical value of file in such business;
The configuration file generation module 302 is directed to every class business, configures text in the text relevant for generating such business It is also used to when part, the power for calculating authoritative score value is generated according to the technorati authority field in the description data of such business Prestige degree configuration file;
The computing module 303 is also used to when calculating text relevant score value of each file with described search keyword, It is given a mark to the technorati authority of this document, in terms of for each file according to the technorati authority configuration file of the affiliated business of this document Calculation obtains the authoritative score value of this document;
And/or
It further include time field in the description data for every class business that the receiving module 301 receives, the time word Section is to measure the numerical value of any file timeliness n in such business;
The configuration file generation module 302 is directed to every class business, configures text in the text relevant for generating such business It is also used to when part, is generated according to the time field in the description data of such business for calculating the stylish of timeliness n score value Property configuration file;
The computing module 303 is also used to when calculating text relevant score value of each file with described search keyword, It is given a mark to the timeliness n of this document, in terms of for each file according to the timeliness n configuration file of the affiliated business of this document Calculation obtains the timeliness n score value of this document;
The sum of the text relevant score value of this document, technorati authority score value and/or timeliness n score value is determined as this document Total score.
When scoring the authority of a file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file, the computing module 303 is configured to:
Calculate the authority score according to the formula y = α × h(x), where α is a constant that keeps the authority score of the file in the same order of magnitude as the text relevance score; h(·) is a positively correlated function; and x is the content of the authority field of the file;
When scoring the timeliness of a file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file, the computing module 303 is configured to:
Calculate the timeliness score according to the formula y = β × g(t), where β is a constant that keeps the timeliness score of the file in the same order of magnitude as the text relevance score; g(·) is an inversely correlated function; and t is the difference between the current time and the time field of the file;
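The two formulas can be illustrated with a short Python sketch. The text only requires h(·) to be positively correlated and g(·) to be inversely correlated, so the concrete choices here, h(x) = log(1 + x) and a half-life decay for g(t), are assumptions made for illustration, as are the function names, the default values of `alpha` and `beta`, and the `half_life` parameter.

```python
import math


def authority_score(x: float, alpha: float = 1.0) -> float:
    """y = alpha * h(x), with the assumed h(x) = log(1 + x).

    x is the numeric content of the authority field (for example a play
    count); negative values are clamped to 0 so that log1p stays defined.
    """
    return alpha * math.log1p(max(0.0, x))


def timeliness_score(now: float, file_time: float,
                     beta: float = 1.0, half_life: float = 86400.0) -> float:
    """y = beta * g(t), with the assumed g(t) = 0.5 ** (t / half_life).

    t is the difference between the current time and the time field of
    the file, in seconds; files dated in the future are treated as t = 0.
    """
    t = max(0.0, now - file_time)
    return beta * 0.5 ** (t / half_life)
```

In practice `alpha` and `beta` would be tuned per business so that both scores land in the same order of magnitude as the text relevance score, which is the role the text assigns to the two constants.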
The receiving module 301 is further configured to:
Receive a customized text relevance sub-score formula sent by the service terminal for a specific type of business, and overwrite, with the customized text relevance sub-score formula, the text relevance sub-score formula in the text relevance configuration file of that specific type of business.
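One way to picture the override is a per-business table of sub-score formulas in which a customized formula replaces the default entry for one business only. The sketch below is an assumption-laden illustration: the business names, the formula name `"overlap"`, the two example formulas and the helper `override_sub_score` are all hypothetical and do not come from the patent text.

```python
from typing import Callable, Dict

SubScore = Callable[[str, str], float]  # (field_text, keyword) -> score


def term_overlap(field_text: str, keyword: str) -> float:
    """Default formula: fraction of keyword terms found in the field."""
    terms = keyword.lower().split()
    if not terms:
        return 0.0
    words = set(field_text.lower().split())
    return sum(1 for term in terms if term in words) / len(terms)


def exact_match(field_text: str, keyword: str) -> float:
    """Customized formula: 1.0 only when field and keyword are identical."""
    return 1.0 if field_text.strip().lower() == keyword.strip().lower() else 0.0


# Hypothetical text relevance configuration: business -> named sub-score formulas.
relevance_configs: Dict[str, Dict[str, SubScore]] = {
    "music": {"overlap": term_overlap},
    "video": {"overlap": term_overlap},
}


def override_sub_score(business: str, name: str, formula: SubScore) -> None:
    """Overwrite one sub-score formula in the configuration of one business."""
    relevance_configs[business][name] = formula


# The service terminal of the "music" business sends its own formula.
override_sub_score("music", "overlap", exact_match)
```

Only the configuration of the named business changes; other businesses keep the generated default.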
Fig. 4 is a schematic diagram of the hardware structure of a search server that implements adaptive search according to an embodiment of the present invention. The search server may include a processor 410, a memory 420, a port 430 and a bus 440. The processor 410 and the memory 420 are interconnected by the bus 440. The processor 410 can send and receive data through the port 430.
The processor 410 is configured to execute the machine-readable instruction modules stored in the memory 420.
The memory 420 stores machine-readable instruction modules executable by the processor 410, including a reading module 421 and an adjustment module 422. When the processor 410 executes the instructions of the reading module 421 and the adjustment module 422, the functions of the receiving module 301, the configuration file generation module 302, the computing module 303 and the sending module 304 described above can be realized.
The device embodiments provided above belong to the same concept as the method embodiments; for the specific implementation process, see the method embodiments, which are not repeated here.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, each module may exist physically on its own, or two or more modules may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
In addition, each embodiment of the present invention may be implemented by a data processing program executed by a data processing device such as a computer. Obviously, such a data processing program constitutes the present invention. A data processing program is usually stored in a storage medium and is executed either by reading the program directly out of the storage medium or by installing or copying the program to a storage device of the data processing device. The storage medium may be a paper storage medium (such as paper tape), a magnetic storage medium (such as a floppy disk, hard disk or flash memory), an optical storage medium (such as a CD-ROM), a magneto-optical storage medium (such as an MO), or the like.
Therefore, the present invention also provides a storage medium in which a data processing program is stored, the data processing program being used to execute any embodiment of the above method of the present invention.
Those of ordinary skill in the art will understand that all or some of the steps of the above embodiments may be carried out by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing is merely the preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent substitution, improvement and the like made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (11)

1. An Internet-based adaptive search method, characterized in that it is applied to a search server, the method comprising:
Receiving and saving the content of the description data of each file in each class of business sent by a service terminal, wherein the description data is set by the service terminal for each class of business and includes the text fields characterizing the attributes of that class of business and their respective weights;
For each class of business, generating the text relevance configuration file of that class of business according to its description data, wherein the text relevance configuration file is configured with a calculation method for calculating, according to the description data of that class of business, the text relevance score between any file in that class of business and any search keyword;
Receiving a search request carrying a search keyword sent by a user terminal, and calculating the text relevance score between each file and the search keyword according to the text relevance configuration files of the classes of business and the saved content of the description data of each file, wherein, for each file, the degree of matching between the content of the description data of the file and the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, so as to obtain the text relevance score between the file and the search keyword, and the calculated text relevance score is determined as the total score of the file; sorting the files from high to low by their calculated total scores, and sending the information of the files corresponding to the top-ranked first preset number of total scores to the user terminal.
2. The method according to claim 1, characterized in that
The calculation method configured in the text relevance configuration file of each class of business specifically includes: N sub-score formulas characterizing the text relevance between any file in that class of business and any search keyword, denoted f_1, f_2, …, f_N respectively, the N sub-score formulas being used to score, from N different angles, the degree of matching between the text fields of any file in that class of business and any search keyword; and a proportion q is configured for each sub-score formula, so that the text relevance score between any file in that class of business and any search keyword is calculated by the formula: [formula image missing], where N is the number of text relevance sub-score formulas, M is the number of text fields in the description data of any file in that class of business, q_j is the proportion of the j-th text relevance sub-score formula f_j in that class of business, and p_i is the weight of the i-th text field of any file in that class of business.
3. The method according to claim 1, characterized in that the method further comprises:
The description data of each class of business further includes an authority field, the authority field being a numerical value that measures the authority of any file in that class of business;
For each class of business, when generating the text relevance configuration file of that class of business, generating an authority configuration file for calculating authority scores according to the authority field in the description data of that class of business; and, when calculating the text relevance score between each file and the search keyword, for each file, scoring the authority of the file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file;
And/or
The description data of each class of business further includes a time field, the time field being a numerical value that measures the timeliness of any file in that class of business;
For each class of business, when generating the text relevance configuration file of that class of business, generating a timeliness configuration file for calculating timeliness scores according to the time field in the description data of that class of business;
When calculating the text relevance score between each file and the search keyword, for each file, scoring the timeliness of the file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file;
Determining the sum of the text relevance score, the authority score and/or the timeliness score of the file as the total score of the file.
4. The method according to claim 3, characterized in that
Scoring the authority of the file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file, comprises:
Calculating the authority score according to the formula y = α × h(x), where α is a constant that keeps the authority score of the file in the same order of magnitude as the text relevance score; h(·) is a positively correlated function; and x is the content of the authority field of the file;
Scoring the timeliness of the file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file, comprises:
Calculating the timeliness score according to the formula y = β × g(t), where β is a constant that keeps the timeliness score of the file in the same order of magnitude as the text relevance score; g(·) is an inversely correlated function; and t is the difference between the current time and the time field of the file.
5. The method according to claim 2, characterized in that the method further comprises:
Receiving a customized text relevance sub-score formula sent by the service terminal for a specific type of business, and overwriting, with the customized text relevance sub-score formula, the text relevance sub-score formula in the text relevance configuration file of that specific type of business.
6. A search server device, characterized in that the device comprises:
A receiving module, configured to receive and save the content of the description data of each file in each class of business sent by a service terminal, wherein the description data is set by the service terminal for each class of business and includes the text fields characterizing the attributes of that class of business and their respective weights;
A configuration file generation module, configured to generate, for each class of business, the text relevance configuration file of that class of business according to its description data, wherein the text relevance configuration file is configured with a calculation method for calculating, according to the description data of that class of business, the text relevance score between any file in that class of business and any search keyword;
The receiving module is further configured to receive a search request carrying a search keyword sent by a user terminal;
A computing module, configured to calculate the text relevance score between each file and the search keyword according to the text relevance configuration files of the classes of business and the saved content of the description data of each file, wherein, for each file, the degree of matching between the content of the description data of the file and the search keyword is scored according to the text relevance configuration file of the business to which the file belongs, so as to obtain the text relevance score between the file and the search keyword, and the calculated text relevance score is determined as the total score of the file;
A sending module, configured to sort the files from high to low by their calculated total scores, and to send the information of the files corresponding to the top-ranked first preset number of total scores to the user terminal.
7. The device according to claim 6, characterized in that
The calculation method configured in the text relevance configuration file of each class of business generated by the configuration file generation module specifically includes: N sub-score formulas characterizing the text relevance between any file in that class of business and any search keyword, denoted f_1, f_2, …, f_N respectively, the N sub-score formulas being used to score, from N different angles, the degree of matching between the text fields of any file in that class of business and any search keyword; and a proportion q is configured for each sub-score formula, so that the text relevance score between any file in that class of business and any search keyword is calculated by the formula: [formula image missing], where N is the number of text relevance sub-score formulas, M is the number of text fields in the description data of any file in that class of business, q_j is the proportion of the j-th text relevance sub-score formula f_j in that class of business, and p_i is the weight of the i-th text field of any file in that class of business.
8. The device according to claim 6, characterized in that
The description data of each class of business received by the receiving module further includes an authority field, the authority field being a numerical value that measures the authority of any file in that class of business;
The configuration file generation module is further configured to, for each class of business, when generating the text relevance configuration file of that class of business, generate an authority configuration file for calculating authority scores according to the authority field in the description data of that class of business;
The computing module is further configured to, when calculating the text relevance score between each file and the search keyword, for each file, score the authority of the file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file;
And/or
The description data of each class of business received by the receiving module further includes a time field, the time field being a numerical value that measures the timeliness of any file in that class of business;
The configuration file generation module is further configured to, for each class of business, when generating the text relevance configuration file of that class of business, generate a timeliness configuration file for calculating timeliness scores according to the time field in the description data of that class of business;
The computing module is further configured to, when calculating the text relevance score between each file and the search keyword, for each file, score the timeliness of the file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file;
The sum of the text relevance score, the authority score and/or the timeliness score of the file is determined as the total score of the file.
9. The device according to claim 8, characterized in that
When scoring the authority of the file according to the authority configuration file of the business to which the file belongs, so as to obtain the authority score of the file, the computing module is configured to:
Calculate the authority score according to the formula y = α × h(x), where α is a constant that keeps the authority score of the file in the same order of magnitude as the text relevance score; h(·) is a positively correlated function; and x is the content of the authority field of the file;
When scoring the timeliness of the file according to the timeliness configuration file of the business to which the file belongs, so as to obtain the timeliness score of the file, the computing module is configured to:
Calculate the timeliness score according to the formula y = β × g(t), where β is a constant that keeps the timeliness score of the file in the same order of magnitude as the text relevance score; g(·) is an inversely correlated function; and t is the difference between the current time and the time field of the file.
10. The device according to claim 7, characterized in that the receiving module is further configured to:
Receive a customized text relevance sub-score formula sent by the service terminal for a specific type of business, and overwrite, with the customized text relevance sub-score formula, the text relevance sub-score formula in the text relevance configuration file of that specific type of business.
11. An Internet-based adaptive search system, characterized in that the system comprises:
A search server according to any one of claims 6 to 10;
At least one service terminal, configured to send the content of the description data of each file of each class of business;
At least one user terminal, configured to send a search request carrying a search keyword to the search server and to receive the information of the files returned by the search server.
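The scoring and ranking flow recited in claims 1 and 2 can be sketched in Python as follows. The formula itself does not survive in this text, so the weighted double sum used here (field weights p_i multiplied by the q_j-weighted sub-scores f_j) is an assumption inferred from the symbol definitions, not the claimed formula. The sub-score functions, the `RelevanceConfig` class, the dictionary layout of a file and the sample data are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

SubScore = Callable[[str, str], float]  # f_j(field_text, keyword) -> score


def contains(field_text: str, keyword: str) -> float:
    """Hypothetical f_1: 1.0 if the keyword occurs anywhere in the field."""
    return 1.0 if keyword.lower() in field_text.lower() else 0.0


def exact(field_text: str, keyword: str) -> float:
    """Hypothetical f_2: 1.0 if the field equals the keyword."""
    return 1.0 if field_text.lower() == keyword.lower() else 0.0


@dataclass
class RelevanceConfig:
    """Per-business text relevance configuration (assumed layout)."""
    field_weights: Dict[str, float]           # text field name -> p_i
    sub_scores: List[Tuple[float, SubScore]]  # (q_j, f_j) pairs


def text_relevance(cfg: RelevanceConfig, fields: Dict[str, str], keyword: str) -> float:
    """Assumed form: sum over fields i of p_i * sum over j of q_j * f_j(field_i, keyword)."""
    return sum(
        p * sum(q * f(fields.get(name, ""), keyword) for q, f in cfg.sub_scores)
        for name, p in cfg.field_weights.items()
    )


def search(files: List[dict], configs: Dict[str, RelevanceConfig],
           keyword: str, top_k: int) -> List[str]:
    """Score every file with the config of its business and return the top_k ids."""
    scored = [
        (text_relevance(configs[f["business"]], f["fields"], keyword), f["id"])
        for f in files
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # high to low
    return [file_id for _, file_id in scored[:top_k]]


configs = {
    "music": RelevanceConfig(
        field_weights={"title": 0.7, "artist": 0.3},
        sub_scores=[(0.5, contains), (0.5, exact)],
    )
}
files = [
    {"id": "a", "business": "music", "fields": {"title": "Hello", "artist": "Adele"}},
    {"id": "b", "business": "music", "fields": {"title": "Hello World", "artist": "Someone"}},
    {"id": "c", "business": "music", "fields": {"title": "Goodbye", "artist": "Other"}},
]
top = search(files, configs, "hello", top_k=2)
```

Here file "a" matches the title both by containment and exactly, file "b" only by containment, and file "c" not at all, so the top two results are "a" followed by "b".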
CN201510308566.5A 2015-06-08 2015-06-08 Self-adapted search method, equipment and system Internet-based Active CN106294487B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510308566.5A CN106294487B (en) 2015-06-08 2015-06-08 Self-adapted search method, equipment and system Internet-based

Publications (2)

Publication Number Publication Date
CN106294487A CN106294487A (en) 2017-01-04
CN106294487B true CN106294487B (en) 2019-10-08

Family

ID=57659479

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510308566.5A Active CN106294487B (en) 2015-06-08 2015-06-08 Self-adapted search method, equipment and system Internet-based

Country Status (1)

Country Link
CN (1) CN106294487B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110232138B (en) * 2019-05-20 2022-05-20 中国银行股份有限公司 Service guiding method, device and storage medium
CN113343046B (en) * 2021-05-20 2023-08-25 成都美尔贝科技股份有限公司 Intelligent search ordering system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6240408B1 (en) * 1998-06-08 2001-05-29 Kcsl, Inc. Method and system for retrieving relevant documents from a database
CN101105815A (en) * 2007-09-06 2008-01-16 腾讯科技(深圳)有限公司 Internet music file sequencing method, system and search method and search engine
CN103186574A (en) * 2011-12-29 2013-07-03 北京百度网讯科技有限公司 Method and device for generating searching result
CN104572918A (en) * 2014-12-26 2015-04-29 清华大学 Online course searching method

Also Published As

Publication number Publication date
CN106294487A (en) 2017-01-04

Similar Documents

Publication Publication Date Title
CN102073699B (en) For improving the method for Search Results, device and equipment based on user behavior
TWI636416B (en) Method and system for multi-phase ranking for content personalization
US9589050B2 (en) Semantic context based keyword search techniques
US20140108431A1 (en) Correlated information recommendation
CN102855256B (en) For determining the method, apparatus and equipment of Website Evaluation information
US20070233685A1 (en) Displaying access rights on search results pages
WO2018121700A1 (en) Method and device for recommending application information based on installed application, terminal device, and storage medium
EP2862105A1 (en) Ranking search results based on click through rates
CN113673262A (en) Machine translation between different languages using statistical streaming data
KR20180072222A (en) Apparatus for recommending a book
WO2016202214A2 (en) Method and device for displaying keyword
Li et al. Exploiting rich user information for one-class collaborative filtering
CN110766489A (en) Method for requesting content and providing content and corresponding device
CN104123321B (en) A kind of determining method and device for recommending picture
US20150169579A1 (en) Associating entities based on resource associations
Pal An efficient system using implicit feedback and lifelong learning approach to improve recommendation
WO2012115254A1 (en) Search device, search method, search program, and computer-readable memory medium for recording search program
CN106294487B (en) Self-adapted search method, equipment and system Internet-based
US10191988B2 (en) System and method for returning prioritized content
US20150134632A1 (en) Search method
CN110262906B (en) Interface label recommendation method and device, storage medium and electronic equipment
CN109948040A (en) Storage, recommended method and the system of object information, equipment and storage medium
US20170177580A1 (en) Title standardization ranking algorithm
Wang et al. A personalization-oriented academic literature recommendation method
Zhang et al. Estimating online review helpfulness with probabilistic distribution and confidence

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant