
CN105824951A - Retrieval method and retrieval device - Google Patents

Retrieval method and retrieval device

Info

Publication number
CN105824951A
Authority
CN
China
Prior art keywords
information
author
retrieval
social network
network site
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610170303.7A
Other languages
Chinese (zh)
Other versions
CN105824951B (en)
Inventor
郝运峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201610170303.7A
Publication of CN105824951A
Application granted
Publication of CN105824951B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a retrieval method and a retrieval device. A specific embodiment of the method comprises: receiving a retrieval request from a user, wherein the retrieval request comprises retrieval keywords; performing a retrieval operation on at least one predetermined social networking site according to the retrieval keywords to generate a retrieval information set; scoring each piece of retrieval information in the retrieval information set according to website information of the social networking site corresponding to the retrieval information and author information of the content of the social networking site corresponding to the retrieval information; and sorting the pieces of retrieval information according to their scores to generate a set of sorted retrieval information as a retrieval result. According to the embodiment, the retrieval result has higher pertinence.

Description

Retrieval method and device
Technical Field
The application relates to the technical field of computers, in particular to the technical field of internet, and particularly relates to a retrieval method and a retrieval device.
Background
Search engine ranking starts with a program spawned by the search engine, commonly referred to as a web crawler, that discovers new web pages on the web and crawls their documents. The web crawler starts from web pages already known in the database, and accesses these pages and fetches files just as an ordinary user's browser would. After the search terms are processed, the ranking program of the search engine starts to work: it finds all web pages containing the search terms in the index database, calculates which web pages should be ranked first according to a ranking algorithm, and then returns them to the "search" page in a certain format. In this way, the search engine can complete the search and return the results required by the user within one or two seconds.
At present, search results contain a large amount of original content from social networking sites. Existing search engine ranking algorithms mainly rank web pages containing the search terms by factors such as content relevance, website level and timeliness, without considering the author of the original content. As a result, data related to social networking sites is not fully utilized, and the retrieval results lack pertinence.
Disclosure of Invention
The present application aims to provide an improved retrieval method and apparatus to solve the technical problems mentioned in the background section above.
In a first aspect, the present application provides a retrieval method, including: receiving a retrieval request of a user, wherein the retrieval request comprises a retrieval keyword; performing a retrieval operation on at least one predetermined social networking site according to the retrieval keyword to generate a retrieval information set; scoring each piece of retrieval information in the retrieval information set according to website information of the social networking site corresponding to the retrieval information and author information of the content of the social networking site corresponding to the retrieval information; and sorting the pieces of retrieval information according to the scores, and generating a set of sorted retrieval information as a retrieval result.
In some embodiments, the website information includes a website level of the website.
In some embodiments, before scoring each piece of retrieval information in the retrieval information set according to website information of the social networking site corresponding to the retrieval information and author information of the content of the social networking site corresponding to the retrieval information, the method further includes: acquiring the website information of the social networking site corresponding to the retrieval information and the author information of the content of the social networking site corresponding to the retrieval information.
In some embodiments, acquiring the website information of the social networking site corresponding to the retrieval information and the author information of the content of the social networking site corresponding to the retrieval information includes: capturing the website information and the author information by using a web crawler.
In some embodiments, the method further comprises: receiving website information, content information and/or author information of content actively pushed by the at least one predetermined social networking site.
In some embodiments, the author information comprises at least one of: author basic information and author behavior information; wherein the author basic information includes at least one of: the name of the author, the level of the author on the corresponding social networking site, the number of followers of the author on the corresponding social networking site, and whether the author has passed official authentication of the social networking site; and the author behavior information includes at least one of: the publication time of content published by the author on the corresponding social networking site, the number of replies to that content, the number of forwards of that content, the number of clicks on that content, and the number of impressions of that content.
In a second aspect, the present application provides a retrieval apparatus, the apparatus comprising: a receiving unit configured to receive a retrieval request of a user, wherein the retrieval request comprises a retrieval keyword; a retrieval unit configured to perform a retrieval operation on at least one predetermined social networking site according to the retrieval keyword to generate a retrieval information set; a scoring unit configured to score each piece of retrieval information in the retrieval information set according to website information of the social networking site corresponding to the retrieval information and author information of the content of the social networking site corresponding to the retrieval information; and a sorting unit configured to sort the pieces of retrieval information according to the scores and generate a set of sorted retrieval information as a retrieval result.
In some embodiments, the website information includes a website level of the website.
In some embodiments, the apparatus further comprises: an acquisition unit configured to acquire the website information of the social networking site corresponding to the retrieval information and the author information of the content of the social networking site corresponding to the retrieval information.
In some embodiments, the acquisition unit is further configured to: capture the website information of the social networking site corresponding to the retrieval information and the author information of the content of the social networking site corresponding to the retrieval information by using a web crawler.
In some embodiments, the apparatus further comprises: a receiving unit configured to receive website information, content information and/or author information of content actively pushed by the at least one predetermined social networking site.
In some embodiments, the author information comprises at least one of: author basic information and author behavior information; wherein the author basic information includes at least one of: the name of the author, the level of the author on the corresponding social networking site, the number of followers of the author on the corresponding social networking site, and whether the author has passed official authentication of the social networking site; and the author behavior information includes at least one of: the publication time of content published by the author on the corresponding social networking site, the number of replies to that content, the number of forwards of that content, the number of clicks on that content, and the number of impressions of that content.
According to the retrieval method and the retrieval device provided by the present application, the predetermined social networking sites are searched using the retrieval keywords in the user's retrieval request to generate a retrieval information set. Each piece of retrieval information in the set is then scored according to the social networking site it corresponds to and the author information it corresponds to. Finally, the retrieval information is sorted according to the scores, and the sorted retrieval information set is used as the retrieval result. In this way the author information of the social networking sites is effectively utilized and the retrieval result has higher pertinence.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a retrieval method according to the present application;
FIG. 3 is a schematic diagram of an application scenario of a retrieval method according to the present application;
FIG. 4 is a flow diagram of yet another embodiment of a retrieval method according to the present application;
FIG. 5 is a schematic diagram of an embodiment of a retrieval device according to the present application;
FIG. 6 is a schematic block diagram of a computer system suitable for use in implementing a server according to embodiments of the present application.
Detailed Description
The present application will be described in further detail with reference to the following drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the retrieval method or retrieval apparatus of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired communication links, wireless communication links, or fiber optic cables.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a web browser application, a search-type application, social platform software, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting retrieval and web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a background search server with a search engine function that provides support for search requests generated on the terminal devices 101, 102, 103. The background retrieval server may analyze and perform other processing on the received data such as the retrieval request, and feed back the retrieval result (e.g., webpage data) to the terminal device.
It should be noted that the retrieval method provided in the embodiment of the present application is generally executed by the server 105, and accordingly, the retrieval device is generally disposed in the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a retrieval method according to the present application is shown. The retrieval method comprises the following steps:
Step 201, receiving a retrieval request of a user.
In this embodiment, the electronic device on which the retrieval method runs (for example, the server shown in fig. 1, especially a server having a search engine function) may receive a retrieval request from the terminal that the user uses for retrieval, through a wired or wireless connection, where the retrieval request includes a retrieval keyword. The content of the retrieval request includes, but is not limited to, text, pictures and voice. As an example, when the retrieval request of the user is a picture, the electronic device may call an OCR (optical character recognition) software interface to perform text recognition on the picture in the retrieval request, and obtain a recognition result including at least one retrieval keyword; when the retrieval request of the user is voice, the electronic device may perform text recognition on the voice in the retrieval request through an interface of voice recognition software (e.g., view), and obtain a recognition result including at least one retrieval keyword.
Step 202, performing retrieval operation on at least one predetermined social network site according to the retrieval keywords to generate a retrieval information set.
In this embodiment, the electronic device on which the retrieval method is executed may store the contents of a plurality of predetermined social networking sites in advance, and may perform a retrieval operation on the contents so as to be presented on the browser as retrieval information.
In this embodiment, the electronic device performs a search operation on at least one predetermined social network site based on a search keyword in a search request of the user received in step 201, and generates a search information set, where the search information may be web page information or a web page snapshot.
In this embodiment, the predetermined social networking site may be a website set manually, or a default website of the electronic device. The electronic device may also designate a website as a social networking site when the website meets a predetermined condition, for example, when the total posting volume of the website is greater than one million, when the total number of users of the website is greater than fifty thousand, or when the total number of visits to the website is greater than five million.
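The predetermined conditions above can be sketched as a simple predicate. This is a minimal illustration only: the `SiteStats` container and the function name are hypothetical, and treating the three example thresholds as alternatives (any one is sufficient) is an assumed reading of the text.

```python
from dataclasses import dataclass

# Example thresholds quoted in the description.
MIN_TOTAL_POSTS = 1_000_000
MIN_TOTAL_USERS = 50_000
MIN_TOTAL_VISITS = 5_000_000


@dataclass
class SiteStats:
    """Hypothetical container for the statistics the conditions refer to."""
    total_posts: int
    total_users: int
    total_visits: int


def qualifies_as_social_site(stats: SiteStats) -> bool:
    """Return True if at least one example condition is met (assumed OR)."""
    return (
        stats.total_posts > MIN_TOTAL_POSTS
        or stats.total_users > MIN_TOTAL_USERS
        or stats.total_visits > MIN_TOTAL_VISITS
    )
```

Manually configured or default sites would bypass this check; the predicate only covers the automatic case.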
In this embodiment, the electronic device may match the search keyword with the search information from the predetermined social network site one by one, and determine whether each piece of search information can be put into the search information set according to the number of keywords included in the search information of each social network site. For example, if at least one search keyword is included in the search information of a certain social network site, the search information may be put into the search information set.
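The matching step can be sketched as follows, assuming each candidate is a dictionary with a `text` field and that matching is a case-insensitive substring test. Both are assumptions for illustration; the description does not specify the matching technique, and the `min_matches` parameter is hypothetical.

```python
from typing import Dict, List


def build_retrieval_set(
    keywords: List[str],
    candidates: List[Dict[str, str]],
    min_matches: int = 1,
) -> List[Dict[str, str]]:
    """Keep the candidates whose text contains at least `min_matches` keywords."""
    selected = []
    for item in candidates:
        text = item["text"].lower()
        # Count how many distinct keywords occur in this candidate.
        matched = sum(1 for kw in keywords if kw.lower() in text)
        if matched >= min_matches:
            selected.append(item)
    return selected


# Hypothetical stored content from two predetermined sites.
candidates = [
    {"site": "blog A", "text": "A weekend in Lijiang"},
    {"site": "blog B", "text": "Notes on web crawlers"},
]
retrieval_set = build_retrieval_set(["lijiang", "gucheng"], candidates)
```

With the default `min_matches=1`, this reproduces the example in the text, where a single matching keyword is enough for inclusion.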
Step 203, for each piece of search information in the search information set, scoring the search information according to the website information of the social network site corresponding to the search information and the author information of the content of the social network site corresponding to the search information.
In this embodiment, for each piece of search information in the search information set generated in step 202, the electronic device scores the search information according to the website information of the social networking site corresponding to the search information and the author information of the content of the social networking site corresponding to the search information.
In some optional implementations of this embodiment, the website information may include a website level (PR value, PageRank value) of the website. The PR value may be the page level of a page included in the website corresponding to the website information. The PR value is a measure of the rank of a web page or website, on a scale from 0 to 10. For example, a PR value of 1 indicates that the website is of low importance, while a PR value of 7 to 10 indicates that the website is very important.
In some optional implementations of this embodiment, the author information may include at least one of: author basic information and author behavior information. The author basic information may include at least one of: the author name (author ID), the author's level on the corresponding social networking site, the author's number of followers (number of fans) on the corresponding social networking site, and whether the author has passed the website's official authentication. The author behavior information may include at least one of: the publication time of content published by the author on the corresponding social networking site, the number of replies to that content, the number of forwards of that content, the number of clicks on that content, and the number of impressions of that content.
In this embodiment, the retrieval information may be scored according to the PR value of the social networking site corresponding to the retrieval information and the number of followers of the author of the content of the social networking site corresponding to the retrieval information.
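One way to hold the author information listed above is a pair of small record types. This is only a sketch: the class and field names are hypothetical, and the choice of types (for example, a numeric timestamp for the publication time) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContentStats:
    """Author behavior information for one published item."""
    published_at: float   # publication time, e.g. a Unix timestamp (assumed)
    replies: int = 0      # number of replies
    forwards: int = 0     # number of forwards
    clicks: int = 0       # number of clicks
    impressions: int = 0  # number of times the item was displayed


@dataclass
class AuthorInfo:
    """Author basic information plus per-item behavior information."""
    name: str             # author name (author ID)
    level: int            # author's level on the site
    followers: int        # number of followers (fans)
    verified: bool        # passed the site's official authentication
    contents: List[ContentStats] = field(default_factory=list)


author = AuthorInfo(name="author_a", level=3, followers=1000, verified=True)
author.contents.append(ContentStats(published_at=1_458_000_000.0, replies=12))
```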
The score of the retrieved information may be calculated using the following formula.
$$K = R_{fans}\left(\frac{E_{fans}}{\mathrm{Max}_{fans}}\right) \cdot R_1 \cdot K_1$$
Where $K$ is the score of the retrieval information, $R_{fans}$ is the weighting coefficient of the author's number of followers in the ranking of the social networking site, $E_{fans}$ is the author's number of followers, $\mathrm{Max}_{fans}$ is the highest number of followers among the authors in the retrieval information who come from the same social networking site as the author, $R_1$ is the weighting coefficient of the author's number of followers in the ranking of the retrieval information, and $K_1$ is the PR value of that social networking site. A weighting coefficient may be a coefficient preset by the electronic device to measure the importance of a parameter. As an example, when $R_{fans}$ is 0.8, $K_1$ is 6, $E_{fans}$ is 1000, $\mathrm{Max}_{fans}$ is 10000, and $R_1$ is 2, the score of the retrieval information is 0.96.
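The follower-based formula above can be checked with a short sketch. The function and parameter names are hypothetical; the values are the ones from the worked example in the text, and the guard against a non-positive maximum is an added assumption.

```python
def score_by_followers(r_fans: float, e_fans: float, max_fans: float,
                       r1: float, k1: float) -> float:
    """K = R_fans * (E_fans / Max_fans) * R_1 * K_1."""
    if max_fans <= 0:
        # Assumption: the formula is undefined without a positive maximum.
        raise ValueError("max_fans must be positive")
    return r_fans * (e_fans / max_fans) * r1 * k1


# Worked example from the text: the result is approximately 0.96.
example_score = score_by_followers(r_fans=0.8, e_fans=1000, max_fans=10000,
                                   r1=2, k1=6)
```

Dividing by the per-site maximum normalizes follower counts, so sites with very different audience sizes can be compared on the same scale.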
In this embodiment, the retrieval information may also be scored according to the PR value of the social networking site corresponding to the retrieval information and the number of replies to the content published on that social networking site by the author of the content corresponding to the retrieval information.
The score of the retrieved information may be calculated using the following formula.
$$K = R_{reply}\sum_{i=1}^{n}\left(\frac{T_{i,old}}{T_{i,now}}\,N_{i,reply}\right) \cdot K_1 \cdot R_2$$
Where $K$ is the score of the retrieval information, $R_{reply}$ is the weighting coefficient, in the ranking of the social networking site, of the number of replies to content published by the author on the corresponding social networking site, $T_{i,old}$ is the time at which the author published the $i$-th item of content, $T_{i,now}$ is the current time, $N_{i,reply}$ is the number of replies to the $i$-th item of content, $K_1$ is the PR value of the social networking site, and $R_2$ is the weighting coefficient of the number of replies in the ranking of the retrieval information; $i$ and $n$ are natural numbers. As an example, when $R_{reply}$ is 1.2, $T_{1,old}/T_{1,now}$ is 0.999, $N_{1,reply}$ is 1000, $T_{2,old}/T_{2,now}$ is 0.998, $N_{2,reply}$ is 500, $K_1$ is 8, and $R_2$ is 0.9, the score of the retrieval information is 12942.72.
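The reply-based formula can be sketched in the same way. Each item is represented here as a `(time_ratio, reply_count)` pair, where `time_ratio` stands for the ratio of publication time to current time; this representation and the names are assumptions for illustration.

```python
from typing import Iterable, Tuple


def score_by_replies(r_reply: float,
                     items: Iterable[Tuple[float, int]],
                     k1: float, r2: float) -> float:
    """K = R_reply * sum(time_ratio_i * N_reply_i) * K_1 * R_2."""
    weighted_replies = sum(time_ratio * n_reply for time_ratio, n_reply in items)
    return r_reply * weighted_replies * k1 * r2


# Worked example from the text (n = 2): the result is approximately 12942.72.
example_score = score_by_replies(
    r_reply=1.2,
    items=[(0.999, 1000), (0.998, 500)],
    k1=8,
    r2=0.9,
)
```

Because the time ratio is closer to 1 for recently published items, newer content contributes slightly more per reply than older content.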
In this embodiment, the retrieval information may also be scored according to the PR value of the social networking site corresponding to the retrieval information together with the following information about the author of the corresponding content on that social networking site: the author's user level, the author's number of followers (number of fans), whether the author has passed the site's official authentication, and the publication time, number of replies, number of forwards, number of clicks and number of impressions of the content published by the author. In this case, the score of the retrieval information depends on the ranking score of the author's level in the social networking site, the ranking score of the author's historical liveness in the social networking site, and the ranking score of the author's historical influence in the electronic device with the search engine function.
First, the ranking score of the author level in the retrieved information in the corresponding social networking site can be calculated using the following formula.
$$K_2 = R_{grade}\left(\frac{E_{grade}}{\mathrm{Max}_{grade}}\right) + R_{fans}\left(\frac{E_{fans}}{\mathrm{Max}_{fans}}\right) + V$$
Where $K_2$ is the ranking score of the author's level in the social networking site, $R_{grade}$ is the weighting coefficient of the author's user level in the ranking of the social networking site, $E_{grade}$ is the author's user level, $\mathrm{Max}_{grade}$ is the highest user level among the authors in the retrieval information who come from the same social networking site as the author, $R_{fans}$ is the weighting coefficient of the author's number of followers in the ranking of the social networking site, $E_{fans}$ is the author's number of followers, $\mathrm{Max}_{fans}$ is the highest number of followers among the authors in the retrieval information who come from the same social networking site as the author, and $V$ is the weighting coefficient, in the ranking of the social networking site, for the author having passed the site's official authentication.
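A minimal sketch of the author-level score follows. The names and the sample values are hypothetical, and the text does not say what $V$ is for an unverified author; the caller is simply assumed to pass whatever value applies (for instance 0).

```python
def author_level_score(r_grade: float, e_grade: float, max_grade: float,
                       r_fans: float, e_fans: float, max_fans: float,
                       v: float) -> float:
    """K_2 = R_grade * (E_grade / Max_grade) + R_fans * (E_fans / Max_fans) + V."""
    if max_grade <= 0 or max_fans <= 0:
        # Assumption: both normalizing maxima must be positive.
        raise ValueError("max_grade and max_fans must be positive")
    return r_grade * (e_grade / max_grade) + r_fans * (e_fans / max_fans) + v


# Hypothetical values: 0.5 * 4/8 + 0.8 * 1000/10000 + 0.1
k2 = author_level_score(r_grade=0.5, e_grade=4, max_grade=8,
                        r_fans=0.8, e_fans=1000, max_fans=10000, v=0.1)
```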
The ranking score of the historical liveness of the author in the retrieved information in the social networking site may then be calculated using the following formula.
$$K_3 = R_{reply}\sum_{i=1}^{n}\left(\frac{T_{i,old}}{T_{i,now}}\,N_{i,reply}\right) + R_{share}\sum_{i=1}^{n}\left(\frac{T_{i,old}}{T_{i,now}}\,N_{i,share}\right)$$
Where $K_3$ is the ranking score of the author's historical liveness in the social networking site, $R_{reply}$ is the weighting coefficient, in the ranking of the social networking site, of the number of replies to content published by the author on the corresponding social networking site, $T_{i,old}$ is the time at which the author published the $i$-th item of content, $T_{i,now}$ is the current time, $N_{i,reply}$ is the number of replies to the $i$-th item of content, $R_{share}$ is the weighting coefficient, in the ranking of the social networking site, of the number of forwards of content published by the author on the corresponding social networking site, and $N_{i,share}$ is the number of forwards of the $i$-th item of content; $i$ and $n$ are natural numbers.
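The liveness score can be sketched as below, with each item given as a `(time_ratio, replies, forwards)` triple. The representation, the names and the sample values are assumptions for illustration only.

```python
from typing import Sequence, Tuple


def liveness_score(r_reply: float, r_share: float,
                   items: Sequence[Tuple[float, int, int]]) -> float:
    """K_3 = R_reply * sum(ratio_i * replies_i) + R_share * sum(ratio_i * forwards_i)."""
    reply_term = sum(ratio * replies for ratio, replies, _ in items)
    share_term = sum(ratio * forwards for ratio, _, forwards in items)
    return r_reply * reply_term + r_share * share_term


# Hypothetical values: 1.2 * (999 + 499) + 0.5 * (199.8 + 99.8)
k3 = liveness_score(r_reply=1.2, r_share=0.5,
                    items=[(0.999, 1000, 200), (0.998, 500, 100)])
```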
Next, the ranking score of the historical influence of the author in the retrieval information in the above-described electronic device having a search engine function can be calculated using the following formula.
$$K_4 = \sum_{i=1}^{n}\left(\frac{N_{i,click}}{N_{i,show}} \cdot \frac{T_{i,old}}{T_{i,now}}\right)$$
Where $K_4$ is the ranking score of the author's historical influence in the above-described electronic device with the search engine function, $N_{i,click}$ is the number of clicks on the $i$-th item of content in the electronic device, $N_{i,show}$ is the number of impressions of the $i$-th item of content in the electronic device, $T_{i,old}$ is the time at which the author published the $i$-th item of content, and $T_{i,now}$ is the current time; $i$ and $n$ are natural numbers.
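The influence score is a time-weighted sum of click-through rates. In the sketch below each item is a `(clicks, impressions, time_ratio)` triple; skipping items that were never displayed is an added assumption to avoid division by zero, and all names and values are hypothetical.

```python
from typing import Iterable, Tuple


def influence_score(items: Iterable[Tuple[int, int, float]]) -> float:
    """K_4 = sum((clicks_i / impressions_i) * time_ratio_i)."""
    total = 0.0
    for clicks, impressions, time_ratio in items:
        if impressions <= 0:
            # Assumption: an item with no impressions contributes nothing.
            continue
        total += (clicks / impressions) * time_ratio
    return total


# Hypothetical values: 0.05 * 0.999 + 0.1 * 0.998; the third item is skipped.
k4 = influence_score([(50, 1000, 0.999), (10, 100, 0.998), (5, 0, 0.9)])
```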
Finally, the score of the retrieval information may be calculated using the following formula.
$$K = R_1 \cdot K_1 \cdot K_2 + R_2 \cdot K_1 \cdot K_3 + R_3 \cdot K_4$$
Where $K$ is the score of the retrieval information, $R_1$ is the weighting coefficient, in the ranking of the retrieval information, of the ranking of the author's level in the social networking site, $R_2$ is the weighting coefficient, in the ranking of the retrieval information, of the ranking of the author's historical liveness in the social networking site, $R_3$ is the weighting coefficient, in the ranking of the retrieval information, of the ranking of the author's historical influence in the above-described electronic device with the search engine function, $K_1$ is the PR value of the social networking site, $K_2$ is the ranking score of the author's level in the social networking site, $K_3$ is the ranking score of the author's historical liveness in the social networking site, and $K_4$ is the ranking score of the author's historical influence in the above-described electronic device with the search engine function.
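The combined score can be sketched as a single function of the three component scores and the site's PR value. The function name and all numeric values here are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def combined_score(r1: float, r2: float, r3: float,
                   k1: float, k2: float, k3: float, k4: float) -> float:
    """K = R_1 * K_1 * K_2 + R_2 * K_1 * K_3 + R_3 * K_4."""
    return r1 * k1 * k2 + r2 * k1 * k3 + r3 * k4


# Hypothetical values: 1 * 6 * 0.43 + 0.001 * 6 * 1947.4 + 2 * 0.14975
final_score = combined_score(r1=1.0, r2=0.001, r3=2.0,
                             k1=6, k2=0.43, k3=1947.4, k4=0.14975)
```

Note that the PR value multiplies only the two site-side terms; the influence term measured on the search engine side is added without it.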
Step 204, sorting the pieces of retrieval information according to the scores, and generating a set of sorted retrieval information as a retrieval result.
In this embodiment, the electronic device sorts the pieces of retrieval information in descending order of score, based on the scores obtained in step 203, and uses the resulting set of sorted retrieval information as the retrieval result.
According to the method provided by this embodiment of the application, a retrieval operation is performed on at least one predetermined social networking site according to the retrieval keyword in the received retrieval request of the user. Each piece of retrieval information in the resulting retrieval information set is then scored according to the information of the social networking site it corresponds to and the author information it corresponds to, and the retrieval information is sorted according to the scores to obtain the sorted retrieval information set as the retrieval result. The method effectively utilizes the author information of the social networking sites, so that the retrieval result has higher pertinence.
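The sorting step amounts to ordering by score, highest first. A minimal sketch, assuming each result is an `(identifier, score)` pair; the identifiers and scores below are hypothetical.

```python
from typing import List, Tuple


def rank_results(scored: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Return the results in descending order of score."""
    # sorted() is stable, so results with equal scores keep their input order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rank_results([("result_b", 5.3), ("result_c", 4.7), ("result_a", 6.8)])
```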
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the retrieval method according to the present embodiment. In the application scenario of fig. 3, a user first initiates a retrieval request "lijiang gucheng" through a terminal device (client). The electronic device then searches at least one predetermined social networking site according to the retrieval keywords "lijiang gucheng", "lijiang" and "gucheng" in the retrieval request, generates retrieval information 301, retrieval information 302 and retrieval information 303, each containing at least one of "lijiang gucheng", "lijiang" or "gucheng", and puts them into a retrieval information set. Next, the electronic device scores retrieval information 301 according to the website information of blog A and the author information of author A corresponding to it, giving a score of 6.8; scores retrieval information 302 according to the website information of blog B and the author information of author B corresponding to it, giving a score of 5.3; and scores retrieval information 303 according to the website information of blog C and the author information of author C corresponding to it, giving a score of 4.7. Finally, the electronic device sorts retrieval information 301, retrieval information 302 and retrieval information 303 according to their scores, and the generated retrieval result is shown in fig. 3.
In the method provided by this embodiment of the application, the search information in the search information set is sorted according to the website information of the social network site corresponding to the search information and the author information of the content corresponding to the search information, so that the search result is more targeted.
With further reference to fig. 4, a flow 400 of yet another embodiment of the retrieval method is shown. The flow 400 of the retrieval method includes the following steps:
Step 401, receiving a retrieval request of a user.
In this embodiment, the electronic device on which the retrieval method runs (for example, the server shown in fig. 1) may receive a retrieval request of a user from the terminal that the user uses for retrieval, over a wired or wireless connection, where the retrieval request includes a retrieval keyword.
Step 402, performing a retrieval operation on at least one predetermined social network site according to the retrieval keyword to generate a retrieval information set.
In this embodiment, the electronic device performs a retrieval operation on at least one predetermined social network site based on the retrieval keyword in the retrieval request of the user received in step 401, and generates a retrieval information set. The predetermined social network site may be a website set manually, a default website of the electronic device, or a website that the electronic device sets by itself when the website meets preset conditions.
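As a rough sketch of step 402, the retrieval over predetermined sites can be pictured as a filter over already collected posts. Everything here is assumed for illustration: the in-memory `CORPUS`, the site names, the `extract_keywords` helper (which mirrors the "lijiang gucheng" example by using the full phrase plus each term), and plain substring matching in place of a real index.

```python
def extract_keywords(query):
    """Full phrase first, then each whitespace-separated term."""
    terms = query.split()
    keywords = [" ".join(terms)] if len(terms) > 1 else []
    keywords.extend(terms)
    return keywords


def retrieve(keywords, corpus, predetermined_sites):
    """Keep posts from predetermined sites that contain any keyword."""
    hits = []
    for post in corpus:
        if post["site"] not in predetermined_sites:
            continue  # site is not in the predetermined set
        text = post["text"].lower()
        if any(keyword.lower() in text for keyword in keywords):
            hits.append(post)
    return hits


# Hypothetical, already collected content.
CORPUS = [
    {"site": "blog-a.example", "author": "A", "text": "Three days in Lijiang Gucheng"},
    {"site": "blog-b.example", "author": "B", "text": "Gucheng street food"},
    {"site": "other.example", "author": "X", "text": "Lijiang hotels"},
    {"site": "blog-a.example", "author": "A", "text": "Hiking notes"},
]
PREDETERMINED_SITES = {"blog-a.example", "blog-b.example"}

keywords = extract_keywords("lijiang gucheng")
info_set = retrieve(keywords, CORPUS, PREDETERMINED_SITES)
```

The post on `other.example` matches a keyword but is excluded because that site is not among the predetermined sites.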
Step 403, acquiring website information of the social network site corresponding to the retrieval information and author information of the content of the social network site corresponding to the retrieval information.
In this embodiment, the electronic device obtains website information of a social network site corresponding to each piece of search information in the search information set generated in step 402 and author information of content of the social network site corresponding to each piece of search information.
In some optional implementations of this embodiment, the electronic device may use web crawler technology to capture the website information of the social network site corresponding to the search information and the author information of the content of the social network site corresponding to the search information. A web crawler, also referred to as a web spider, a web robot, or a web page chaser, is a program or script that automatically captures web information according to certain rules.
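The extraction part of such a crawler might look like the sketch below, which uses Python's standard `html.parser` module. The page markup, the class names `author-name` and `author-followers`, and the `AuthorInfoParser` class are assumptions made for this example; fetching pages and following links are left out.

```python
from html.parser import HTMLParser


class AuthorInfoParser(HTMLParser):
    """Collects the text of elements whose class attribute is in WANTED."""

    WANTED = {"author-name", "author-followers"}

    def __init__(self):
        super().__init__()
        self.current = None  # class of the element being read, if wanted
        self.found = {}

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.WANTED:
            self.current = css_class

    def handle_data(self, data):
        if self.current and data.strip():
            self.found[self.current] = data.strip()
            self.current = None

    def handle_endtag(self, tag):
        self.current = None


# Hypothetical fragment of a social network page.
PAGE = (
    '<div class="post">'
    '<span class="author-name">Author A</span>'
    '<span class="author-followers">1200</span>'
    "</div>"
)

parser = AuthorInfoParser()
parser.feed(PAGE)
author_info = parser.found
```

A real crawler would also need to respect each site's access rules and handle markup that differs from site to site.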
In some optional implementations of the embodiment, the electronic device may also passively receive website information, content information and/or author information of content actively pushed by at least one predetermined social networking site.
Step 404, for each piece of search information in the search information set, scoring the search information according to the website information of the social network site corresponding to the search information and the author information of the content of the social network site corresponding to the search information.
In this embodiment, the electronic device scores the search information according to the website information of the social networking site corresponding to the search information and the author information of the content of the social networking site corresponding to the search information, both acquired in step 403.
In this embodiment, the search information may be scored according to the PR value of the social networking site corresponding to the search information and the number of followers of the author of the content of the social networking site corresponding to the search information.
In this embodiment, the search information may also be scored according to the PR value of the social networking site corresponding to the search information and the number of replies to the content posted on the social networking site by the author of the content of the social networking site corresponding to the search information.
In this embodiment, the search information may also be scored according to the PR value of the social networking site corresponding to the search information together with the following information about the author of the content of the social networking site corresponding to the search information: the user level of the author on the social networking site, the number of followers (fans) of the author on the social networking site, whether the author has passed the official authentication of the social networking site, the posting time of the content posted by the author on the social networking site, and the numbers of replies, forwards, clicks and impressions of the content posted by the author on the social networking site.
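The text does not specify how these signals are combined into a score. One possible combination, given purely as an assumed example, is a weighted sum on a 0 to 10 scale; the weights, the logarithmic damping of the counts, and the function name `score_retrieval_info` are all hypothetical.

```python
import math


def damp(count):
    """Map a raw count onto 0..10 with logarithmic damping (assumed)."""
    return min(10.0, 2.0 * math.log10(1 + count))


def score_retrieval_info(pr_value, followers, replies, verified):
    """Weighted sum of site and author signals (weights are assumptions)."""
    verified_part = 10.0 if verified else 0.0
    total = (
        0.4 * pr_value            # website information: PR value, 0..10
        + 0.3 * damp(followers)   # author basic information
        + 0.2 * damp(replies)     # author behavior information
        + 0.1 * verified_part     # official authentication
    )
    return round(total, 2)


low = score_retrieval_info(pr_value=6, followers=1000, replies=10, verified=False)
high = score_retrieval_info(pr_value=6, followers=1000, replies=10, verified=True)
```

Damping the counts keeps a very large follower or reply count from dominating the site-level signal; other signals listed above, such as posting time or click counts, could be added as further weighted terms.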
Step 405, sorting the search information according to the scores, and generating a set of sorted search information as a search result.
In this embodiment, the electronic device sorts the pieces of search information in descending order of the scores obtained in step 404, and takes a set containing at least one of the sorted pieces of search information as the search result.
As can be seen from fig. 4, compared with the embodiment corresponding to fig. 2, the flow 400 of the retrieval method in this embodiment highlights the steps of acquiring the website information and the author information. Therefore, the scheme described in the embodiment can introduce more related data of the website information and the author information, so that more comprehensive retrieval information selection and more effective retrieval results are realized.
With further reference to fig. 5, as an implementation of the method shown in the above figures, the present application provides an embodiment of a retrieval apparatus, which corresponds to the embodiment of the method shown in fig. 2, and which can be applied to various electronic devices.
As shown in fig. 5, the search apparatus 500 according to the present embodiment includes: a receiving unit 501, a retrieving unit 502, a scoring unit 503 and a sorting unit 504. The receiving unit 501 is configured to receive a retrieval request of a user, where the retrieval request includes a retrieval keyword; the retrieval unit 502 is configured to perform retrieval operation on at least one predetermined social network site according to the retrieval keyword to generate a retrieval information set; the scoring unit 503 is configured to score each piece of search information in the search information set according to website information of a social network site corresponding to the search information and author information of content of the social network site corresponding to the search information; and the sorting unit 504 is configured to sort the pieces of search information according to the scores, and generate a set of sorted search information as a search result.
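The division into four units can be sketched as a small class that wires four callables together. This is an illustrative structure only: the class name `RetrievalApparatus`, the toy `INDEX`, and the toy scoring rule are assumptions, not the units' actual implementation.

```python
class RetrievalApparatus:
    """Wires together four units like those of fig. 5, each a callable."""

    def __init__(self, receiving_unit, retrieving_unit, scoring_unit, sorting_unit):
        self.receiving_unit = receiving_unit
        self.retrieving_unit = retrieving_unit
        self.scoring_unit = scoring_unit
        self.sorting_unit = sorting_unit

    def handle(self, request):
        keyword = self.receiving_unit(request)
        info_set = self.retrieving_unit(keyword)
        scored = [dict(info, score=self.scoring_unit(info)) for info in info_set]
        return self.sorting_unit(scored)


# Hypothetical stored content with site and author signals.
INDEX = [
    {"id": 1, "text": "lijiang trip", "site_pr": 5, "author_followers": 200},
    {"id": 2, "text": "lijiang food", "site_pr": 4, "author_followers": 3000},
    {"id": 3, "text": "beijing", "site_pr": 9, "author_followers": 10},
]


def receive(request):
    return request["keyword"]


def retrieve(keyword):
    return [info for info in INDEX if keyword in info["text"]]


def score(info):
    # Toy rule: site PR value plus followers in thousands.
    return info["site_pr"] + info["author_followers"] / 1000


def sort_by_score(infos):
    return sorted(infos, key=lambda info: info["score"], reverse=True)


apparatus = RetrievalApparatus(receive, retrieve, score, sort_by_score)
result_ids = [info["id"] for info in apparatus.handle({"keyword": "lijiang"})]
```

Keeping each unit as a separate callable means a different scoring rule can be swapped in without touching the retrieval or sorting logic.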
In this embodiment, the receiving unit 501 of the searching apparatus 500 may receive a searching request of a user from a terminal with which the user searches through a wired connection manner or a wireless connection manner, wherein the searching request includes a searching keyword.
In this embodiment, the retrieval apparatus 500 may store the content of a plurality of predetermined social network sites in advance, and may perform a retrieval operation on that content so that it can be presented in a browser as retrieval information. Thus, the retrieval unit 502 of the retrieval apparatus 500 can perform a retrieval operation on at least one predetermined social network site based on the retrieval keyword obtained by the receiving unit 501, and generate a retrieval information set. The predetermined social network site may be a website set manually, a default website of the electronic device, or a website that the electronic device sets by itself when the website meets preset conditions.
In this embodiment, the scoring unit 503 of the retrieval device 500 may score, for each piece of retrieval information in the retrieval information set generated in the retrieval unit 502, the piece of retrieval information according to the website information of the social networking site corresponding to the piece of retrieval information and the author information of the content of the social networking site corresponding to the piece of retrieval information.
In this embodiment, the sorting unit 504 may sort the pieces of search information in descending order of the scores obtained by the scoring unit 503, and may take a set containing at least one of the sorted pieces of search information as the search result.
In some optional implementations of the embodiment, the website information may include a website level (PR value, PageRank value) of the website. The PR value may be the page level of a page included in the website corresponding to the website information. The PR value is a measure of the rank of a web page or website, with levels ranging from 0 to 10.
In some optional implementations of the present embodiment, the above retrieval apparatus 500 further includes an obtaining unit (not shown in the figure) configured to obtain the website information of the social network site corresponding to each piece of search information in the search information set and the author information of the content of the social network site corresponding to each piece of search information.
In some optional implementations of this embodiment, the obtaining unit may use web crawler technology to capture the website information of the social network site corresponding to the retrieval information and the author information of the content of the social network site corresponding to the retrieval information. A web crawler, also referred to as a web spider, a web robot, or a web page chaser, is a program or script that automatically captures web information according to certain rules.
In some optional implementations of the present embodiment, the above retrieval apparatus 500 further includes: a receiving unit (not shown in the figure) for receiving website information, content information and/or author information of content actively pushed by at least one predetermined social networking site.
In some optional implementations of this embodiment, the author information includes at least one of: author basic information and author behavior information. The author basic information includes at least one of: the name of the author, the level of the author on the corresponding social network site, the number of followers of the author on the corresponding social network site, and whether the author has passed official authentication of the social network site. The author behavior information includes at least one of: the publishing time of content published by the author on the corresponding social network site, the number of replies to the content published by the author on the corresponding social network site, the number of forwards of that content, the number of clicks on that content, and the number of impressions of that content.
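The "at least one of" structure of the author information can be modeled with optional fields, as in the sketch below. The class and field names are hypothetical (`follower_count` stands for the author's follower, or attention, count), and the `is_valid` check is one assumed reading of the "at least one of" wording.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import Optional


@dataclass
class AuthorBasicInfo:
    name: Optional[str] = None
    level: Optional[int] = None
    follower_count: Optional[int] = None
    officially_verified: Optional[bool] = None


@dataclass
class AuthorBehaviorInfo:
    publish_time: Optional[datetime] = None
    reply_count: Optional[int] = None
    forward_count: Optional[int] = None
    click_count: Optional[int] = None
    impression_count: Optional[int] = None


def has_any_field(record):
    """True if at least one field of the record is filled in."""
    return any(getattr(record, f.name) is not None for f in fields(record))


@dataclass
class AuthorInfo:
    basic: Optional[AuthorBasicInfo] = None
    behavior: Optional[AuthorBehaviorInfo] = None

    def is_valid(self):
        """At least one part is present, and each present part is non-empty."""
        parts = [p for p in (self.basic, self.behavior) if p is not None]
        return bool(parts) and all(has_any_field(p) for p in parts)


example = AuthorInfo(
    basic=AuthorBasicInfo(name="author A", follower_count=1200),
    behavior=AuthorBehaviorInfo(reply_count=35),
)
```

Leaving every field optional lets a record carry whatever a given site exposes, while the validity check rejects records that carry nothing at all.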
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing a server according to embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU)601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM603, various programs and data necessary for the operation of the system 600 are also stored. The CPU601, ROM602, and RAM603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The computer program performs the above-described functions defined in the method of the present application when executed by a Central Processing Unit (CPU) 601.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes a receiving unit, a retrieving unit, a scoring unit, and a sorting unit. Where the names of these units do not in some cases constitute a limitation on the unit itself, for example, a receiving unit may also be described as a "unit that receives a user's retrieval request".
As another aspect, the present application also provides a non-volatile computer storage medium, which may be the non-volatile computer storage medium included in the apparatus in the above-described embodiments, or a non-volatile computer storage medium that exists separately and is not incorporated into the terminal. The non-volatile computer storage medium stores one or more programs that, when executed by a device, cause the device to: receive a retrieval request of a user, wherein the retrieval request comprises a retrieval keyword; perform a retrieval operation on at least one predetermined social network site according to the retrieval keyword to generate a retrieval information set; score each piece of retrieval information in the retrieval information set according to website information of a social network site corresponding to the retrieval information and author information of content of the social network site corresponding to the retrieval information; and sort the pieces of retrieval information according to the scores, and generate a set of sorted retrieval information as a retrieval result.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by a person skilled in the art that the scope of the invention as referred to in the present application is not limited to the embodiments with a specific combination of the above-mentioned features, but also covers other embodiments with any combination of the above-mentioned features or their equivalents without departing from the inventive concept. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.

Claims (12)

1. A retrieval method, the method comprising:
receiving a retrieval request of a user, wherein the retrieval request comprises a retrieval keyword;
performing a retrieval operation on at least one predetermined social network site according to the retrieval keyword to generate a retrieval information set;
scoring each piece of retrieval information in the retrieval information set according to website information of a social network site corresponding to the retrieval information and author information of content of the social network site corresponding to the retrieval information;
and sorting the pieces of search information according to the scores, and generating a set of sorted search information as a search result.
2. The method of claim 1, wherein the website information comprises a website level of the website.
3. The method according to any one of claims 1-2, wherein before scoring each piece of search information in the set of search information, the method further comprises:
and acquiring website information of the social network site corresponding to the retrieval information and author information of the content of the social network site corresponding to the retrieval information.
4. The method of claim 3, wherein the obtaining of the website information of the social networking site corresponding to the retrieval information and the author information of the content of the social networking site corresponding to the retrieval information comprises:
and capturing website information of the social network site corresponding to the retrieval information and author information of the content of the social network site corresponding to the retrieval information by using a web crawler technology.
5. The method of claim 3, further comprising: receiving website information, content information and/or author information of content actively pushed by the at least one predetermined social networking site.
6. The method of claim 5, wherein the author information comprises at least one of: author basic information and author behavior information; wherein the author basic information includes at least one of: the name of the author, the level of the author on the corresponding social network site, the number of followers of the author on the corresponding social network site, and whether the author has passed official authentication of the social network site; the author behavior information includes at least one of: the publishing time of content published by the author on the corresponding social network site, the number of replies to the content published by the author on the corresponding social network site, the number of forwards of the content published by the author on the corresponding social network site, the number of clicks on the content published by the author on the corresponding social network site, and the number of impressions of the content published by the author on the corresponding social network site.
7. A retrieval apparatus, characterized in that the apparatus comprises:
a receiving unit configured to receive a retrieval request of a user, wherein the retrieval request comprises a retrieval keyword;
a retrieval unit configured to perform a retrieval operation on at least one predetermined social network site according to the retrieval keyword to generate a retrieval information set;
a scoring unit configured to score each piece of retrieval information in the retrieval information set according to website information of a social network site corresponding to the retrieval information and author information of content of the social network site corresponding to the retrieval information;
and a sorting unit configured to sort the retrieval information according to the scores and generate a set of sorted retrieval information as a retrieval result.
8. The apparatus of claim 7, wherein the website information comprises a website level of the website.
9. The apparatus according to one of claims 7-8, wherein the apparatus further comprises:
and the acquisition unit is configured to acquire the website information of the social network site corresponding to the retrieval information and the author information of the content of the social network site corresponding to the retrieval information.
10. The apparatus of claim 9, wherein the obtaining unit is further configured to:
and capturing website information of the social network site corresponding to the retrieval information and author information of the content of the social network site corresponding to the retrieval information by using a web crawler technology.
11. The apparatus of claim 9, further comprising:
the receiving unit is configured to receive website information, content information and/or author information of content actively pushed by the at least one predetermined social networking site.
12. The apparatus of claim 11, wherein the author information comprises at least one of: author basic information and author behavior information; wherein the author basic information includes at least one of: the name of the author, the level of the author on the corresponding social network site, the number of followers of the author on the corresponding social network site, and whether the author has passed official authentication of the social network site; the author behavior information includes at least one of: the publishing time of content published by the author on the corresponding social network site, the number of replies to the content published by the author on the corresponding social network site, the number of forwards of the content published by the author on the corresponding social network site, the number of clicks on the content published by the author on the corresponding social network site, and the number of impressions of the content published by the author on the corresponding social network site.
CN201610170303.7A 2016-03-23 2016-03-23 Search method and device Active CN105824951B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610170303.7A CN105824951B (en) 2016-03-23 2016-03-23 Search method and device

Publications (2)

Publication Number Publication Date
CN105824951A true CN105824951A (en) 2016-08-03
CN105824951B CN105824951B (en) 2019-10-11

Family

ID=56524074

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610170303.7A Active CN105824951B (en) 2016-03-23 2016-03-23 Search method and device

Country Status (1)

Country Link
CN (1) CN105824951B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant