TWI501096B - Ranking user generated web content - Google Patents

Info

Publication number
TWI501096B
Authority
TW
Taiwan
Prior art keywords
user
content
generated
users
score
Prior art date
Application number
TW099137022A
Other languages
Chinese (zh)
Other versions
TW201140346A (en
Inventor
Xiance Si
Jian Gong Deng
Huacheng Ke
Dong Zhang
Zoltan I Gyongyi
Edward Y Chang
Original Assignee
Google Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google Inc
Publication of TW201140346A
Application granted
Publication of TWI501096B

Landscapes

  • Information Transfer Between Computers (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Description

Ranking user generated web content

This specification generally describes techniques for analyzing web content, including user-generated web content.

A website or electronic community (e.g., a collection of websites hosted by the same entity) can host one or more types of electronic content created and/or uploaded by users of the website. For example, forums, electronic photo albums, and video-sharing websites provide users with the ability to post or upload user-generated content to share with other users. Some hosting websites require each user to log in using identifying information before contributing content. In this way, content can be positively associated with a registered user. In some cases, users can interact with each other. For example, within a forum, a first user can post a question or comment, and other users can respond to the posting made by the first user.

This specification describes techniques for weighting interactions between users in an electronic community and generating user credential scores based on the interactions between users. In general, user-generated content items (e.g., a comment on a blog or an answer posted on a question-and-answer website) can be analyzed to assign quality factors. User-generated content items can additionally be analyzed to assess the quality of the input and to identify individual interactions between users (e.g., one or more users uploading responses to a question posted by a first user). The interactions can be represented in a social or user activity graph that has weights, based on the assigned quality factors, assigned to directed links between pairs of users. These weights and the corresponding interactions can be used to generate user credential scores.

In general, one aspect of the subject matter described in this specification can be embodied in methods that include the actions of identifying, by operation of a computer, multiple interactions between users via an electronic network, where each interaction is between a pair of users, and assigning to each interaction a weighting factor that represents the quality of the interaction. A user credential score for each of multiple users is generated using one or more processors. The user credential scores are based on the weighting factors for each of the multiple interactions. The user credential scores are stored on a computer-readable storage device in association with a user identifier. Other embodiments of this aspect include corresponding systems, apparatus, and computer programs, encoded on computer-readable storage devices, that are configured to perform the actions of the methods.

These and other embodiments can each optionally include one or more of the following features, alone or in combination. A search query is received, user-generated content entries or items responsive to the search query are identified using a processor, and the identified user-generated content items are ranked based at least in part on the user credential scores associated with the items. Identifying user-generated content items responsive to the search query can involve assigning a measure of content relevance to each item, and ranking the user-generated content items based on the user credential scores can involve combining the user credential scores associated with the items with the measures of content relevance for the items. The user credential scores and the measures of content relevance can be normalized, in which case the combining involves combining the normalized user credential scores with the normalized measures of content relevance.
A processor can be used to generate a user activity graph that identifies links between users based on interactions between the users, to determine an authority score for each user, and to determine a contribution score for each user. The authority score of a particular user can be based on the contribution scores of the users to which the particular user is linked in the user activity graph, and the contribution score can be based on the authority scores of the users to which the particular user is linked in the user activity graph. The authority scores and the contribution scores are generated using an iterative update procedure that runs until it reaches a predetermined convergence threshold. The user interactions can correspond to user-generated content on at least one of a question-and-answer website, a bulletin board website, a blog, or a social networking website. The weighting factor can include a combination of multiple quality factors, and the quality factors can include: a relevance of a content item of one user to an associated prior content item of another user, an originality of a content item relative to other content items, a coverage of a content item corresponding to a measure of the non-generic terms in the content item, a richness of the content item, or a timeliness of the content item. Users can be rewarded based on the user credential scores.
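
The combination of normalized credential scores and normalized relevance measures described above can be sketched as follows. This is only an illustration under stated assumptions: the specification does not fix a normalization method or a combining function, so the min-max scaling, the linear blend, the `alpha` parameter, and the names `Item`, `min_max`, and `rank_items` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    author: str
    relevance: float  # content-relevance measure assigned by the search step


def min_max(values):
    """Scale numbers into [0, 1]; a constant list maps to all zeros (assumed convention)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_items(items, credential, alpha=0.5):
    """Order item ids by alpha * relevance + (1 - alpha) * author credential.

    Both inputs are normalized first. `credential` maps a user id to a
    credential score; unknown authors default to 0.0 (an assumption).
    """
    if not items:
        return []
    rel = min_max([it.relevance for it in items])
    cred = min_max([credential.get(it.author, 0.0) for it in items])
    combined = [alpha * r + (1 - alpha) * c for r, c in zip(rel, cred)]
    order = sorted(range(len(items)), key=lambda i: combined[i], reverse=True)
    return [items[i].item_id for i in order]
```

With `alpha=1.0` the ranking falls back to relevance alone, which makes it easy to compare the ordering with and without the credential signal.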

In general, another aspect of the subject matter described in this specification can be embodied in methods that include the actions of receiving and publishing user-generated content for access across a network, storing the user-generated content, and identifying interactions between pairs of users relating to the stored user-generated content. A weighting factor is generated for each interaction based on an objective measure of the quality of the interaction, a user credential score is generated for each user based on the identified interactions and the weighting factors for the interactions, and users or user-generated content are ranked based on the user credential scores.

These and other embodiments can each optionally include one or more of the following features. The user-generated content can be received and published using one or more servers, the user-generated content can be stored in one or more storage devices, and one or more processors can be used to identify the interactions, generate the weighting factors, generate the user credential scores, and rank the users or user-generated content. The user credential score for each user is generated by iteratively updating an authority score and an associated contribution score based on the identified interactions and the weighting factors. A search query can be received, and multiple user-generated content items responsive to the search query can be identified. The identified user-generated content items are ranked based at least in part on the user credential score of the user associated with each user-generated content item, and a set of search results is generated based on the ranking of the content items. The identified user-generated content items can be ranked based on a weighted combination of a measure of relevance associated with each user-generated content item and the user credential score of the user associated with each user-generated content item.
The objective measure of the quality of the interaction is derived from a combination of factors representing: a relevance of a content item of one user to an associated prior content item of another user, an originality of a content item relative to other content items, and a coverage of non-generic terms in the content item. A particular one of the interactions can include an electronic response by a first user to electronic information posted by a second user, and the weighting factor for the particular interaction can relate to: a relevance of the electronic response of the first user to the electronic information posted by the second user, a coverage of relatively non-generic information in the electronic response, or a relative originality of the electronic response.
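
As a rough illustration of deriving a weighting factor from relevance, originality, and coverage, the sketch below uses simple token-overlap measures. The specification does not say how each factor is computed, so the Jaccard similarity, the small `GENERIC_TERMS` list, the equal default weights, and every function name here are assumptions chosen only to keep the example short.

```python
# Hypothetical list of generic terms; a real system would use a much larger one.
GENERIC_TERMS = {"the", "a", "an", "is", "are", "in", "of", "to", "and", "it", "i", "for", "at"}


def tokens(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [t.strip(".,!?").lower() for t in text.split() if t.strip(".,!?")]


def jaccard(a, b):
    """Jaccard similarity of two token lists, 0.0 when both are empty."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def weighting_factor(response, prompt, earlier_responses, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Combine relevance, originality, and coverage into one value in [0, 1]."""
    r = tokens(response)
    if not r:
        return 0.0  # an empty response carries no weight (assumed convention)
    relevance = jaccard(r, tokens(prompt))
    # Originality: 1 minus the closest match among earlier responses.
    originality = 1.0 - max((jaccard(r, tokens(e)) for e in earlier_responses), default=0.0)
    # Coverage: share of the response made up of non-generic terms.
    coverage = sum(1 for t in r if t not in GENERIC_TERMS) / len(r)
    w_rel, w_orig, w_cov = weights
    return w_rel * relevance + w_orig * originality + w_cov * coverage
```

A response that repeats an earlier answer word for word loses the whole originality term, so it receives a lower weight than the same text posted first.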

Particular embodiments of the subject matter described in this specification can be implemented to realize one or more of the following advantages. The quality of postings or other user-generated content can be assessed and used to generate search results, to reward users for high-quality input, or to restrict access based on low-quality contributions. Interactions or relationships between users can be evaluated to identify relatively authoritative or reliable contributors. A search engine can use the user credential scores to generate search results. For example, search results can be personalized by using the strength of the relationship between user A and user B, together with the score of user B, to adjust the relative position at which content created by or related to user B (e.g., commented on by user B or replied to by user B) appears in the results of a search performed by user A. With this personalization capability, the adjustment for a different user C may differ according to the strength of the relationship between user B and user C.
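
The personalization idea can be illustrated with a small sketch. The multiplicative boost, the `boost` parameter, and the data shapes (a `strength` dictionary keyed by a pair of user ids and a `credential` dictionary keyed by user id) are assumptions for this example; the specification only says that relationship strength and the author's score are used together to adjust result position.

```python
def personalized_score(base_score, searcher, author, credential, strength, boost=1.0):
    """Raise a result's score in proportion to relationship strength times author credential."""
    s = strength.get((searcher, author), 0.0)
    return base_score * (1.0 + boost * s * credential.get(author, 0.0))


def personalize(results, searcher, credential, strength):
    """Reorder results for one searcher.

    `results` is a list of (item_id, author, base_score) tuples.
    Returns item ids, best first.
    """
    rescored = [
        (item_id, personalized_score(score, searcher, author, credential, strength))
        for item_id, author, score in results
    ]
    return [item_id for item_id, _ in sorted(rescored, key=lambda p: p[1], reverse=True)]
```

The same result list can come back in a different order for user A and user C, because only A has a recorded relationship with the author B.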

The details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the subject matter will become apparent from the description and drawings, and from the claims.

Much of the media content available electronically today is user generated, that is, generated by people who are not employed by the website hosting the content. The term "media content" (or simply "content") can refer to a single electronic document, a collection of related electronic documents that can include audio, visual, and/or textual components, or a portion of an electronic document. In some examples, text, photos, or videos can be added to a hosting website by a registered user to share with other (registered or unregistered) users. The quality and subject-matter relevance of the media content supplied by users can vary.

Typically, websites hosting user-generated content provide a framework for users to interact with each other regarding the user-generated content to produce additional user-generated content. For example, a first user can post a question to a forum, and a second user can post a response to the question. This exchange of information between the two users can be described as an interaction. Other exemplary user activities that can contribute to one or more interactions include submitting a question or answer on a question-and-answer (Q&A) website, posting a topic or discussion on a blog or online bulletin board, submitting a rating of content available on a website, or viewing new content provided by a first user operating a first user device (e.g., a second user viewing the new content on a second computing device). Systems and methods for ranking user-generated content can provide quality weighting of interactions between users in an electronic community and generate user credential scores based on the interactions between users. Each user and each user-generated content item can be associated with a relative quality value based on the interactions between users.
For example, an authority score can be assigned to a user based on an analysis of the quality of responses to user-generated content and the contribution scores of the users who posted that content, and a contribution score can be assigned to a user based on an analysis of the quality of the user-generated content posted or uploaded and the authority scores of the users who responded to that content. In addition, user interactions can be treated as votes of confidence, so that many interactions (particularly interactions involving authoritative and/or contributing users) tend to increase a user's credential score. The interactions can be represented in a social or user activity graph that has weights assigned to directed links between pairs of users (e.g., a link between a user who posted a question and a user who answered the question). These weights and the corresponding interactions can be used to generate user credential scores.

The described techniques can provide one or more benefits, such as ranking user credibility to improve the overall community within an electronic environment. For example, users who abuse the privileges of the electronic environment (e.g., by spamming or by making harassing comments to others) can be discovered automatically, while users who frequently contribute to high-quality interactions can be promoted or rewarded. The techniques can also be used to improve the ranking of search results that include user-generated content by providing quality measures (e.g., topical relevance, contribution, descriptiveness, or reputation of the content source) associated with the user-generated content. Pre-existing user relationships (e.g., based on the number and/or quality of prior interactions) can also be used to personalize search results for a particular user, for example, by increasing the relevance scores of search results associated with users who have previously interacted with the particular user conducting the search (e.g., based on a combination of the user credential scores of those other users and the strength of the relationship between the searching user and those other users). In some implementations, user credential scores (and any rankings based on those scores) can differ for different categories or tags associated with the user-generated content.
For example, user credential scores can be calculated independently for user-generated content related to gardening and user-generated content related to dining. Thus, a particular user who participates in forums related to gardening and dining can have a different user credential score associated with each category.
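
One way to keep credential scores separate per category is to partition the interactions by category before scoring. The sketch below shows only that partitioning step. The tuple layout and the function names are assumptions, and `summed_response_weight` is a deliberately trivial placeholder scorer, not the scoring method of this specification.

```python
from collections import defaultdict


def partition_by_category(interactions):
    """Group (responder, poster, weight, category) tuples by category."""
    by_category = defaultdict(list)
    for responder, poster, weight, category in interactions:
        by_category[category].append((responder, poster, weight))
    return dict(by_category)


def summed_response_weight(edges):
    """Placeholder scorer: total weight of the responses each user contributed."""
    scores = defaultdict(float)
    for responder, _poster, weight in edges:
        scores[responder] += weight
    return dict(scores)


def per_category_scores(interactions, score_fn=summed_response_weight):
    """Apply a scoring function independently to each category's interactions."""
    return {
        category: score_fn(edges)
        for category, edges in partition_by_category(interactions).items()
    }
```

Any scorer that accepts a list of `(responder, poster, weight)` edges can be passed as `score_fn`, so the same user ends up with one score per category.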

FIG. 1 is a block diagram of a system 100 for ranking search results from a search of user-generated content. Within the system 100, one or more users operating user devices 104 can connect via a network 102 to a web server 106 and a search server 108 to search for, upload, and retrieve electronic content. The web server 106 and the search server 108 are each connected, directly or via the network 102, to an interaction processing server 110. The interaction processing server 110 can cooperate with the web server 106 to analyze user-generated electronic content and derive quality measures associated with the user-generated electronic content. The search server 108 can additionally cooperate with the interaction processing server 110 to enhance the ranking of search results for user-generated content based on the derived quality measures.

A computer network 102 (e.g., a local area network (LAN), a wide area network (WAN), the Internet, or a combination thereof) connects the web server 106, the search server 108, and user devices 104a, 104b, and 104c. Example user devices 104 include personal computers, mobile communication devices, smartphones, personal digital assistants (PDAs), and television set-top boxes. Although only one search server 108 and one web server 106 are shown, the system 100 can include any number of web servers, search servers, and user devices.

The web server 106 can be a general-purpose content server that receives requests for content and retrieves the requested content in response to the requests. In some examples, the web server 106 can be associated with a new content provider, a retailer, an independent blog, a social networking website, or any other entity that provides and/or receives content via the network 102. The web server 106 includes a repository 112 of user-generated content and a user data repository 126 (e.g., each repository includes one or more electronic storage devices included within or coupled to the web server 106). In some implementations, the user-generated content can include the content items themselves (e.g., one or more files uploaded by a particular user) as well as metadata associated with individual user-generated content (e.g., a user identification, references to associated user-generated content, a category of the user-generated content, the date on which the user-generated content was uploaded to the web server, a count of the number of times the user content item has been requested or viewed, and the like).

When a user provides user-generated content 112 to the web server 106, a user identifier (user ID) can be established, or the user can log in to the web server 106 using a user ID and password. Each user can provide additional profile information (e.g., gender, age, location, email address, etc.). Data associated with each user can be stored in the user data repository 126. As one example, each user record can include references to the user content items uploaded by the user to the web server 106 and stored in the repository 112 of user-generated content. User-generated content stored in the repository 112 can be provided by the web server 106 in response to requests from the user devices 104 via the network 102.

Content available from the web server 106 can be searched using the search server 108. The search server 108 includes a search engine 114 and a search index repository 116. To facilitate generating search results in response to user queries, the search engine 114 can generate or access an index of content provided by publishers such as the web server 106 (e.g., an index of web pages) for later searching and identification of content items responsive to queries. A user operating a user device (e.g., user device 104a) can submit a search query to the search server 108. The search engine 114 can search the search index repository 116 for web content responsive to the search query. The search server 108 can then return search results to the user in response to the search query. The search results can include, for example, web content references that link to web pages available via the web server 106 (e.g., lists of web page titles, snippets of text extracted from those web pages, or hypertext links to those web pages).

The search results can be presented to the user in groups of a predetermined number (e.g., ten) of search results. The search results are ranked by a results ranking module 118 based in part on scores related to the content items identified by the search results (e.g., information retrieval ("IR") scores) and, optionally, on a separate ranking of each document based on document relevance scores provided by a relevance scoring module 120.

The search server 108 can communicate with the interaction processing server 110 to obtain data that can be applied to the relevance scoring of user-generated content items. The interaction processing server 110 includes a content analysis module 122. The content analyzer 122 can analyze the content and interactions represented within user-generated content (e.g., user-generated content stored in the repository 112 of user-generated content) and generate scores related to the relevance and quality of individual content items (interaction data) and to the authority or contribution of each user who provides user-generated content (user data). The interaction processing server 110 can store data associated with each user (e.g., in a credential data repository 138) and data associated with the content and interactions represented by the user-generated content (e.g., in an interaction data repository 124).

In operation, the interaction processing server 110 retrieves the user-generated content collected by the web server 106. Although the web server 106 is shown as having a connection to the interaction processing server 110, in some implementations the interaction processing server 110 may be able to access the repository 112 of user-generated content directly. In other implementations, the interaction processing server 110 stores the user-generated content in a temporary storage location or in the interaction data repository 124. The interaction processing server 110 additionally receives, or has access to, a portion of the user data records (e.g., user identifications and associated uploaded content), which can be included in the credential data repository 138.

An article analyzer 128 can analyze each user-generated content item or article to generate a quality value. In some examples, in the setting of an online forum, a question is analyzed to determine its relevance to the forum topic, the appropriateness of its language (e.g., no profanity), and/or its originality relative to previously posted questions. In some examples, a response to a question within the forum is analyzed to determine its relevance to the question, the specificity of the response, its originality relative to previously posted responses, or its timeliness relative to the timestamp of the original posting of the question. The article analyzer 128 can assign a quality score to each content item.

A social graphing module 132 within the content analyzer generates a social or user activity graph representing the interactions between different users, based on associations between two users regarding user-generated content items (e.g., a comment posted by one user on a blog entry posted by another user). The user activity graph can be constructed from nodes representing individual users and directed and/or undirected links representing interactions between pairs of users. For example, a first user can upload a question about the best pizza available in Atlanta, Georgia, and a second user can then upload a response about pizza restaurants in Atlanta. Each user would be represented by a node in the user activity graph, and the interaction would be represented by a directed link between the two nodes that represents the second user's posting of a response to the first user's question. In some implementations, the user activity graph also identifies relationships between users based on other types of connections between users. For example, user A can be a friend of user B, or user C can be interested in any content posted by user D. These types of relationships can be explicit (user A declares that user B is a friend) or implicit (it may be inferred that user A is a friend of user B based on a history of many prior interactions, content analysis, or the like). Interactions reflected in the social or user activity graph can also be based on such relationships.
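
A user activity graph of this kind can be sketched with a small in-memory structure. The class name, method names, and item identifiers below are hypothetical, and the link direction (from the responding user to the user who posted the original item) is an assumption, since the text does not state which way the directed link points.

```python
from collections import defaultdict


class UserActivityGraph:
    """Nodes are user ids; directed links run responder -> poster (assumed direction)."""

    def __init__(self):
        self.nodes = set()
        # (source, target) -> list of content item ids that created the link
        self.links = defaultdict(list)

    def add_interaction(self, responder, poster, item_id):
        """Record that `responder` posted `item_id` in response to content by `poster`."""
        self.nodes.update((responder, poster))
        self.links[(responder, poster)].append(item_id)

    def out_links(self, user):
        """Users that `user` has responded to, in sorted order."""
        return sorted(target for (source, target) in self.links if source == user)


# The pizza example: user2 answers user1's question.
graph = UserActivityGraph()
graph.add_interaction("user2", "user1", "answer-17")
```

Keeping the list of item ids on each link leaves room to attach a per-interaction weight later without changing the graph's shape.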

A quality weighting module 130 can assign a weight to each link within the user activity graph generated by the social graphing module 132. The weights can represent, for example, the quality of one or both of the user content items involved in the interaction (e.g., the quality of a question relative to the forum, or the quality of an answer relative to the question).

A credential scoring module 134 can analyze the user activity graph generated by the social graphing module 132 to assign credential scores to the users associated with the user-generated content items. A user credential score can be used to represent the reputation or trustworthiness of a particular user. The user credential score can be based in part on the quality of the content items provided by the user (e.g., as determined by the article analyzer 128) and on the quality of the interactions in which the user has been involved. For example, when a user responds to a high-quality question with a high-quality answer, the interaction may positively affect the user's credential score. Alternatively, if a user responds to a question with a low-quality answer, the interaction may negatively affect the user's credential score. If a user responds to a question posted by a user with a high credential score, the interaction may have a more positive effect on the user's credential score than if the user had responded to a question posted by a user with a low credential score. In another example, if a question posted by a user receives high-quality responses from users with high credential scores, the interaction can have a positive effect on the credential score of the user who posted the question. In some implementations, the user credential scores can be calculated in part by using a Hypertext Induced Topic Selection (HITS) algorithm that takes weighted links into account.
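
A weighted, HITS-style mutual update of authority and contribution scores might look like the sketch below. It is an illustration rather than the method of this specification: the edge direction (responder to poster), the L2 normalization after each pass, the `tol` convergence threshold, and the `max_iter` cap are all assumptions, and the sketch does not show how the two scores would be merged into a single credential score.

```python
import math


def weighted_hits(edges, tol=1e-9, max_iter=1000):
    """Iterate authority and contribution scores over weighted directed links.

    `edges` is a list of (responder, poster, weight) with weight >= 0.
    A responder's authority grows with the contribution scores of the users it
    responded to; a poster's contribution grows with the authority of its responders.
    """
    users = sorted({u for r, p, _ in edges for u in (r, p)})
    if not users:
        return {}, {}
    authority = {u: 1.0 for u in users}
    contribution = {u: 1.0 for u in users}

    def normalized(scores):
        norm = math.sqrt(sum(v * v for v in scores.values()))
        return {u: (v / norm if norm else 0.0) for u, v in scores.items()}

    for _ in range(max_iter):
        new_auth = {u: 0.0 for u in users}
        for r, p, w in edges:
            new_auth[r] += w * contribution[p]
        new_auth = normalized(new_auth)

        new_contrib = {u: 0.0 for u in users}
        for r, p, w in edges:
            new_contrib[p] += w * new_auth[r]
        new_contrib = normalized(new_contrib)

        # Largest change in any score since the previous pass.
        delta = max(
            max(abs(new_auth[u] - authority[u]) for u in users),
            max(abs(new_contrib[u] - contribution[u]) for u in users),
        )
        authority, contribution = new_auth, new_contrib
        if delta < tol:  # predetermined convergence threshold reached
            break
    return authority, contribution
```

In this sketch a heavily weighted response to a strong contributor raises the responder's authority more than a lightly weighted one, which mirrors the qualitative behavior described above.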

Based on the user credential scores, the user ranking module 136 can rank the users who have contributed user-generated content to the web server 106. In some implementations, the user ranking is used to promote or reward users who contribute a large number of high-quality content items. For example, the highest-ranked users can be awarded gift certificates, prizes, or other incentives. In another example, the top one or more users can be reported to the web server 106, where an indication of user status (e.g., "top 10", "top 100", "star contributor") can be embedded in web pages that include user-generated content created by those users. In other implementations, the user ranking is used to help identify poor-quality users. For example, the lowest-ranked users can be further evaluated as potential spammers.

As users submit user-generated content to the web server 106 (e.g., item by item or in batches), the content can be provided to the interaction processing server 110, where the article analyzer 128 can analyze and score it based on the quality of the contribution. The credential scoring module 134 can update user credential scores continuously or periodically, based on the interactions added to the user activity graph generated by the social mapping module 132 and on the associated interaction weights calculated by the quality weighting module 130. The interaction data generated by the article analyzer 128, the quality weighting module 130, and the social mapping module 132 can be stored in the interaction data repository 124. The user credential scores generated by the credential scoring module 134 can be stored in the credential data repository 138.

When the search server 108 receives a search query whose responsive web page results include user-generated content, the results ranking module 118 can provide the interaction processing server 110 with an identification of each user-generated content item included in the query results list. The interaction processing server 110 can retrieve the user credential scores associated with the identified user-generated content and provide those scores to the search server 108. The results ranking module 118 can normalize the user credential scores and, optionally, the relevance scores calculated by the relevance scoring module 120. The results ranking module 118 can then combine the user credential scores with the relevance scores to produce a ranked list of search results. In some implementations, an information retrieval (IR) score is calculated from the inner product of the feature vectors corresponding to the query and to the content item, and the search results are ranked by a relevance score that combines the IR score with the user credential score. The search server 108 can then return the ranked list of search results to the user who initiated the query. The user can select a result from the list to retrieve user-generated content or other content from the web server 106.
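
The score combination described above can be sketched as follows. This is a minimal illustration rather than the described system's implementation: the function names, the min-max normalization, and the mixing parameter `alpha` are all assumptions, since the description specifies only that the normalized scores are combined in some way.

```python
from typing import Dict, List, Sequence


def inner_product(query_vec: Sequence[float], item_vec: Sequence[float]) -> float:
    """IR score as the inner product of query and content-item feature vectors."""
    return sum(q * d for q, d in zip(query_vec, item_vec))


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map each to 0.0.

    Min-max scaling is an assumed normalization, not one named in the text.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def rank_results(
    ir_scores: Dict[str, float],
    credential_scores: Dict[str, float],
    alpha: float = 0.7,  # hypothetical weight given to the IR score
) -> List[str]:
    """Return item ids ordered by a linear mix of the two normalized scores."""
    ir = min_max_normalize(ir_scores)
    cred = min_max_normalize(credential_scores)
    combined = {key: alpha * ir[key] + (1.0 - alpha) * cred[key] for key in ir}
    return sorted(combined, key=combined.get, reverse=True)
```

With a linear mix, two items that match the query equally well are ordered by the credential score of their authors, which is the effect the paragraph describes.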

Although the web server 106, the interaction processing server 110, and the search server 108 are each depicted as an individual machine, in some implementations one or more of the servers 106, 108, and 110 are combined within a single server. Similarly, in some implementations, the user data repository 126 and the credential data repository 138 are combined within the same records, which can reside within the same storage device. In some implementations, the functionality of the article analyzer 128, the quality weighting module 130, the social mapping module 132, the credential scoring module 134, or the user ranking module 136 is combined within a single software application. Other implementations are possible.

FIG. 2 is an example screenshot of a web page 200 of a question-and-answer electronic discussion board related to scuba travel. Users can access or register with the electronic discussion board to post questions related to scuba travel or to respond to questions posted by other users. The web page 200 includes a series of questions 202, each attributed to a particular user identifier 204. A last post column 206 lists which user 204 most recently responded to the associated question 202. A total views column 208 lists the number of users who have viewed the Q&A exchange associated with each question 202. A total responses column 210 lists the number of responses provided to each question 202. A time column 212 provides a timestamp of the most recent response to each question 202. A search box 214 gives users interacting with the web page 200 the opportunity to search the electronic discussion board. In some implementations, the web page 200 depicts one particular forum of a larger electronic community. For example, an electronic discussion website can cover many scuba diving topics (e.g., underwater photography, gear recommendations, safety tips), with a separate forum dedicated to each topic.

For example, by selecting a question 202, a user can be presented with another web page that includes the full text of the question (if it was truncated within the web page 200 because of its length) along with a series of answers related to the question. In some implementations, the user can also select a user identifier (e.g., user identifier techdiver 204a) to receive additional information about that user. In some examples, selecting the user identifier techdiver 204a launches an additional web page that displays profile information about the user techdiver 204a (e.g., location, gender, age, diving interests, date of registration with the electronic discussion board, timestamp of the most recent visit to the discussion board, the user's credential score or rank) and/or a list of each post (question or answer) contributed by the user techdiver 204a. In some implementations, one user 204 viewing information about another user 204 can constitute an interaction that affects user credential scores.

Each question-and-answer pair can be regarded as an interaction between the users who authored the user-generated content. For example, according to the total views column 208, the question 202b contributed by user headed4mex 204b has been viewed 46 times, and according to the total responses column 210 it has received a total of 11 responses. Each of the 11 responses (and, in some cases, each of the 46 views) establishes an interaction between the author of the response and the user headed4mex 204b. A user can be involved in more than one interaction. For example, user maddog 204e submitted both the fifth question 202e and the eighth question 202h. In another example, user cheapdives 204c contributed the third question 202c and also contributed the most recent response to the tenth question 202j.

FIGS. 3A and 3B are diagrams of an example user activity graph 300 depicting exemplary interactions involving the users 204 registered with the electronic discussion board described with respect to FIG. 2. The user activity graph 300 can be generated, for example, by the social mapping module 132 described with reference to FIG. 1. A user activity graph (also referred to as a social graph) represents entities and the interactions (connections) between entities. In this example, users are represented as nodes in the graph, and interactions are represented as lines connecting the nodes. Each of the nodes and connections can be stored as an object or otherwise defined in a data structure stored on a computer-readable storage device. An interaction can, for example, involve communication between two individual users, such as a response by user lobstahdive 204n to a question submitted by user yellowbcd 204m. A pair of users can be involved in multiple interactions.

As shown in FIG. 3A, the users 204 are connected by lines representing question-and-answer interactions. The lines model the interactions between the users who posted the questions 202 and the users, listed in the last post column 206 (shown in FIG. 2), who provided responses to those questions. For example, the question 202a posted by user techdiver 204a was answered by user dive4fun 204j. An edge 302 within the user activity graph 300 represents this interaction. In another example, user lobstahdive 204n and user yellowbcd 204m are connected by three edges 304 representing three individual interactions (e.g., a question posted by user lobstahdive 204n was answered by user yellowbcd 204m, or a question posted by user yellowbcd 204m was answered by user lobstahdive 204n). In some implementations, the edges are directed. For example, the edge 302 can point from user dive4fun 204j to user techdiver 204a, since the interaction relates to a response directed at the question 202a.
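
One possible in-memory representation of such a graph is sketched below. The class and method names are hypothetical, and the edge direction (responder to asker) follows the example of edge 302; the description itself says only that nodes and connections can be stored as objects or in another data structure.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class UserActivityGraph:
    """Directed multigraph: nodes are users, edges are individual interactions."""

    # edges[(responder, asker)] holds one quality weight per interaction,
    # so a pair of users can be connected by several parallel edges.
    edges: Dict[Tuple[str, str], List[float]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def add_interaction(self, responder: str, asker: str, weight: float = 1.0) -> None:
        """Record one interaction pointing from the responder to the asker."""
        self.edges[(responder, asker)].append(weight)

    def interaction_count(self, user_a: str, user_b: str) -> int:
        """Count the interactions between two distinct users, in either direction."""
        forward = self.edges.get((user_a, user_b), [])
        backward = self.edges.get((user_b, user_a), [])
        return len(forward) + len(backward)

    def users(self) -> Set[str]:
        """Return every user that appears at either end of an edge."""
        return {user for pair in self.edges for user in pair}


graph = UserActivityGraph()
graph.add_interaction("dive4fun", "techdiver")      # edge 302
graph.add_interaction("yellowbcd", "lobstahdive")   # edges 304 (three interactions)
graph.add_interaction("lobstahdive", "yellowbcd")
graph.add_interaction("yellowbcd", "lobstahdive")
```

Keeping a list of weights per ordered pair preserves each interaction separately, which leaves the choice of how to combine them (sum, weighted sum, and so on) to a later step.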

Not all users are necessarily interconnected by the interactions within a user activity graph. For example, the users beachdweller 204g, underh2o 204p, headed4mex 204b, and diverdan 204k are disconnected from the rest of the group of users 204. Any number of interactions can be represented between users within a user activity graph. In some implementations, the edges within a user activity graph are weighted. For example, a quality score can be associated with each interaction between a pair of users 204. In some implementations, the quality weighting module 130 described with respect to FIG. 1 applies weights (e.g., 0.3 or 0.5) to the edges of the user activity graph 300 based in part on the quality scores associated with the user-generated content contributed by one or both of the users involved in the interaction.

When a user posts a new question to the electronic discussion board, no other user is involved in the interaction. In some implementations, as shown in FIG. 3B, a default user forum_travel 352 is introduced to establish an interaction between the user who posted the new question and the forum. The quality weight associated with the interaction between the default user forum_travel 352 and user beachdweller 204g can, for example, be based in part on the relevance of the question posted by user beachdweller 204g to the subject of the electronic discussion board (e.g., scuba travel). For example, a question posted by user beachdweller 204g about cheap mail-order prescription drugs can be given a low quality rating, while a question posted by user headed4mex about diving the cenotes near Playa del Carmen can be given a high quality rating.

FIG. 4 is a flow chart of a process 400 for generating user credential scores based in part on the interactions in which each user is involved. In some implementations, the process 400 is performed by the content analyzer 122 described with respect to FIG. 1.

At 402, a quality measure associated with the content of a user interaction is calculated. In some examples, a user interaction includes uploading new user-generated content by submitting a question or answer on a question-and-answer (Q&A) website, or by posting a topic or discussion on a blog or online bulletin board. Each user-generated content item can be evaluated to determine a quality score. The quality score can be based on a number of factors, such as the relevance of the content item, the originality of the content item, or how focused the content item is. In some implementations, the quality score of a user-generated content item comprises a set of individual factor scores.

The quality of a user-generated content item can be evaluated based in part on the context in which the content item is introduced. In some examples, a relevance factor is used to evaluate how relevant a user-generated content item is to the subject of the website, bulletin board, or topic forum, or to the question to which the content item responds. In some implementations, the relevance factor is calculated using Latent Dirichlet Allocation (LDA).

Another exemplary evaluation factor is coverage, which refers to how general or how specific the vocabulary used within a posted content item is. Coverage can serve as an indication of how focused a user's contribution is. In some implementations, the coverage factor is measured using the inverse document frequency (IDF) of the words within the user-generated content item.
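
As a rough illustration of an IDF-based coverage factor, the sketch below averages the IDF of an item's distinct terms. The smoothed IDF formula, the averaging step, and the function names are assumptions; the description states only that IDF is used.

```python
import math
from typing import List


def idf(term: str, corpus: List[List[str]]) -> float:
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    doc_freq = sum(1 for doc in corpus if term in doc)
    return math.log((1 + len(corpus)) / (1 + doc_freq)) + 1.0


def coverage_score(item_tokens: List[str], corpus: List[List[str]]) -> float:
    """Mean IDF of the item's distinct terms.

    Higher values mean rarer, more specific vocabulary; lower values mean
    the item relies on words that appear throughout the corpus.
    """
    terms = set(item_tokens)
    if not terms:
        return 0.0
    return sum(idf(term, corpus) for term in terms) / len(terms)


# Toy corpus of tokenized posts (illustrative data only).
corpus = [["dive", "trip"], ["dive", "gear"], ["dive", "cenote"]]
generic = coverage_score(["dive"], corpus)
specific = coverage_score(["cenote"], corpus)
```

In this toy corpus the word "dive" appears in every post and so scores lower than "cenote", which appears in only one.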

User content items can additionally be evaluated for originality. For example, a question posted to a Q&A website can be compared with other questions posted to that website, or an answer posted to a Q&A website can be compared with other answers to the same or similar questions. In some implementations, the originality of a content item is based in part on a comparison with other content items contributed by the same user. For example, a user who posts the same message (e.g., spam) across multiple topics of a Q&A website can be discovered by examining the originality of all the content provided by that user. In some implementations, the originality score is measured using, for example, the Bilingual Evaluation Understudy (BLEU) scoring method. In some implementations, the article analyzer 128 described with respect to FIG. 1 applies quality scores to user-generated content items.
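
The sketch below illustrates the idea with clipped n-gram precision, which is the core ingredient of BLEU but not the full metric (it omits the brevity penalty and the geometric mean over several n-gram orders). Treating originality as one minus the highest overlap with any earlier item is an assumption, as are the function names.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate: List[str], reference: List[str], n: int = 2) -> float:
    """Fraction of candidate n-grams that also occur in the reference.

    Each n-gram's count is clipped to its count in the reference, as in BLEU.
    """
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


def originality(candidate: List[str], others: List[List[str]], n: int = 2) -> float:
    """1.0 for an item sharing no n-grams with earlier items, 0.0 for a copy."""
    if not others:
        return 1.0
    return 1.0 - max(clipped_precision(candidate, other, n) for other in others)


spam = "buy cheap pills now".split()
fresh = "best cenote near playa".split()
```

Comparing each new post against the same user's earlier posts with a function like `originality` would flag the repeated-message pattern described above.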

At 404, links between users are identified based on the interactions between the users. An interaction between two users (such as a "reply" activity) can be used to establish a link between those individual users. A common type of interaction involves a first user uploading new user-generated content and a second user responding to that content in some way (e.g., viewing, ranking, rating, or uploading a new user-generated content response). In another example, when a user uploads a new question to a Q&A website or posts a discussion on an online bulletin board, the interaction is considered to be between that user and the website or bulletin board. In some examples, a link connects two users based on an answer posted by the second user in response to a question posted by the first user, a rating posted by the second user regarding content provided by the first user, or a comment posted by the second user in response to a blog entry posted by the first user. In some implementations, a virtual user (e.g., a node) represents one end of an interaction. For example, if a first user posts a new question to a bulletin board, the bulletin board can be represented by a virtual user. In this way, if a user posts an off-topic item to a discussion board and no user responds to it, the process 400 still has a way to apply a quality score to that user-generated content and adjust the user's credential score accordingly.

At 406, each interaction is assigned a weighting factor that represents the quality of the interaction. The quality scores related to the relevance, coverage, and originality factors described above and/or to other factors (e.g., timeliness of the contribution, inclusion of multimedia, inclusion of rich media) can be individually weighted and combined to produce a single overall quality score for the content item. In some examples, the quality score is a numeric rating (e.g., a value between 1 and 10 or between 1 and 100) or a categorical rating (e.g., positive, neutral, or negative). In some implementations, the quality scores related to the individual quality factors are weighted by learned coefficients and combined to produce a quality weight. For example, the learned coefficients can be generated based on the relative importance of each quality factor to the electronic community and applied to the individual quality factors. The coefficients can differ depending on the type of electronic community. For example, in some implementations, the timeliness of a response to a user's post is more important than the coverage of the post. In some implementations, the quality weighting module 130 described with respect to FIG. 1 calculates the weighting factors associated with the interaction links.
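
A minimal sketch of combining per-factor scores with coefficients follows. The factor names, the example values, and the use of a normalized weighted average are illustrative assumptions; how the coefficients are learned is not specified here and is not modeled.

```python
from typing import Dict


def interaction_weight(
    factor_scores: Dict[str, float],
    coefficients: Dict[str, float],
) -> float:
    """Weighted average of per-factor quality scores.

    `coefficients` stands in for the learned coefficients; a factor that has
    a coefficient but no score is treated as scoring 0.0.
    """
    total = sum(coefficients.values())
    if total == 0:
        raise ValueError("coefficients must not sum to zero")
    weighted = sum(
        coefficient * factor_scores.get(name, 0.0)
        for name, coefficient in coefficients.items()
    )
    return weighted / total


# Hypothetical community in which timeliness matters more than coverage.
scores = {"relevance": 0.8, "coverage": 0.4, "originality": 1.0, "timeliness": 0.5}
coefficients = {"relevance": 2.0, "coverage": 1.0, "originality": 1.0, "timeliness": 4.0}
weight = interaction_weight(scores, coefficients)
```

Dividing by the coefficient total keeps the resulting weight on the same scale as the individual factor scores, whatever coefficients a given community uses.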

At 408, user credential scores are calculated. A user credential score attempts, for example, to quantify the value of the contributions a particular user has made to the electronic community. Broadly, a user can contribute new content or provide feedback on content contributed by another user. Each of these roles can be considered independently when generating a user's credential scores. First, the user can be assigned a contribution score based on the quality of the contributions the user has made to the electronic community (e.g., a website, discussion board, or forum). The contribution score represents, for example, a measure of how many questions a particular user has asked and how much interest those questions have generated among other authoritative users in the electronic community. Second, the user can be assigned an authority score based on the quality of the responses or ratings the user has provided for user-generated content contributed to the electronic community by other users. The authority score can represent, for example, a measure of the number of useful responses the user has provided to the electronic community. Depending on how a user interacts with the electronic community, the user can have a high contribution score and a low authority score, or a low contribution score and a high authority score. In some implementations, the scores are calculated with a time decay, so that older interactions have less influence on the weighting and/or scoring than recent interactions.
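
The time decay mentioned above could take many forms. The sketch below assumes an exponential decay with a half-life, which is only one option; the function name and the 180-day default are hypothetical.

```python
def decayed_weight(weight: float, age_days: float, half_life_days: float = 180.0) -> float:
    """Scale an interaction weight so that it halves every `half_life_days`.

    Exponential decay is an assumed choice; the description does not name
    a specific decay function.
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return weight * 0.5 ** (age_days / half_life_days)


recent = decayed_weight(0.8, age_days=7)
old = decayed_weight(0.8, age_days=720)
```

Applying the decay to edge weights before scoring means a burst of recent high-quality answers can outweigh a long but stale history.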

In some implementations, a modified version of the Hyperlink-Induced Topic Search (HITS) algorithm (as described in conjunction with FIG. 5 below) is used to generate each user's contribution score and authority score. For example, a particular user's contribution score can be generated by scaling and weighting (e.g., by the quality scores of the user-generated content items involved in the interactions) the authority scores of the other users involved in interactions with that user. Similarly, a particular user's authority score can be generated by scaling and weighting (e.g., by the quality scores of the user-generated content items involved in the interactions) the contribution scores of the other users involved in interactions with that user. In this way, a positive (high-quality) response posted by an authoritative second user to a content item generated by a first user can increase the first user's contribution score. Conversely, a negative (low-quality) response posted by an untrusted second user (a spammer) can be discounted when the first user's contribution score is calculated. Once the contribution score and the authority score have been assigned, an overall user reputation score can be calculated from a combination of the two user credential scores (e.g., an average or some other linear or non-linear combination).

FIG. 5 is a flow chart of a process 500 for generating user credential scores (e.g., contribution scores and authority scores). The process 500 can be performed, for example, using a modified version of the HITS link-based ranking algorithm in which a user's authority score replaces the conventional HITS estimate of the value of content as an authoritative source, and a user's contribution score replaces the conventional HITS hub score, which estimates the value of an item as a reference to authorities. In addition, the modified HITS algorithm can use weighted links when generating user credential scores. In some implementations, the content analyzer 122 (as described with respect to FIG. 1) performs the process 500.

At 502, a weighted user activity graph is constructed. Based on the interaction links derived in the process 400 (as described with respect to FIG. 4), an affinity matrix defining the weighted interactions between users can be constructed, in which each directed link between two users is weighted by the quality factors associated with the user-generated content items involved in the interaction. For example, the matrix element A(u_i, u_j) can be populated with a combination (e.g., a sum or a weighted sum) of the interaction quality weighting factors between a first user u_i and a second user u_j. In some implementations, the user activity graph is generated by the social mapping module 132 (shown in FIG. 1).
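
A small sketch of filling such an affinity matrix from a list of weighted interactions is shown below. Summing the weights of parallel interactions is one of the combinations the text mentions; the tuple layout, the function name, and the sample weights are assumptions.

```python
from typing import List, Optional, Tuple

Interaction = Tuple[str, str, float]  # (responder u_i, asker u_j, quality weight)


def build_affinity_matrix(
    interactions: List[Interaction],
    users: Optional[List[str]] = None,
) -> Tuple[List[str], List[List[float]]]:
    """Return (users, A), where A[i][j] sums the weights of links u_i -> u_j."""
    if users is None:
        users = sorted({u for r, a, _ in interactions for u in (r, a)})
    index = {user: i for i, user in enumerate(users)}
    matrix = [[0.0] * len(users) for _ in users]
    for responder, asker, weight in interactions:
        matrix[index[responder]][index[asker]] += weight
    return users, matrix


interactions = [
    ("dive4fun", "techdiver", 0.5),
    ("yellowbcd", "lobstahdive", 0.3),
    ("yellowbcd", "lobstahdive", 0.5),
    ("beachdweller", "forum_travel", 0.1),  # new question linked to the default user
]
users, A = build_affinity_matrix(interactions)
```

A dense list of lists is used only for readability; for a community of realistic size a sparse representation would be the more natural choice.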

At 504, the contribution score and the authority score associated with each user are initialized. Base values for the contribution and authority scores initialize the system so that relative contribution and authority scores can be obtained from a neutral starting point. A random initial value can be assigned. In some implementations, the contribution scores and the authority scores are each represented by a vector equation. For example, the contribution scores can be viewed as a representation of the user activity graph with its links pointing in a first direction (e.g., from responder to asker), while the authority scores can be viewed as a representation of the user activity graph with its links pointing in the reverse direction (e.g., from asker to responder). The random initial values can be obtained, for example, by randomly sampling content item quality scores. In some implementations, the initial values are equal to, approximate, or are otherwise based on values generated in a previous execution of the HITS-based algorithm (e.g., when user credential values are updated after additional interactions have occurred).

At 506, a convergence threshold is defined. The convergence threshold can be used to define the stopping point for calculating each user's authority score and contribution score.

At 508, the contribution scores and authority scores are updated based on the weighted user activity graph, using the convergence threshold as the criterion. For example, the score vectors can be calculated using an algorithm that involves first normalizing each row of the user activity graph matrix and then normalizing each column of the matrix.

If the change in scores found at 510 (between the randomly initialized values and the calculated values) is greater than or equal to the convergence threshold, the algorithm is repeated. At the end of each iteration, the change in scores between the previously calculated values and the currently calculated values is compared with the convergence threshold. In some implementations, the initialization and calculation of the user scores are performed by the credential scoring module 134. In an example of a system with a large number of user-generated content items, the user credential scores are calculated iteratively in parallel across multiple computer processors.

Once the change in scores is less than the convergence threshold, the user credential scores are assigned at 512. For example, the contribution score and authority score associated with each user can be stored in the credential data repository 138 (shown in FIG. 1) in association with the corresponding user identifier. The contribution score and the authority score can optionally be combined to produce an overall user reputation score. The users can optionally be ranked (e.g., by the user ranking module 136 of FIG. 1) based on the user credential scores (contribution, authority, and/or reputation).
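
The iteration of process 500 can be sketched as a weighted HITS-style loop. Several choices here are assumptions rather than details from the description: scores start at a uniform 1.0 instead of random values (for reproducibility), each score vector is L2-normalized after every pass instead of normalizing the rows and columns of the matrix, and the names `weighted_hits`, `threshold`, and `max_iterations` are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Edge = Tuple[str, str, float]  # (responder, asker, quality weight)


def _normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale a score vector to unit L2 norm (left unchanged if all zero)."""
    norm = math.sqrt(sum(value * value for value in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {user: value / norm for user, value in scores.items()}


def weighted_hits(
    edges: List[Edge],
    threshold: float = 1e-8,
    max_iterations: int = 1000,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (contribution, authority) scores for every user in `edges`."""
    users = sorted({u for r, a, _ in edges for u in (r, a)})
    if not users:
        return {}, {}
    contribution = {user: 1.0 for user in users}  # step 504: neutral start
    authority = {user: 1.0 for user in users}
    for _ in range(max_iterations):               # step 508: update scores
        # An asker's contribution grows with the authority of its responders.
        new_contribution = {user: 0.0 for user in users}
        for responder, asker, weight in edges:
            new_contribution[asker] += weight * authority[responder]
        new_contribution = _normalize(new_contribution)
        # A responder's authority grows with the contribution of those it answers.
        new_authority = {user: 0.0 for user in users}
        for responder, asker, weight in edges:
            new_authority[responder] += weight * new_contribution[asker]
        new_authority = _normalize(new_authority)
        change = max(                             # step 510: measure the change
            max(abs(new_contribution[u] - contribution[u]) for u in users),
            max(abs(new_authority[u] - authority[u]) for u in users),
        )
        contribution, authority = new_contribution, new_authority
        if change < threshold:                    # step 512: scores have settled
            break
    return contribution, authority


edges = [("expert", "asker1", 0.9), ("expert", "asker2", 0.8), ("spammer", "asker1", 0.1)]
contribution, authority = weighted_hits(edges)
```

In this toy graph the responder whose answers carry high quality weights ends up with the larger authority score, and low-weight links from the second responder contribute little to the askers' contribution scores.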

Process 500 may be repeated whenever needed to update the user credential scores based on new user-generated content items added to the electronic community. For example, process 500 may be repeated daily or weekly to produce updated user credential scores and, optionally, to update the rankings of individual users within the electronic community. In some implementations, process 500 may be performed incrementally, for example, by incrementally adjusting the user credential scores after each new interaction. For example, the quality of the new interaction may be used to adjust the user credential scores (authority and contribution) of users near the interaction in the user activity graph. In some implementations, a propagation distance parameter may be defined to limit the propagation of score adjustments through the overall user activity graph. For example, a propagation distance parameter value of two may limit the incremental recomputation of user credential scores to nodes within two links of the link corresponding to the new interaction. Thus, in this example, if a user has no interaction history with the users involved in the new interaction, or with users within two links of the users involved in the new interaction, the node representing that user will be unaffected. Accordingly, for each new interaction, the social activity graph may be updated with the new link, and the user credential scores of users within the propagation distance on the social activity graph may also be updated. In some implementations, instead of or in addition to limiting the propagation distance, the number of affected nodes may be limited. For example, the nodes to be updated may be selected according to their proximity to the new link and the number of interactions that link each node, directly or indirectly, to the nodes connected by the new link. Incremental computation of user credential scores may be used to provide substantially real-time user rankings or access to user credential scores. Even with incremental score updates, the scores for the overall user activity graph may also be periodically recomputed for all nodes using the iterative method described above, to ensure that the incremental updates do not cause divergence from globally accurate scores.
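The selection of nodes within a propagation distance of a new link can be illustrated with a breadth-first search. The following is a minimal sketch under stated assumptions, not the patented implementation: the function name, the adjacency-dictionary representation, and the choice to ignore link direction when measuring distance are all hypothetical.

```python
from collections import deque


def nodes_within_distance(adjacency, new_link, max_distance):
    """Return the users whose scores would be recomputed after a new interaction.

    adjacency:    dict mapping each user to the set of users it shares a link
                  with (direction is ignored here, which is an assumption).
    new_link:     (user_a, user_b) pair joined by the new interaction.
    max_distance: the propagation distance parameter (hypothetical name).
    """
    user_a, user_b = new_link
    distance = {user_a: 0, user_b: 0}  # both endpoints are at distance 0
    queue = deque([user_a, user_b])
    while queue:
        user = queue.popleft()
        if distance[user] == max_distance:
            continue  # do not propagate past the limit
        for neighbor in adjacency.get(user, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[user] + 1
                queue.append(neighbor)
    return set(distance)


# Chain a - b - c - d - e with a new interaction between a and b.
chain = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
affected = nodes_within_distance(chain, ("a", "b"), 2)
```

With a propagation distance of two, user e lies three links from the new link and is left untouched, which matches the behavior described in the paragraph above.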

FIG. 6 is a flow diagram of a process 600 for generating a ranking of search results using user credential scores. For example, process 600 may combine the activity of the search server 108 (shown in FIG. 1) with the activity of the interaction processing server 110 to enhance a conventional search result ranking with quality measures of user-generated content. At 602, quality measures associated with the user-generated content items involved in the interactions are computed. In some examples, these quality measures are based on a number of factors including, but not limited to, relevance, coverage, originality, timeliness of the contribution (e.g., with a fast response valued much more highly than a slow response, or vice versa), multimedia content, or rich media content.

At 604, links between users are identified based on the interactions between users. A directed link may correspond to one user's reaction (e.g., a posted response, a ranking, a view, etc.) to a user-generated content item (e.g., a question, a blog, a multimedia content item, etc.) provided by another user. In some implementations, the interactions occur within an electronic community such as a Q&A website, a forum, or an online bulletin board.
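One way to picture the directed links is as a table of summed interaction weights per ordered pair of users. The sketch below is illustrative only: the function name, the tuple layout, and the convention that a link points from the reacting user to the user who provided the content are assumptions, not details taken from the specification.

```python
from collections import defaultdict


def build_affinity(interactions):
    """Sum interaction weights for each ordered pair of users.

    interactions: iterable of (from_user, to_user, weight) tuples, where
    weight is the quality-based weighting of a single interaction.
    Returns a nested dict: affinity[from_user][to_user] -> total weight.
    """
    affinity = defaultdict(lambda: defaultdict(float))
    for from_user, to_user, weight in interactions:
        affinity[from_user][to_user] += weight
    # Convert to plain dicts so missing keys are not created on lookup.
    return {user: dict(targets) for user, targets in affinity.items()}


# Bob responds twice to Alice's posts and Carol responds once (assumed data).
links = build_affinity([
    ("bob", "alice", 0.5),
    ("bob", "alice", 0.25),
    ("carol", "alice", 1.0),
])
```

Repeated interactions between the same pair accumulate into a single weighted link, so a pair that interacts often or with high quality ends up with a heavier edge.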

At 606, user credential scores are computed. The user credential scores may include more than one individual score, for example, a contribution score and an authority score. The user credential scores may include a composite score such as a reputation score. For example, the user credential scores may be computed using process 500 as described in connection with FIG. 5. In some implementations, the user credential scores are computed by the credential scoring module 134 shown in FIG. 1.

At 608, the user credential scores are normalized. For the search result ranking to be based in part on the credential scores of the users who supplied the content items, the user credential scores are first normalized to provide a basis for comparison between the search result scores and the user credential scores. In some implementations, the user credential scores may be transformed into a standard Gaussian distribution. In some implementations, the user credential scores are normalized by the results ranking module 118 shown in FIG. 1.
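The specification does not spell out the normalization. One simple reading, assumed here, is z-score standardization, which rescales scores to zero mean and unit variance so that they become comparable with other standardized scores. It does not by itself make a non-Gaussian distribution Gaussian. The function name below is hypothetical.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Rescale a dict of scores to zero mean and unit variance (z-scores)."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All scores are identical, so no user stands out from the others.
        return {key: 0.0 for key in scores}
    return {key: (value - mu) / sigma for key, value in scores.items()}


z = standardize({"alice": 1.0, "bob": 2.0, "carol": 3.0})
```

The same helper could be applied to relevance scores so that both kinds of score share a common scale before they are combined.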

At 612, at the same time as or at a different time from steps 602 through 608, a search query is received. The search query may be responsive to one or more user-generated content items. In some examples, the search query is submitted to the electronic community containing the user-generated content items or to a general-purpose search engine. In some implementations, the search query is received by the search engine 114 shown in FIG. 1.

At 614, user-generated content items responsive to the query are identified. For example, the responsive content items may be located based on keyword matches between the search query and the text of the content items.

At 616, relevance scores are optionally applied to the identified content items based on a first ranking method. For example, the search engine 114 may apply information retrieval (IR) scores to the user-generated content items to determine a first ranking of the items responsive to the search query. In one example, the Okapi BM25 ranking function may be used to determine a relevance score of each user-generated content item with respect to the search query.
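For readers unfamiliar with Okapi BM25, the sketch below scores tokenized documents against a query. It uses the common non-negative IDF variant and the conventional parameter defaults k1 = 1.5 and b = 0.75. These choices, the function name, and the sample data are assumptions for illustration; the specification does not state which variant or parameters are used.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25.

    documents: non-empty list of token lists.
    Returns one score per document, in the same order.
    """
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Document frequency: number of documents containing each query term.
    doc_freq = {
        term: sum(1 for doc in documents if term in doc)
        for term in set(query_terms)
    }
    scores = []
    for doc in documents:
        term_counts = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_counts[term]
            if tf == 0:
                continue  # absent terms contribute nothing
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Length normalization penalizes documents longer than average.
            norm = tf + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1.0) / norm
        scores.append(score)
    return scores


posts = [
    ["dive", "sites", "in", "thailand"],
    ["best", "camera", "for", "travel"],
    ["dive", "dive", "gear"],
]
relevance = bm25_scores(["dive", "gear"], posts)
```

The third post matches both query terms and repeats one of them, so it scores highest; the second post matches neither term and scores zero.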

If relevance scores have been applied to the identified content items, the relevance scores are normalized at 618. For example, a normalization technique may be selected so that the normalized user credential scores from step 608 can be combined with the normalized relevance scores. In some implementations, the relevance scores may be transformed into a standard Gaussian distribution. In some implementations, the user credential scores are normalized by the results ranking module 118 shown in FIG. 1.

At 610, using the normalized user credential scores from step 608 and, optionally, the normalized relevance scores from step 618, the identified content items are ranked based on the user credential scores associated with them plus the optional normalized relevance scores. In some implementations, the type of user credential score used depends on the type of user-generated content item referenced by the query search result. For example, if the content item is a response to a question, the user authority score may be normalized and combined with the relevance score for ranking. On the other hand, if the content item is a question, the user contribution score may be normalized and combined with the relevance score for ranking. When combining the normalized user credential score with the normalized relevance score, a weighting may be applied to one of the scores. For example, a weighting factor between zero and one may be applied to the normalized user credential score to represent the relative importance of user reputation in ranking user-generated content items within the query search results. The query search results may then be sorted by rank and returned to the requester.
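The combination step can be sketched as a weighted sum followed by a sort. The dictionary keys, the function name, and the default weight of 0.5 are hypothetical; only the idea of choosing the authority score for responses, the contribution score for questions, and a weight between zero and one comes from the paragraph above.

```python
def rank_items(items, credential_weight=0.5):
    """Rank content items by relevance plus a weighted user credential score.

    items: list of dicts with the (assumed) keys
        "id", "kind" ("question" or "response"),
        "relevance"    normalized relevance score of the item,
        "authority"    normalized authority score of the posting user,
        "contribution" normalized contribution score of the posting user.
    credential_weight: weighting factor between 0 and 1.
    """
    def combined(item):
        # Responses use the author's authority; questions use contribution.
        if item["kind"] == "response":
            credential = item["authority"]
        else:
            credential = item["contribution"]
        return item["relevance"] + credential_weight * credential

    return sorted(items, key=combined, reverse=True)


results = rank_items([
    {"id": "q1", "kind": "question", "relevance": 1.0,
     "authority": 5.0, "contribution": -1.0},
    {"id": "r1", "kind": "response", "relevance": 0.2,
     "authority": 2.0, "contribution": 0.0},
    {"id": "r2", "kind": "response", "relevance": 0.9,
     "authority": 0.0, "contribution": 0.0},
])
ranked_ids = [item["id"] for item in results]
```

In this made-up data the response r1 has low relevance but a highly authoritative author, which lifts it above the more relevant r2. The question q1 ignores its author's high authority because questions are scored on contribution.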

Example: Integration of Quality Factor Scores

Given a set of scores relating to individual quality factors (e.g., coverage, originality, relevance, or timeliness), the following illustrative equations may be applied to produce a single quality indicator. This quality indicator may be used as the weighting within a user activity graph that describes the interactions between users.

A linear combination using a coefficient vector $\alpha = (\alpha_0, \alpha_1, \alpha_2, \alpha_3)$ may be applied to produce a combined score $\mathrm{com}(q_i, r_{ij})$ for a response $r$ to a question $q$:

$$\mathrm{com}(q_i, r_{ij}) = \alpha_0 + \alpha_1 \cdot \mathrm{rel}(q_i, r_{ij}) + \alpha_2 \cdot \mathrm{cov}(r_{ij}) + \alpha_3 \cdot \mathrm{ori}(r_{ij})$$

where $\mathrm{rel}$ is the relevance factor, $\mathrm{cov}$ is the coverage factor, and $\mathrm{ori}$ is the originality factor.
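The linear combination above translates directly into code. In this small sketch the function name and the default coefficient values are placeholders chosen for illustration; in practice the coefficients would be learned.

```python
def combined_score(rel, cov, ori, alpha=(0.0, 1.0, 1.0, 1.0)):
    """Compute alpha0 + alpha1*rel + alpha2*cov + alpha3*ori.

    rel, cov, ori: relevance, coverage, and originality factor scores.
    alpha:         coefficient tuple (placeholder defaults, not learned values).
    """
    a0, a1, a2, a3 = alpha
    return a0 + a1 * rel + a2 * cov + a3 * ori


# Worked example: 1 + 2*0.5 + 4*0.25 + 8*0.125 = 4.0
example = combined_score(0.5, 0.25, 0.125, alpha=(1.0, 2.0, 4.0, 8.0))
```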

Using the combined score, a quality score $\mathrm{qua}(q_i, r_{ij})$ (e.g., the interaction weighting for question $q$ and response $r$) may be produced by substituting the combined score into the following equation:

In some implementations, the $\alpha$ coefficients may be learned using any standard coefficient learning algorithm.

Example: Computing User Credential Scores

Based on the user activity graph represented as an affinity matrix $A$ (in which each element $A(u_i, u_j)$ contains the sum of the edge weights from user $u_i$ to user $u_j$), the contribution scores and the authority scores may be computed iteratively, starting from random initial values for both.

where $i$ is an all-ones vector, $A_{\mathrm{row}}$ is $A$ with its rows normalized to sum to 1, $A_{\mathrm{col}}$ is $A$ with its columns normalized to sum to 1, and $\varepsilon$ is a reset probability that guarantees convergence of the algorithm. The algorithm iterates until the following convergence condition is satisfied:

where $l$ is a predefined threshold.
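The update equations and the convergence condition are not reproduced in this text, so the following sketch should not be read as the patented formulas. It is an assumed mutual-reinforcement iteration in the spirit of HITS with a reset term, built only from the quantities defined above: a row-normalized and a column-normalized copy of A, a reset probability, and a stopping threshold. The function name, the uniform (rather than random) starting values, and the exact update rule are assumptions.

```python
def credential_scores(A, eps=0.15, threshold=1e-9, max_iter=1000):
    """Assumed HITS-style iteration with a reset term (illustrative only).

    A[i][j] is the summed edge weight from user i to user j.
    Returns (contribution, authority) lists indexed by user.
    """
    n = len(A)
    row_sums = [sum(row) for row in A]
    col_sums = [sum(A[i][j] for i in range(n)) for j in range(n)]
    # Normalized copies of A; all-zero rows or columns stay zero.
    A_row = [[A[i][j] / row_sums[i] if row_sums[i] else 0.0
              for j in range(n)] for i in range(n)]
    A_col = [[A[i][j] / col_sums[j] if col_sums[j] else 0.0
              for j in range(n)] for i in range(n)]
    # Uniform start for reproducibility (the text uses random initial values).
    contribution = [1.0 / n] * n
    authority = [1.0 / n] * n
    for _ in range(max_iter):
        # Contribution of user i draws on the authority of users i links to.
        new_c = [eps / n + (1 - eps) *
                 sum(A_row[i][j] * authority[j] for j in range(n))
                 for i in range(n)]
        # Authority of user j draws on the contribution of users linking to j.
        new_a = [eps / n + (1 - eps) *
                 sum(A_col[i][j] * new_c[i] for i in range(n))
                 for j in range(n)]
        change = max(abs(new - old) for new, old in
                     zip(new_c + new_a, contribution + authority))
        contribution, authority = new_c, new_a
        if change < threshold:
            break  # largest score change fell below the threshold
    return contribution, authority


# Users 0 and 2 each link to user 1 with weight 1.
contribution, authority = credential_scores([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
])
```

In this tiny graph, user 1 receives both links and ends up with the highest authority, while users 0 and 2 share the same, higher contribution score. The reset term keeps every score positive and makes each pass a contraction, so the loop terminates.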

Embodiments of the subject matter and the operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on a computer storage medium for execution by, or to control the operation of, a data processing apparatus. Alternatively or in addition, the program instructions can be encoded on an artificially generated propagated signal (e.g., a machine-generated electrical, optical, or electromagnetic signal) that is generated to encode information for transmission to a suitable receiver apparatus for execution by a data processing apparatus. A computer storage medium can be, or be included in, a computer-readable storage device, a computer-readable storage substrate, a random or serial access memory array or device, or a combination of one or more of them. Moreover, while a computer storage medium is not a propagated signal, a computer storage medium can be a source or destination of computer program instructions encoded in an artificially generated propagated signal. The computer storage medium can also be, or be included in, one or more separate physical components or media (e.g., multiple CDs, disks, or other storage devices).

The operations described in this specification can be implemented as operations performed by a data processing apparatus on data stored on one or more computer-readable storage devices or received from other sources.

The term "data processing apparatus" encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, a system on a chip, or multiple ones, or combinations, of the foregoing. The apparatus can include special-purpose logic circuitry, e.g., a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC). In addition to hardware, the apparatus can also include code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, a cross-platform runtime environment, a virtual machine, or a combination of one or more of them. The apparatus and execution environment can realize various different computing model infrastructures, such as web services, distributed computing, and grid computing infrastructures.

A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages and declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, object, or other unit suitable for use in a computing environment. A computer program may, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special-purpose logic circuitry, e.g., a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).

Processors suitable for the execution of a computer program include, by way of example, both general-purpose and special-purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data (e.g., magnetic disks, magneto-optical disks, or optical disks). However, a computer need not have such devices. Moreover, a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device (e.g., a universal serial bus (USB) flash drive), to name just a few. Devices suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special-purpose logic circuitry.

To provide for interaction with a user, embodiments of the subject matter described in this specification can be implemented on a computer having a display device (e.g., a cathode ray tube (CRT) or liquid crystal display (LCD) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided to the user can be any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user's client device in response to requests received from the web browser.

Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component (e.g., a data server), a middleware component (e.g., an application server), or a front-end component (e.g., a client computer having a graphical user interface or a web browser through which a user can interact with an implementation of the subject matter described in this specification), or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an inter-network (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).

The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits data (e.g., an HTML page) to a client device (e.g., for purposes of displaying data to and receiving user input from a user interacting with the client device). Data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of the invention or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular inventions. Certain features that are described in this specification in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or a variation of a subcombination.

Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Thus, particular embodiments of the subject matter have been described. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous.

100 ... System for ranking search results from searches of user-generated content

102 ... Network

104a ... User device

106 ... Web server

108 ... Search server

110 ... Interaction processing server

112 ... Repository of user-generated content / user-generated content

114 ... Search engine

116 ... Search index repository

118 ... Results ranking module

120 ... Relevance scoring module

122 ... Content analysis module / content analyzer

124 ... Interaction data repository

126 ... User data repository

128 ... Article analyzer

130 ... Quality weighting module

132 ... Social graphing module

134 ... Credential scoring module

136 ... User ranking module

138 ... Credential data repository

200 ... Web page of a question-and-answer electronic discussion board related to dive travel

202a ... Question

202b ... Question

202c ... Question

202d ... Question

202e ... Question

202f ... Question

202g ... Question

202h ... Question

202i ... Question

202j ... Question

204a ... User identifier

204b ... User

204c ... User

204d ... User

204e ... User

204f ... User

204g ... User

204h ... User

204i ... User

204j ... User

204k ... User

204l ... User

204m ... User

204n ... User

204o ... User

204p ... User

206 ... Last post column

208 ... Total views column

210 ... Total responses column

212 ... Time column

214 ... Search box

300 ... User activity graph

302 ... Edge

304 ... Edge

352 ... Default user

FIG. 1 is a block diagram of a system for ranking search results from searches of user-generated content.

FIG. 2 is an example screenshot of a web page of questions on a question-and-answer electronic discussion board.

FIGS. 3A and 3B are diagrams of example user activity graphs.

FIG. 4 is a flow diagram of a process for generating user credential scores.

FIG. 5 is a flow diagram of another process for generating user credential scores.

FIG. 6 is a flow diagram of a process for generating a ranking of search results using user credential scores.

Like reference symbols in the various drawings indicate like elements.

Claims (21)

一種分析使用者產生內容之品質的電腦實施方法,其包含:藉由一電腦之操作來識別數個用戶之間經由一電子網路之包含多個用戶產生之內容項目之複數個互動,每一互動係在一對用戶之間;指派一加權因子(weighting factor)至每一互動,其中該加權因子表示該互動之一品質;藉由一電腦之操作以針對複數個用戶中之每一者產生一用戶憑證計分,其中該等用戶憑證計分係基於該複數個互動中之每一者之該等加權因子;將與一用戶識別符相關聯之該等用戶憑證計分儲存於一電腦可讀儲存器件上;針對複數個用戶產生之內容項目之每一者,指派內容相關性之一量測;及藉由下列步驟以基於該等用戶憑證計分及已指派之內容相關性之量測,來排名該複數個用戶產生之內容項目:正規化與該複數個用戶產生之內容項目相關聯之該等用戶憑證計分及該內容相關性之量測;及將已正規化之該等用戶憑證計分與已正規化之用於該等項目之該內容相關性之量測加以組合。 A computer implemented method for analyzing quality of content generated by a user, comprising: identifying, by a computer operation, a plurality of interactions between a plurality of users including content items generated by a plurality of users via an electronic network, each The interaction is between a pair of users; assigning a weighting factor to each interaction, wherein the weighting factor represents one of the qualities of the interaction; and is generated by a computer operation for each of the plurality of users a user credential score, wherein the user credential scores are based on the weighting factors of each of the plurality of interactions; storing the user credential scores associated with a user identifier on a computer Reading the storage device; assigning one of the content relevance measurements for each of the plurality of user-generated content items; and measuring the score based on the user credentials and the assigned content relevance by the following steps To rank the plurality of user-generated content items: normalizing the user credential scores associated with the plurality of user-generated content items and the content relevance Measurement; and the user credential such scores have been normalized to the sum of the normalized correlation for the content of the measured quantity of these projects are combined. 
如請求項1之方法,其進一步包含:接收一搜尋查詢;及 藉由一處理器之操作,以識別該複數個用戶產生之內容項目來回應於該搜尋查詢。 The method of claim 1, further comprising: receiving a search query; and The search query is responded to by the operation of a processor to identify the plurality of user generated content items. 如請求項2之方法,其中已指派之用於每一項目之該內容相關性之量測係基於每一項目之相關性來指派給該搜尋查詢。 The method of claim 2, wherein the measure of the content relevance that has been assigned for each item is assigned to the search query based on the relevance of each item. 如請求項1之方法,其進一步包含:藉由該系統之操作產生一用戶活動圖,該用戶活動圖基於用戶之間的該複數個互動識別用戶之間的鏈接;藉由該系統之操作判定每一用戶之一權威度計分,其中該用戶之該權威度計分係基於在該用戶活動圖中該用戶所鏈接至之用戶之貢獻度計分;及藉由該系統之操作判定每一用戶之一貢獻度計分,其中該貢獻度計分係基於在該用戶活動圖中該用戶所鏈接至之用戶之權威度計分。 The method of claim 1, further comprising: generating a user activity map by operation of the system, the user activity map identifying a link between the users based on the plurality of interactions between the users; determining by operation of the system One authority of each user is scored, wherein the authority score of the user is based on the contribution score of the user to which the user is linked in the user activity map; and each operation is determined by the operation of the system One of the user's contribution scores, wherein the contribution score is based on the authority score of the user to which the user is linked in the user activity map. 如請求項4之方法,其中該等權威度計分及該等貢獻度計分係藉由反覆一反覆更新程序直至該反覆更新程序到達一預定收斂臨限值來產生。 The method of claim 4, wherein the authority scores and the contribution scores are generated by repeating a repeated update procedure until the repeated update procedure reaches a predetermined convergence threshold. 如請求項1之方法,其中該複數個用戶互動包括在一問答網站、一佈告欄網站、一網誌,或一社交網路網站中之至少一者上的用戶產生之內容。 The method of claim 1, wherein the plurality of user interactions comprise user generated content on at least one of a question and answer website, a bulletin board website, a blog, or a social networking website. 
The method of claim 1, wherein the weighting factor is derived by combining a plurality of quality factors, at least one of the quality factors being selected from the group consisting of: a relevance of a content item of one user to an associated prior content item of another user, an originality of a content item relative to other content items, a coverage of a content item corresponding to a measure of non-generic terms in the content item, a richness of the content item, and a timeliness of a content item.

The method of claim 1, further comprising rewarding users based on the user credential scores.

The method of claim 1, wherein the plurality of interactions comprises interactions between users in an electronic community.

A system for ranking users or user-generated content, comprising: at least one server adapted to receive and publish user-generated content for access across a network; at least one storage device storing user-generated content; and at least one processor configured to: identify interactions between pairs of users related to the stored user-generated content; generate a weighting factor for each interaction based on an objective measure of the quality of the interaction; generate a user credential score for each user based on the identified interactions and the weighting factors of the interactions; and rank at least one of users or user-generated content based on the user credential scores and a measure of content relevance for each of a plurality of user-generated content items, wherein the ranking is generated by: normalizing the user credential scores and the measures of content relevance associated with the plurality of user-generated content items; and combining the normalized user credential scores with the normalized measures of content relevance.

The system of claim 10, wherein the at least one processor is configured to generate the user credential score for each user by iteratively updating an authority score and an associated contribution score based on the identified interactions and the weighting factors.

The system of claim 11, wherein the at least one processor is further configured to: receive a search query; identify the plurality of user-generated content items responsive to the search query; and generate a set of search results based on the ranking of the content items.

The system of claim 12, wherein the at least one processor is configured to rank the identified user-generated content items based on a weighted combination of a measure of relevance associated with each user-generated content item and the user credential score of a user associated with each user-generated content item.
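The normalize-and-combine ranking step recited in the system claims can be sketched as follows. This is only an illustration under stated assumptions: the claims do not name a normalization scheme or a combination rule, so min-max scaling, the linear blend, the `alpha` weight, and the names `min_max` and `rank_items` are all hypothetical choices made here.

```python
from typing import Dict, List, Tuple

# (item_id, author_id, raw relevance to the query); a hypothetical record layout.
Item = Tuple[str, str, float]


def min_max(values: List[float]) -> List[float]:
    """Scale values to [0, 1]; min-max scaling is an assumption, not from the claims."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_items(items: List[Item],
               credential: Dict[str, float],
               alpha: float = 0.5) -> List[Tuple[str, float]]:
    """Rank items by a weighted blend of normalized credential and relevance.

    alpha is a hypothetical weight on the author's credential score;
    (1 - alpha) is applied to the content relevance.
    """
    if not items:
        return []
    relevance = min_max([rel for _, _, rel in items])
    cred = min_max([credential.get(author, 0.0) for _, author, _ in items])
    scored = [
        (item_id, alpha * c + (1.0 - alpha) * r)
        for (item_id, _, _), r, c in zip(items, relevance, cred)
    ]
    # Highest combined score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Normalizing both signals first keeps either one from dominating merely because of its scale; how the two are actually weighted against each other is left open by the claims.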
The system of claim 10, wherein the objective measure of the quality of the interaction is derived from a combination of factors representing: a relevance of a content item of one user to an associated prior content item of another user, an originality of a content item relative to other content items, and a coverage of non-generic terms in the content item.

The system of claim 10, wherein a particular one of the interactions comprises an electronic response by a first user to electronic information posted by a second user, wherein the weighting factor for the particular interaction is related to at least one of: a relevance of the first user's electronic response to the electronic information posted by the second user, a coverage of relatively non-generic information in the electronic response, or a relative originality of the electronic response.
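As a rough illustration of how a weighting factor might be derived from the relevance, originality, and coverage factors named above, the sketch below uses deliberately simple token-set measures. None of this is taken from the claims: the Jaccard similarity, the tiny `GENERIC_TERMS` set, the equal-weight average, and the function names are all assumptions chosen for brevity.

```python
from typing import Iterable, Set

# Hypothetical stand-in for a list of generic (low-information) terms.
GENERIC_TERMS = {"the", "a", "is", "to", "of", "and", "it", "i", "you"}


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens; a real system would tokenize more carefully."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-set overlap in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def interaction_weight(response: str,
                       prior_post: str,
                       other_items: Iterable[str]) -> float:
    """Combine three quality factors into one weighting factor in [0, 1]."""
    resp = tokens(response)
    # Relevance: overlap between the response and the post it answers.
    relevance = jaccard(resp, tokens(prior_post))
    # Originality: one minus the closest match among other content items.
    originality = 1.0 - max(
        (jaccard(resp, tokens(other)) for other in other_items), default=0.0
    )
    # Coverage: share of the response's terms that are not generic.
    coverage = len(resp - GENERIC_TERMS) / len(resp) if resp else 0.0
    # Equal-weight average; the claims say only that the factors are combined.
    return (relevance + originality + coverage) / 3.0
```

With these measures, a response copied verbatim from another item scores lower than the same text posted originally, because its originality term drops to zero.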
An electronic system for analyzing the quality of content, the system comprising: at least one server adapted to receive and publish content for access across a network; at least one storage device storing user-generated content; means for generating a weighting factor for each of a plurality of interactions based on an objective measure of the quality of the interaction, wherein each interaction occurs between a pair of users related to the stored user-generated content; means for generating a user credential score for each user based on the weighting factors of the interactions; and means for ranking user-generated content based on the user credential scores and measures of content relevance for a plurality of user-generated content items, wherein the ranking is generated by: normalizing the user credential scores and the measures of content relevance associated with the plurality of user-generated content items; and combining the normalized user credential scores with the normalized measures of content relevance for the items.

The system of claim 16, further comprising means for generating search results for a search of user-generated content based on a combination of the user credential scores and relevance scores for items of user-generated content.
An article comprising a computer-readable storage medium storing instructions operable to cause one or more processors to perform operations comprising: identifying a plurality of interactions between users via an electronic network, the interactions involving a plurality of user-generated content items and each interaction being between a pair of users; assigning a weighting factor to each interaction, wherein the weighting factor represents a quality of the interaction; generating a user credential score for each of a plurality of users, wherein the user credential scores are based on the weighting factor of each of the plurality of interactions; storing the user credential scores on a computer-readable storage device; assigning a measure of content relevance to each of a plurality of user-generated content items; and ranking the plurality of user-generated content items based on the user credential scores and the assigned measures of content relevance by: normalizing the user credential scores and the measures of content relevance associated with the plurality of user-generated content items; and combining the normalized user credential scores with the normalized measures of content relevance for the items.
The article of claim 18, wherein the computer-readable storage medium further stores instructions operable to cause the one or more processors to perform the following additional operations: receiving a search query; and identifying the plurality of user-generated content items responsive to the search query.

The article of claim 19, wherein the assigned measure of content relevance for each item is assigned based on the relevance of the item to the search query.

The article of claim 18, wherein the computer-readable storage medium further stores instructions operable to cause the one or more processors to perform the following additional operations: generating a user activity graph that identifies links between users based on the plurality of interactions between the users; determining an authority score for each user, wherein the authority score for the user is based on the contribution scores of users to which the user is linked in the user activity graph; and determining a contribution score for each user, wherein the contribution score is based on the authority scores of users to which the user is linked in the user activity graph.
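The mutually dependent authority and contribution scores described above resemble a hubs-and-authorities style iteration over a weighted graph. The sketch below shows one way such an update could look. It assumes that each edge runs from a responder to the user whose post was answered, that the edge weight is the interaction's weighting factor, and that scores are L2-normalized each round; the edge direction, normalization, iteration count, and all names are assumptions rather than details from the claims.

```python
from math import sqrt
from typing import Dict, List, Tuple

# (responder, original_poster, weighting factor); a hypothetical edge layout.
Edge = Tuple[str, str, float]


def _normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """L2-normalize so the scores stay bounded across iterations."""
    norm = sqrt(sum(v * v for v in scores.values()))
    return {u: (v / norm if norm else 0.0) for u, v in scores.items()}


def authority_and_contribution(
    edges: List[Edge], iterations: int = 20
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Iteratively update authority and contribution scores on the activity graph."""
    users = {u for src, dst, _ in edges for u in (src, dst)}
    authority = {u: 1.0 for u in users}
    contribution = {u: 1.0 for u in users}
    for _ in range(iterations):
        # Authority: weighted sum of contribution scores of linked responders.
        new_authority = {u: 0.0 for u in users}
        for src, dst, weight in edges:
            new_authority[dst] += weight * contribution[src]
        authority = _normalize(new_authority)
        # Contribution: weighted sum of authority scores of the users answered.
        new_contribution = {u: 0.0 for u in users}
        for src, dst, weight in edges:
            new_contribution[src] += weight * authority[dst]
        contribution = _normalize(new_contribution)
    return authority, contribution
```

How the two scores are folded into a single user credential score is not specified in these claims, so the sketch returns them separately.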
TW099137022A 2009-10-30 2010-10-28 Ranking user generated web content TWI501096B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US25674509P 2009-10-30 2009-10-30

Publications (2)

Publication Number Publication Date
TW201140346A TW201140346A (en) 2011-11-16
TWI501096B true TWI501096B (en) 2015-09-21

Family

ID=46760274

Family Applications (1)

Application Number Title Priority Date Filing Date
TW099137022A TWI501096B (en) 2009-10-30 2010-10-28 Ranking user generated web content

Country Status (1)

Country Link
TW (1) TWI501096B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI480739B (en) * 2011-11-24 2015-04-11 Univ Ishou Interactive data sharing system
TW201335879A (en) * 2012-02-29 2013-09-01 Da Long Technology Co Ltd Community collective interaction platform and interactive method thereof
TWI493496B (en) * 2012-07-11 2015-07-21 Mackay Memorial Hospital Medical information exchange system
CN105446972B (en) * 2014-06-17 2022-06-10 阿里巴巴集团控股有限公司 Searching method, device and system based on and fused with user relationship data

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6301516B1 (en) * 1999-03-25 2001-10-09 General Electric Company Method for identifying critical to quality dependencies
CN1784675A (en) * 2003-05-13 2006-06-07 Nhn株式会社 Method for providing answers to questions on the Internet
US20050216434A1 (en) * 2004-03-29 2005-09-29 Haveliwala Taher H Variable personalization of search results in a search engine
CN101116100A (en) * 2004-05-10 2008-01-30 Google公司 System and method for grading documents containing images
US7603350B1 (en) * 2006-05-09 2009-10-13 Google Inc. Search result ranking based on trust
TW200923807A (en) * 2007-11-23 2009-06-01 Inst Information Industry Method and system for searching knowledge owner in network community

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI739359B (en) * 2019-03-28 2021-09-11 南韓商韓領有限公司 Computer-implemented system, computer-implemented method for arranging hyperlinks on a graphical user-interface and non-transitory computer-readable storage medium
US11328328B2 (en) 2019-03-28 2022-05-10 Coupang Corp. Computer-implemented method for arranging hyperlinks on a grapical user-interface

Also Published As

Publication number Publication date
TW201140346A (en) 2011-11-16

Similar Documents

Publication Publication Date Title
CN102754094B (en) For the system and method for the content classification to user or user's generation
US11070643B2 (en) Discovering signature of electronic social networks
Yoo et al. Improving travel decision support satisfaction with smart tourism technologies: A framework of tourist elaboration likelihood and self-efficacy
Cheng et al. Social influence's impact on reader perceptions of online reviews
US10169708B2 (en) Determining trustworthiness and compatibility of a person
US8775354B2 (en) Evaluating an item based on user reputation information
US8725773B2 (en) System and method for generating a knowledge metric using qualitative internet data
US9558273B2 (en) System and method for generating influencer scores
US20140074560A1 (en) Advanced skill match and reputation management for workforces
EP2359276A1 (en) Ranking and selecting enitities based on calculated reputation or influence scores
KR20100015479A (en) Intentional Matching
Yan et al. Comparing digital libraries with virtual communities from the perspective of e-quality
Xu et al. Study partners recommendation for xMOOCs learners
US20170046346A1 (en) Method and System for Characterizing a User's Reputation
TWI501096B (en) Ranking user generated web content
Samimi et al. Creation of reliable relevance judgments in information retrieval systems evaluation experimentation through crowdsourcing: a review
Luo et al. Who have got answers? Growing the pool of answerers in a smart enterprise social QA system
Pu et al. Examining the influence of streamer source characteristics on viewers’ continuous viewing intentions in tourism live-streaming: A SEM and fsQCA approach
Silberzahn et al. Many hands make tight work: crowdsourcing research can balance discussions, validate findings and better inform policy.
US11068848B2 (en) Estimating effects of courses
Zhou et al. A new information theory-based serendipitous algorithm design
US20160092999A1 (en) Methods and systems for information exchange with a social network
Manzoor et al. Use of social networking sites for citizen journalism
Da Silva et al. Reflection on ResearchGate's Terminating ResearchGate Score, and Interest Score, as Social Media Altmetrics and Academic Evaluation Tools
Zhang et al. Tourism Experience Sharing of Long-term Living Chinese in South Korea: Case of Xiaohongshu App (RED)