
HK1142700A - Identifying and changing personal information - Google Patents

Identifying and changing personal information

Info

Publication number
HK1142700A
Authority
HK
Hong Kong
Prior art keywords
search result
user
search
data source
reputation
Prior art date
Application number
HK10108818.1A
Other languages
Chinese (zh)
Inventor
Raefer Gabriel
Brian Kelly
James Aubry
David Thompson
Michael Fertik
Original Assignee
Reputation.Com, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Reputation.Com, Inc.
Publication of HK1142700A

Description

Identifying and altering personal information
Cross-reference to related applications
The present application claims the benefit of priority from U.S. provisional application No. 60/898,899, entitled "Identifying and Correcting Personal Information," filed on January 31, 2007, which is hereby incorporated by reference in its entirety.
Technical Field
The present invention relates to methods, systems, and apparatus for facilitating identification of personal information, alteration and/or removal of such information, and generation of a subjective personal reputation score or rating based on the identified information.
Background
Since the early 1990s, the number of people using the world wide web and the internet has grown at a tremendous rate. As more users utilize services available on the internet by registering on websites, posting comments and information electronically, or simply interacting with companies that post information about others (e.g., online newspapers or social networking websites), more and more information about users is publicly available online. Naturally, individuals, organizations, and companies (e.g., professionals, parents, university applicants, job seekers, employers, charities, and businesses) have raised serious and justified concerns about the growing amount of information about them that is available over the internet, because online content about even the most transient internet user may be harmful or even spurious.
The process of rating a user in a variety of professional and/or personal contexts has become increasingly sensitive to the type and amount of information available on the internet about the user. A user may desire a simple way to assess whether someone he or she is interacting with has gained a reputation that is generally positive or negative, or that is positive or negative with respect to some particular aspect of that person's reputation. Exemplary interactions of a user with another user include, for example, initiating a romantic relationship, providing an employment or business opportunity, or engaging in a financial transaction. As the amount of information available online about a user increases, the process of screening all that information, assessing its relative importance, classifying it, and synthesizing it into a comprehensive assessment of the user's public, online reputation becomes more daunting.
Accordingly, a need exists for methods, apparatus, and systems that will allow parties to continue to use the internet while ensuring that information about them on the internet is not erroneous, libelous, scandalous, or otherwise damaging to their reputation or well-being. There is also a need for systems that will allow parties to quickly and broadly learn about the reputations of other individuals, groups, organizations, and/or companies based on information about them available on the internet.
Disclosure of Invention
Systems, apparatuses, and methods for analyzing information about a user are presented, the systems, apparatuses, and methods including: obtaining at least one search result from a data source based on at least one search term describing the user; receiving an indication of desirability of the at least one search result; and performing an action based on the desirability of the at least one search result.
Systems, apparatuses, and methods for determining a reputation score representing a reputation of a user are also presented, the systems, apparatuses, and methods including: collecting at least one search result from a data source; determining an effect of the at least one search result from the data source on the reputation of the user; and calculating a reputation score for the user based on the determined effect of the at least one search result from the data source on the reputation of the user.
Systems, apparatuses, and methods for analyzing information about a user are also presented, the systems, apparatuses, and methods including: obtaining at least one search result based on at least one search term describing the user; determining a relevance of the at least one search result; presenting the at least one search result to the user; receiving an indication of a relevance or desirability of a search result of the at least one search result from the user; and performing an action based on the desirability of the search result.
In some embodiments, these systems, devices, and methods may also include: determining an additional search term based on the at least one search result, and obtaining a search result using the additional search term. Determining additional search terms may be performed automatically and/or by an agent or the user.
In some implementations, an indication that the search result may be an undesired search result may be received. The performed action may cause the undesirable search results to be changed or removed at a data source from which the undesirable search results were obtained. The undesirable search results may include data about the user that may be incorrect or that may be detrimental to the reputation of the user. The actions performed may include: determining whether the undesirable search result can be changed or removed at a data source from which the undesirable search result was obtained, and if the undesirable search result can be changed or removed at the data source, causing the undesirable search result to be changed, corrected, or removed at the data source.
In some implementations, determining the relevance of the at least one search result can include: determining whether the at least one search result includes information associated with the user and/or ignoring a search result if the search result does not include information associated with the user. In some implementations, if the at least one search result does not include information associated with the user, an exclusionary search term may be added to a subsequent search, wherein the exclusionary search term may be designed to exclude the at least one search result that does not include information associated with the user.
In some implementations, obtaining at least one search result may be performed multiple times, and additional steps may be performed, such as: generating a search ranking system based on at least one search result from the multiple executions of the obtaining step, and ranking additional search results based on the search ranking system. Generating the search ranking system may be performed using a Bayesian network. The Bayesian network can utilize corpora of relevance-indicating and irrelevance-indicating tokens.
In some implementations, the at least one search result can be obtained periodically. The period of performing the obtaining step may be determined based on user characteristics or data source characteristics.
In some implementations, obtaining at least one search result may be performed multiple times and additional steps may be performed, such as: determining a signature for a currently obtained search result, comparing the signature to a previously obtained signature for a previously obtained search result, and determining a relevance of the search result when the currently obtained signature is different from the previously obtained signature.
In some embodiments, determining the relevance may comprise: presenting the at least one search result to an agent, obtaining an indication of a classification of the at least one search result from the agent, and automatically classifying the at least one search result based on the indication from the agent.
In some implementations, obtaining the at least one search result can include: at least one search result is received, such as from an agent or user, and its relevance is determined. Determining the relevance of the search results may include obtaining an indication of a classification of the at least one search result from, for example, the agent or user, and automatically classifying the at least one search result based on the indication from, for example, the agent or user.
Systems, methods, and apparatus are also presented that determine a reputation of a user by collecting data from a data source, determining an effect of the data from the data source on the reputation of the user, and determining a reputation score of the user based on the effect of the data from the data source on the reputation. In some embodiments, the systems, methods, and apparatus may further comprise presenting the reputation score to a third party, at the request of the user or at the request of the third party, to vouch that the user is reputable as indicated by the score. The data source may comprise, for example, a credit agency database, a criminal database, an insurance database, a social network database, and/or a news database.
In some implementations, determining the effect on reputation may include: classifying elements of at least one search result according to a mood and/or importance of the at least one search result, and basing the effect on reputation on the mood and/or importance classification. In some implementations, determining the effect on reputation may include associating elements of the at least one search result according to a positive-to-negative criterion, and basing the effect on reputation on the positive-to-negative association.
In some implementations, determining a reputation score for a user may include: determining at least one reputation sub-score for the user based on an effect on reputation of search results from the data source. A reputation sub-score may reflect any suitable reputation attribute, such as the user's reputation as an employee, an employer, a significant other prospect, a lawyer, or a possible parent.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description, serve to explain the principles of the invention. In these drawings:
FIG. 1 is a block diagram depicting an exemplary system for analyzing information about a user.
FIG. 2 is a flow diagram depicting a process of changing and/or removing harmful search results from a data source.
FIG. 3 is a flow chart depicting a process for categorizing search results.
FIG. 4 is a flow diagram depicting a process for determining whether a signature of a logged search result is the same as a signature of a previously logged search result.
FIG. 5 is a flow diagram depicting a process for indicating a categorization of search results.
FIG. 6 is a flow chart depicting a process for calculating a reputation score for a user.
Detailed description of the invention
Reference will now be made in detail to exemplary embodiments of the invention, examples of which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
FIG. 1 is a block diagram depicting an exemplary system 100 for analyzing information about a user. In the exemplary system 100, the search module 120 is coupled to the user information processing module 110, the data storage module 130, and the network 140. Search module 120 is also coupled to at least one data source, such as data sources 151, 152, and/or 153, via network 140 or via other coupling techniques (not shown). Data sources 151, 152, and/or 153 may be proprietary databases containing information about one or more users 161, 162, and/or 163. Exemplary data sources 151, 152, and/or 153 may be, for example, "blogs" or websites, such as social networking websites, news agency websites, private websites, or corporate websites. Exemplary data sources 151, 152, and/or 153 may also be cached information stored in a search database, such as the cached information maintained by Google™ or Yahoo!™. Exemplary data sources 151, 152, and/or 153 may further be, for example, criminal databases or lists, credit agency data sources, insurance databases, or any electronic or other source of information about users 161, 162, and/or 163. The system 100 may include any number of data sources 151, 152, and/or 153, and the system 100 may be used by any number of users, agents (human agents), and/or third parties.
One or more users 161, 162, and/or 163 may interact with the user information processing module 110 through, for example, a personal computer, personal data device, telephone, or other device (not shown) coupled to the user information processing module 110 via the network 140 or via other coupling technology.
One or more users 161, 162, and/or 163 may directly or indirectly provide information or search terms identifying the user to the user information processing module 110. The user information processing module 110 or the search module 120 may use the identification information or search terms to construct a search for information or search results about the user. The search module 120 may then search the data sources 151, 152, and/or 153 for information about the user by using the at least one search term. Search results about the user may be stored in the data storage module 130 and/or analyzed by the user information processing module 110. Specific embodiments of analyzing and storing data about a user are described with reference to fig. 2, 3, 4, 5, and 6.
Network 140 may be, for example, the internet, an intranet, a local area network, a wide area network, a campus area network, a metropolitan area network, an extranet, a private extranet, any set of two or more electronic devices coupled, or any combination of these or other suitable networks.
The coupling between modules or the coupling between modules and network 140 may include, but is not limited to, electrical connections, coaxial cables, copper wire, and fiber optics, including the wires that comprise network 140. The coupling may also take the form of acoustic or light waves, such as lasers and those generated during radio wave and infrared data communications. The coupling may also be accomplished by means of transmission of control information or data to other data devices through one or more networks.
Each of the above logic modules or functional modules may include a plurality of modules. These modules may be implemented separately or their functions may be combined with the functions of other modules. Further, each of these modules may be implemented on a single component, or these modules may be implemented as a combination of components. For example, each of user information processing module 110, search module 120, and data storage module 130 may be implemented by a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), a Complex Programmable Logic Device (CPLD), a Printed Circuit Board (PCB), a combination of programmable logic components and programmable interconnects, a single CPU chip, a CPU chip incorporated on a motherboard, a general purpose computer, or any other combination of devices or modules capable of performing the tasks of modules 110, 120, and/or 130. Data storage module 130 may include Random Access Memory (RAM), Read Only Memory (ROM), Programmable Read Only Memory (PROM), Field Programmable Read Only Memory (FPROM), or other dynamic storage devices for storing information and instructions to be used by user information processing module 110 or search module 120. The data storage module 130 may also include a database, one or more computer files in a directory structure, or any other suitable data storage mechanism, such as a memory.
FIG. 2 is a flow chart depicting a process for finding and changing and/or removing harmful search results for at least one user from a data source. In step 210, an instruction is received, for example, by a system or apparatus, for directing a search that includes at least one search term. These instructions may be received from, for example, the user directly, a third party, or an online data search service that the user or third party may be registered with. The instructions may also be received from a storage device.
A user or third party may register for an online data search service via, for example, a personal computer, a personal data device, or via a website. When registered, a user or third party may provide identifying information about themselves or another user that may be used by, for example, an information processing module or a search module, to construct a search related to the user or another user. In some implementations, the received instructions and/or the at least one search term may relate to, for example, a user, a group, an organization, or a company.
In step 220, the search module may obtain at least one search result based on the received instruction and/or the at least one search term. The search results may be obtained from a data source. The search results may be obtained via a publicly available search engine (e.g., Google™ Search or Yahoo!™ Search) or a private search engine (e.g., Westlaw™ Search or LexisNexis™ Search). Search results may also be obtained via a search Application Program Interface (API) or structured data exchange (e.g., extensible markup language). The search may be performed by using at least one search term provided or generated based on information provided by, for example, a user or a third party. In an exemplary search, a user may provide search terms, such as her hometown, city of residence, and alma mater, which may be used alone or in combination with one another as search terms for searching. The search results may be obtained automatically based on the instructions and/or the at least one search term or manually by a user, third party, or agent. The search results obtained in step 220 may be saved (step 230).
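By way of non-limiting illustration, the following Python sketch shows one possible way that identifying terms supplied by a user could be combined into search queries; the function name and query format are assumptions for illustration only.

from itertools import combinations

def build_queries(full_name, identifying_terms):
    """Combine a user's name with identifying terms, alone and in pairs."""
    queries = [f'"{full_name}"']
    for term in identifying_terms:
        queries.append(f'"{full_name}" {term}')
    for a, b in combinations(identifying_terms, 2):
        queries.append(f'"{full_name}" {a} {b}')
    return queries

# Example: hometown, city of residence, and alma mater as identifying terms.
print(build_queries("Jane Doe", ["Springfield", "Boston", "Harvard"]))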
Once the search results are obtained, the relevance of the search results may be determined, as described in step 240. For example, relevance may be automatically determined based on the number of times certain types of data or units of the search results appear in the search results. The relevance of a search result may be based on, for example, the data source from which it was obtained, the content of the search result, or the type of search result found. Additionally or alternatively, the relevance of the search results may be determined directly by, for example, an agent or user.
Determining the relevance of the obtained search results may include determining a propensity (mood) and/or importance of the search results. The propensity of a search result may include data related to the content of the search result and may relate to, for example, the emotional context of the search result or its data source, or the nature of statements in the search result. The determination and/or assignment of a search result's propensity may be based on its positive or negative impact on reputation. For example, different portions of a search result may have different tendencies based on the content of the search result. The tendencies and sub-tendencies may be assigned numerical values. Computing the effect of a search result's propensity or sub-propensities on reputation is discussed in further detail below.
Additionally or alternatively, determining the relevance of the obtained search results may include determining and/or assigning an importance of the search results or data sources. For example, importance may range from high to low. The importance of search results or data sources may be assigned a weight value such that search results or data sources that are more important are determined to have and/or assigned a higher importance when compared to search results or data sources that are less important. Search results or data sources may be determined and/or assigned, for example, high, medium, or low importance. The importance of a data source may be determined or assigned based on, for example, the number of inbound links to the data source, the number of search engines that will report inbound links to the data source, or a synthetic measure proportional to the number of inbound links to the data source. Exemplary high-importance data sources may include widely known websites, such as iTunes™ or major news sites.
The importance of a search result may be determined and/or assigned based on, for example, the ratio of references to the searched user's name to the total number of words in the search result, the presence of the user's name in the title of the search result, font designs or graphical elements surrounding the user's name, or the ranking of the user's name in a name query of the data source. The data sources may be assigned importance based on, for example, how often the data sources are accessed or how well known the data sources are. An exemplary high-importance search result may include a user's name that is mentioned prominently or repeatedly on the data source. Computing the effect of importance on reputation is discussed in further detail below.
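As an illustrative sketch only, the importance signals just described could be combined into a single weight; the particular weights and the normalization used below are assumptions, not values taken from any embodiment.

def importance_score(name, title, body, inbound_links):
    """Toy importance score from name-mention ratio, name-in-title, and inbound links."""
    body_l, name_l = body.lower(), name.lower()
    words = body_l.split()
    ratio = body_l.count(name_l) / max(len(words), 1)   # references to the name vs. total words
    in_title = 1.0 if name_l in title.lower() else 0.0  # presence of the name in the title
    link_weight = min(inbound_links / 1000.0, 1.0)      # proxy for data-source importance
    return 0.5 * ratio + 0.3 * in_title + 0.2 * link_weight

score = importance_score("Jane Doe", "Jane Doe wins award",
                         "Jane Doe was honored ... Jane Doe said ...", 250)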
In some implementations, step 240 can include generating a search result ranking system, and/or classifying search results based on the search result ranking system, examples of which are depicted in FIG. 3. In step 250, the search results may be output or displayed, for example, to a user, agent, or software program. The relevance of the search results may also be output or displayed to, for example, a user, agent, or software program. The search results and/or their relevancy may be output or displayed via email, fax, web page, or in any other suitable manner. The search results and/or their relevance may be displayed, for example, as a copy of the original search results, as a link to the search results, as a screenshot of the search results, or as any other suitable representation.
In step 260, additional search terms may be identified for the search. If additional search terms are desired, they may be used to obtain additional search results (step 270). For example, if a search for the user's name reveals a city in which the user works, the name of the city may be added to the search terms for at least one future search. As a further example, if a new alias or username of the user is found, it may be used as an additional search term for searching. Further, it may be determined whether the search results are related to the same user. If the search results are related to the same user, the search terms may be added as described above. If a search term is related to a different user or otherwise not related to the user, then an exclusionary search term may be added to the search terms for the search. For example, if the user is named George Washington, it may be appropriate to add exclusionary terms as part of step 270 to ensure that search results related to "George Washington University," "President George Washington," or "George Washington Carver" are not returned.
The additional search terms used for the search may be determined by any suitable method. For example, search results may be presented to the user and the user may select additional search terms. Alternatively, the agent may review the search results and provide additional search terms. Additional search terms may also be automatically determined, such as by a search module, user information processing module, or agent. The automatic determination of additional search terms may be based on any suitable calculation or analysis. For example, if a particular search term is typically present in a previous search related to the user, the particular search term may be used as an additional search term for a new search.
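The following Python sketch illustrates one way additional and exclusionary terms could be folded into subsequent queries, using the minus-prefix exclusion syntax common to many search engines; the helper name and example terms are assumptions for illustration.

def refine_query(base_terms, additional_terms=(), exclusionary_terms=()):
    """Build a query string from base terms, additional terms, and exclusions."""
    parts = list(base_terms) + list(additional_terms)
    parts += [f'-"{t}"' for t in exclusionary_terms]  # minus-prefixed exclusions
    return " ".join(parts)

query = refine_query(['"George Washington"', "Seattle"],
                     additional_terms=["architect"],
                     exclusionary_terms=["George Washington University",
                                         "George Washington Carver"])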
In step 280, the harmful search results may be flagged. Tagging of search results may be performed electronically, such as by a user, agent, or computer software program, via, for example, a web interface, email, mail, or fax to an agent. The search results may be flagged by placing an appropriate flag, such as in a data storage module, or otherwise indicating that the search results are to be removed or changed.
In step 290, the flagged search results may be removed and/or changed as appropriate. The user may request that all information about her in the search results be marked and changed and/or removed or that only specific information within the search results be changed or removed. The removal or change of flagged results may be accomplished via an API for the relevant data source. For example, a structured data source may have an API that allows data to be changed or removed from the data source. The search module or other suitable module may use the API of the data source to indicate to the data source that information for the user is to be removed or changed. Flagged results may also be removed or changed when a user and/or agent calls, emails, mails, or otherwise contacts an agent responsible for changing or removing information from a data source. In some cases, step 290 may include an agent, such as a lawyer, drafting a letter on behalf of the user to persuade an agent responsible for the data source to change or remove data related to the user. In other cases, step 290 may include initiating civil or criminal lawsuits against the agents or companies responsible for the data sources, such that the judicial authority may force the agents or companies responsible for the data sources to change or remove data related to the user.
In some embodiments, steps 220-270 may be performed at regular, irregular, or random intervals. For example, steps 220-270 may be performed hourly, daily, or at any suitable interval. For some users, steps 220-270 may be performed more often than for other users based on user characteristics such as the likelihood of updates, time zone of residence, user preferences, and the like. Further, for some data sources, steps 220-270 may be performed more often than for other data sources. For example, if it is known that a social networking site is updated more often than a corporate website, then steps 220-270 may be performed more often for the social networking site than for the corporate website.
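A minimal sketch of such per-data-source scheduling follows; the interval values and source names are assumptions for illustration only.

import time

# Assumed polling intervals, in seconds: a frequently updated social networking
# site is searched more often than a mostly static corporate website.
POLL_INTERVALS = {"social_network_site": 3600, "corporate_site": 86400}

def due_sources(last_run, now=None):
    """Return the data sources whose polling interval has elapsed."""
    now = time.time() if now is None else now
    return [source for source, interval in POLL_INTERVALS.items()
            if now - last_run.get(source, 0.0) >= interval]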
FIG. 3 is a flow chart depicting a process for categorizing search results. In step 310, the relevance of the obtained search results is determined and/or indicated-either automatically or through human intervention as described above. In step 320, a search result ranking system may be generated. The search result ranking system may rank the search results based on one or more considerations, such as their relevance, propensity, or importance, the age of the results, how damaging, beneficial, or harmless the results are to the user, or any other suitable ranking tool. In step 330, the search results may be categorized based on their ranking in the search result ranking system. The order in which the search results are categorized may define how the search results are displayed. For example, search results may be categorized such that the most recent and/or most harmful search result is displayed first, followed by the next most recent and/or most harmful search result.
In some embodiments, steps 320 and 330 may be performed using a neural network, a Bayesian classifier, or any other suitable tool for generating a search ranking system. If a Bayesian classifier is used, it can be constructed using, for example, human agent and/or user input. In some implementations, the agent and/or user can indicate that a search result is "relevant" or "irrelevant." Each time a search result is marked as "relevant" or "irrelevant," the tokens from the search result can be added to an appropriate corpus of data, such as a "relevant indication" corpus or an "irrelevant indication" corpus. Prior to collecting data for a search, a Bayesian network can be seeded, for example, with words collected from the user (e.g., hometown, occupation, gender, etc.) or another source. After classifying a search result as relevant or irrelevant, tokens (e.g., words or phrases) in the search result may be added to the corresponding corpus. In some implementations, only a portion of the search results may be added to the corresponding corpus. For example, common words or tokens such as "a" and "the" may not be added to the corpus.
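The corpus maintenance described above could be sketched as follows; the whitespace tokenization and the stop-word list are simplifying assumptions.

from collections import Counter

STOP_WORDS = {"a", "the", "and", "of"}  # common tokens not added to either corpus
relevant_corpus, irrelevant_corpus = Counter(), Counter()

def record_classification(search_result_text, is_relevant):
    """Add a classified search result's tokens to the matching corpus."""
    tokens = [t for t in search_result_text.lower().split() if t not in STOP_WORDS]
    (relevant_corpus if is_relevant else irrelevant_corpus).update(tokens)

# Seeding with words collected from the user (e.g., hometown, occupation).
record_classification("Harvard software engineer Boston", is_relevant=True)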
As part of maintaining a Bayesian classifier, a hash table of tokens may be generated based on the number of occurrences of each token in the corpora. Furthermore, for a token in either or both corpora, a "conditional probability" hash table may also be generated to indicate the conditional probability that a search result containing the token is a relevant or irrelevant indication. The conditional probability that a search result is relevant or irrelevant may be determined based on any suitable calculation, which in turn may be based on the number of occurrences of the token in the "relevant indication" corpus and the "irrelevant indication" corpus. For example, the conditional probability that a token is irrelevant to the user can be defined by the following equation:
prob = max(MIN_RELEVANT_PROB, min(MAX_IRRELEVANT_PROB, irrelevantProb/total)),
wherein:
MIN_RELEVANT_PROB = 0.01 (lower threshold of the probability),
MAX_IRRELEVANT_PROB = 0.99 (upper threshold of the probability),
r = RELEVANT_BIAS × (the number of occurrences of the token in the "relevant indication" corpus),
i = IRRELEVANT_BIAS × (the number of occurrences of the token in the "irrelevant indication" corpus),
RELEVANT_BIAS = 2.0,
IRRELEVANT_BIAS = 1.0 (in some embodiments, relevance-indicating words should be biased so that errors fall on the side of treating a token as relevant rather than irrelevant, which is why the relevant bias may be larger than the irrelevant bias),
nrel = the total number of items in the "relevant indication" corpus,
nirrel = the total number of items in the "irrelevant indication" corpus,
relevantProb = min(1.0, r/nrel),
irrelevantProb = min(1.0, i/nirrel), and
total = relevantProb + irrelevantProb.
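Transcribed into Python, the equation above reads as follows; this is a sketch only, and the handling of an empty corpus is an added assumption.

MIN_RELEVANT_PROB = 0.01
MAX_IRRELEVANT_PROB = 0.99
RELEVANT_BIAS = 2.0
IRRELEVANT_BIAS = 1.0

def irrelevance_probability(relevant_count, irrelevant_count, nrel, nirrel):
    """Conditional probability that a token indicates an irrelevant search result."""
    r = RELEVANT_BIAS * relevant_count      # occurrences in the "relevant indication" corpus
    i = IRRELEVANT_BIAS * irrelevant_count  # occurrences in the "irrelevant indication" corpus
    relevant_prob = min(1.0, r / nrel) if nrel else 0.0
    irrelevant_prob = min(1.0, i / nirrel) if nirrel else 0.0
    total = relevant_prob + irrelevant_prob
    if total == 0.0:
        return None  # not enough data to estimate a probability
    return max(MIN_RELEVANT_PROB, min(MAX_IRRELEVANT_PROB, irrelevant_prob / total))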
In some embodiments, if the "relevant indication" corpus and the "irrelevant indication" corpus are seeded and a particular token is given a default conditional probability of irrelevance, then the conditional probability calculated as above may be averaged with the default value. For example, if the user specifies that he attended Harvard University, the token "Harvard" may be seeded as a relevant indication and the stored conditional probability for the token "Harvard" may be 0.01 (only a 1% likelihood of irrelevance). In this case, the conditional probability calculated as above may be averaged with the default value of 0.01.
In some embodiments, if a token occurs fewer than some threshold number of times in either corpus or in the two corpora combined, then the conditional probability that the token is an irrelevant indication may not be calculated. When the relevance of a search result is indicated, the conditional probability that a token is an irrelevant indication may be updated based on the most recently indicated search result, e.g., as part of step 320.
When new search results are obtained, the content of each search result may be decomposed into at least one token. The probability that each token is a relevant indication and/or an irrelevant indication may then be determined based on, for example, a ranking system. The largest probabilities of relevant indication and/or irrelevant indication among the tokens may then be used to calculate a Bayesian probability. For example, if the largest N probabilities are placed in an array called "probs," then a Bayesian combined probability can be calculated based on naive Bayes classifier rules, for example as sketched below.
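One common form of the naive Bayes combination of the largest N per-token probabilities is sketched below; the exact combination formula used in any given embodiment is an assumption here.

import math

def combined_probability(probs):
    """Naive Bayes combination: prod(p) / (prod(p) + prod(1 - p)), computed in log space."""
    log_p = sum(math.log(p) for p in probs)
    log_q = sum(math.log(1.0 - p) for p in probs)
    return 1.0 / (1.0 + math.exp(log_q - log_p))

# Largest N per-token irrelevance probabilities placed in "probs".
print(combined_probability([0.9, 0.8, 0.7]))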
the results of the search may be categorized by the probability that each search result is relevant and/or irrelevant.
The Bayesian probability computed above may represent the probability that a search result is "relevant" and/or "irrelevant." This is simply a formula that repeatedly applies Bayes' theorem. Other formulas may also be used to calculate the conditional probability based on the unconditional probability, such as one or more of the formulas described in "Bayes' Theorem in Statistics" and "Bayes' Theorem in Statistics (Reexamined)," sections 3-5 and 4-4 in Papoulis, A., Probability, Random Variables, and Stochastic Processes, 2nd ed. (New York: McGraw-Hill, 1984), pp. 38-39, 78-81, and 112-114 (hereafter "Papoulis 1984"). An exemplary alternative form of Bayes' theorem, described at pages 38-39 of Papoulis 1984, may also be used to calculate the probability that a search result is "relevant" and/or "irrelevant." Similar processes may be used to determine the relevance and/or the propensity and/or importance of search results or data sources.
FIG. 4 is a flow chart depicting a process for determining whether a signature recorded for a current search result is the same as a previously recorded signature. The signature of a search result may be, for example, a hash of the associated web page, an abbreviated form of the search result or of information from the search result, a hash of the search result, or another computation based on the content of the search result. For example, the hash may be based on the complete search result or on a portion of the search result, such as the portion of the search result surrounding at least one search term. The search result may include, for example, a website or web pages within a website. The signature recorded for the search result may include information identifying the search result, such as a Uniform Resource Locator (URL) of the search result, a classification of the type of search result, and/or a signature of the website. A signature may first be determined for a currently obtained search result. In step 420, the signature of the search result may then be compared to a previously obtained signature of a previously obtained search result.
In step 430, it may be determined whether the signature of the current search result is the same as the signature of a previously obtained search result. If the current search result is the same as the previously obtained search result, the current search result may not be further analyzed and the process depicted in FIG. 4 may end. If the signature of the currently obtained search result is different from the signature of the previously obtained search result, the content of the current search result may be further analyzed (step 440). For example, if a social networking site includes information about a user and searches the site on a daily basis on behalf of the user, then the signature of the most recently obtained search result (e.g., a hash of the associated web page) may be compared to the signatures of previously obtained search results. If the two signatures are the same, the content of the search results has not been changed and there is no need to further analyze the most recently obtained search results, at least until the source is searched the next time.
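A minimal sketch of the signature comparison follows; the choice of an SHA-256 hash over the page content is an assumption for illustration.

import hashlib

def page_signature(url, page_content):
    """Signature for a search result: its URL plus a hash of the page content."""
    digest = hashlib.sha256(page_content.encode("utf-8")).hexdigest()
    return url, digest

def content_changed(signature, previous_signatures):
    """True if the content hash for this URL differs from the recorded one."""
    url, digest = signature
    return previous_signatures.get(url) != digest

previous = {"http://example.com/profile": "abc123"}
sig = page_signature("http://example.com/profile", "<html>new content</html>")
if content_changed(sig, previous):
    pass  # analyze the result further, then record the new signature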
FIG. 5 is a flow chart depicting a process for indicating a classification of search results. In step 510, the search results may be presented, for example, to a user and/or agent via a graphical user interface such as a web interface, a computer program, or any other suitable tool. The displayed search results may be obtained via any suitable tool. For example, when a search is performed via one or more public search engines (e.g., Google™ or Yahoo!™), private search systems (e.g., LexisNexis™ or Westlaw™), or any other data source, the results of the search may be displayed to, for example, a user and/or agent.
In step 520, the search results may be identified, for example, by an agent or user. In step 525, a classification of the search results may be determined. The classification may be determined by, for example, a human agent, a user, or a Bayesian classifier. Exemplary classifications include: relevance to the user, how damaging the results are to the user, or the source type of the search results (social networking site, news database, etc.). Search results may be classified based on, for example, the judgment of the agent or user, standard rules (e.g., any page containing expletives that involve the user may be marked as harmful), or rules specific to the user (e.g., the user may request that all references to her previous work be marked as harmful).
In step 530, the classification of the search results may be indicated to the appropriate system or module. For example, if the agent is searching for information about the user using a web browser and determines that a search result should be classified as harmful, the agent may "click" on a "bookmarking tool" using her computer mouse to indicate that the search result may be harmful. The classification may be indicated via, for example, a "bookmarking tool," a programmable button, a user interface element, or any other suitable tool. The bookmark or programmable button may be a computer program that runs at least partially as part of a web browser, or may be a computer program coupled to a web browser. The bookmarking tool may be a graphical button that, when clicked, may cause a script or program to execute, which may send an indication to the user information processing module, server module, or any other suitable module that the search result is to be flagged. When the user interface element is selected, it may cause an action to be performed, which may indicate that the viewed search result is to be flagged. The search results and one or more tags associated with the search results may be stored in a data storage module. The indicated classification may be used in part to determine the relevance of the search results, or may be shown when the search results are displayed.
FIG. 6 is a flow chart depicting a process for calculating a reputation score. A reputation score may represent, for example, the reputation of the user generally, or as an employee, an employer, a significant other prospect, a potential parent, or along any other suitable dimension or consideration. Further, a reputation score may include one or more reputation sub-scores, which may be based on sub-elements of the user's reputation, such as a particular domain of material or type of interaction. For example, a person's overall reputation score may include sub-scores for that person's reputation as an individual, a business partner, an employee, an employer, a significant other prospect, and a possible parent. Reputation scores may be based on other scores and information, such as a credit score, an eBay seller score, a karma value on the Slashdot™ website, or any other suitable building block.
The steps in FIG. 6 may be performed to determine a single reputation score, multiple types of reputation scores, or one or more reputation sub-scores, any of which may be combined to calculate an aggregate reputation score. A reputation score is a tool that reduces, for example, the effect of search results about the user on the user's reputation to a simple summary score, rating, or any other suitable measure. The reputation score may allow, for example, a user or agent to focus on the primary online items that may influence the reputation score. The reputation score also allows, for example, a user or agent to track changes to the data, to the signatures of search results, and/or to the relevance of search results.
In step 610, the search results may be aggregated. The aggregated search results may be any data from any data source that is relevant to, for example, the user or a third party. The aggregated search results may be data obtained via the processes of fig. 2, 3, 4, and/or 5. Aggregated search results may also be data collected via other tools or may be submitted directly by, for example, a user or agent.
In step 620, the aggregated search results may be analyzed to determine their impact on reputation. This determination may be manual or automatic. For example, the agent or user may mark a search result, or a segment of a search result, from the aggregated search results as damaging or beneficial to some aspect of the user's reputation. The agent or user may then indicate, along one or more spectra, how the search results affect the reputation score.
Determining the effect of aggregated search results on a reputation score may be performed by analyzing the search results and indicating an effect based on, for example, the content, propensity, or importance of the search results. In some cases, this determination and/or its indication is automatic. For example, if the reputation of a user as an employer is being determined and the aggregated search results include a web log entry reviewing the user on a website designated as a place for posting information about "bad bosses," an indication may be automatically generated that the web log entry may be detrimental to the reputation of the user as an employer.
In some implementations, the system can determine whether a search result positively or negatively impacts the reputation of the user by determining whether any of the tokens surrounding a relevance-indicating token are "positive" according to context or "negative" according to context. The set of surrounding tokens may be defined as the set of tokens within N tokens of a relevance-indicating token, where N is any positive integer. In some implementations, the set of surrounding tokens may be defined as all tokens in the search result, or may be defined in any other suitable manner. The system can determine whether surrounding tokens are contextually positive by looking them up in a table or database of contextually positive tokens. A parallel procedure may be used to identify tokens that are contextually negative. For example, a search result that relates to the user and contains expletives within N tokens of a relevance-indicating token may be automatically classified as detrimental to the user's reputation score.
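A sketch of the surrounding-token check follows; the example lexicons and the window size are assumptions for illustration only.

POSITIVE_TOKENS = {"award", "praised", "charity"}     # assumed example lexicon
NEGATIVE_TOKENS = {"fraud", "arrested", "expletive"}  # assumed example lexicon

def window_sentiment(tokens, relevant_indices, n=5):
    """Count positive and negative tokens within n tokens of each relevance indicator."""
    positive = negative = 0
    for idx in relevant_indices:
        window = tokens[max(0, idx - n): idx + n + 1]
        positive += sum(1 for t in window if t in POSITIVE_TOKENS)
        negative += sum(1 for t in window if t in NEGATIVE_TOKENS)
    return positive, negative

tokens = "jane doe was praised for her charity work".split()
print(window_sentiment(tokens, relevant_indices=[1]))  # index of the name token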
Further, a reputation score may be calculated based in part on any contextually positive, contextually negative, and/or propensity-indicating tokens found in the set of tokens surrounding a relevance-indicating token. A token that is contextually negative, or that indicates a bad propensity, may adversely affect or otherwise numerically lower the user's reputation score, while a token that is contextually positive, or that indicates a good propensity, may numerically increase or otherwise improve the reputation score. In some embodiments, contextually positive and/or contextually negative tokens may have a numerical weight or multiplier associated with them. Likewise, numerical weights or multipliers can be associated with tokens based on their relevance and/or importance. More heavily weighted tokens may have a greater impact on the reputation score of the user. Some positive and negative content determinations may also be user-specific. For example, a web log entry that mentions the user and discusses a party may be more detrimental to the reputation score of an educator than to that of a college student. Step 620 may also include automatically determining, and/or determining by one or more users or agents, the effect of the search results on the reputation score.
In step 630, a reputation score may be calculated. The reputation score may be based on any suitable calculation. For example, the reputation score may be the number of positive references in the aggregated search results minus the number of negative references. The reputation score may also be a weighted sum or average of the effects of the aggregated search results on the reputation of the user. Additionally or alternatively, the reputation score may be a sum or weighted average of reputation sub-scores, which may be calculated as described above.
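The weighted-average variant of the calculation could be sketched as follows; the effect values and weights shown are assumptions for illustration.

def reputation_score(result_effects):
    """Weighted average of per-result effects, where each item is (effect, weight):
    effect in [-1, 1] and weight reflecting relevance and importance."""
    total_weight = sum(weight for _, weight in result_effects) or 1.0
    return sum(effect * weight for effect, weight in result_effects) / total_weight

# (effect, weight) pairs derived from the aggregated search results.
score = reputation_score([(1.0, 0.6), (-0.5, 0.9), (0.2, 0.3)])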
Once the reputation score is calculated, it may be reported to the requesting party, as described in step 640. For example, if a potential employee wishes to learn the reputation of an employer, the potential employee may request a report of the employer's reputation score. Reputation scores may also be reported to the user.
In some embodiments, the reputation score may be reported to third parties upon request of the user, and according to each of the embodiments herein, the party that calculates and presents the reputation score may "vouch for" the user when presenting the user's reputation score. For example, if a user is attempting to become a roommate of another person and the user's reputation score is reported to the other person by a reputation reporting company, then the reputation reporting company will vouch that the user is reputable as indicated by the user reputation score.
The steps depicted in the exemplary flow diagrams of fig. 2, 3, 4, 5, and 6 may be performed by user information processing module 110, search module 120, or any other suitable module, device, apparatus, or system. Further, some of these steps may be performed by one module, apparatus, device, or system, while other steps may be performed by one or more other modules, apparatuses, devices, or systems. Moreover, in some embodiments, the steps in fig. 2, 3, 4, 5, and 6 may be performed in a different order and/or with fewer or more steps than depicted in these figures or described herein.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit being indicated by the following claims.

Claims (42)

1. A method for analyzing information about a user, comprising the steps of:
obtaining at least one search result from a data source based on at least one search term describing the user;
receiving an indication of desirability of the at least one search result; and
performing an action based on the desirability of the at least one search result.
2. The method of claim 1, further comprising the steps of:
determining additional search terms based on the at least one search result; and
using the additional search terms in the obtaining step.
3. The method of claim 2, wherein determining the additional search term is performed automatically.
4. The method of claim 2, wherein determining the additional search terms is performed by an agent or the user.
5. The method of claim 1, wherein an indication that the at least one search result is an undesired search result is received in the receiving step, and wherein the action performed comprises:
causing the undesired search results to be removed or changed at the data source from which the undesired search results were obtained.
6. The method of claim 1, wherein an indication that the at least one search result is an undesired search result is received in the receiving step, and wherein the action performed comprises:
determining whether the undesirable search results may be changed at the data source from which the undesirable search results were obtained; and
based on a determination that the undesirable search result may be changed at the data source, causing the undesirable search result to be changed at the data source.
7. The method of claim 1, wherein an indication that the at least one search result is an undesired search result is received in the receiving step, and wherein the action performed comprises:
determining whether the undesirable search results can be removed from the data source from which the undesirable search results were obtained; and
causing the undesired search results to be removed from the data source based on a determination that the undesired search results are removable from the data source.
8. The method of claim 5, wherein the undesirable search results include incorrect data about the user.
9. The method of claim 5, wherein the undesired search results include data that is detrimental to the reputation of the user.
10. The method of claim 1, further comprising:
determining a relevance of the at least one search result.
11. The method of claim 10, wherein the step of determining the relevance of the at least one search result comprises: determining whether the at least one search result includes information associated with the user.
12. The method of claim 11, further comprising:
if the at least one search result does not include information associated with the user, ignoring the at least one search result.
13. The method of claim 11, further comprising:
determining an exclusionary search term when the at least one search result does not include information associated with the user, wherein the exclusionary search term excludes search results that do not include information associated with the user; and
obtaining at least one search result of the user using the exclusionary search term.
14. The method of claim 1, further comprising:
obtaining at least one additional search result;
generating a search ranking system based on the at least one search result or the at least one additional search result; and
classifying the search results based on the search ranking system.
15. The method of claim 14, wherein a bayesian network is used to generate the search ranking system.
16. The method of claim 15, wherein the Bayesian network utilizes corpora of relevance-indicating and irrelevance-indicating tokens.
17. The method of claim 1, further comprising:
determining a signature for the at least one search result;
comparing the signature to previously obtained signatures for previously obtained search results; and
determining the relevance of the at least one search result when the signature and the previously obtained signature are not identical.
18. The method of claim 1, wherein the step of obtaining at least one search result is performed periodically.
19. The method of claim 18, wherein a period of performing the obtaining step is based on a user characteristic.
20. The method of claim 18, wherein a period of performing the obtaining step is based on data source characteristics.
21. The method of claim 1, further comprising:
presenting the at least one search result to an agent;
obtaining an indication of a classification of the at least one search result from the agent; and
automatically categorizing the at least one search result based on the indication from the agent.
22. The method of claim 1, wherein the step of obtaining at least one search result comprises receiving at least one search result from an agent, and the method further comprises:
obtaining an indication of a classification of the at least one search result from the agent; and
automatically categorizing the at least one search result based on the indication from the agent.
23. The method of claim 1, further comprising:
receiving an indication of the relevance of the at least one search result from the user or agent.
24. The method of claim 1, wherein the step of obtaining the at least one search result comprises receiving at least one search result from an agent, and the method further comprises:
obtaining an indication of a classification of the at least one search result from the agent; and
automatically categorizing the at least one search result based on the indication from the agent.
25. A method for determining a reputation score representing a reputation of a user, comprising:
collecting at least one search result from a data source;
determining an effect of the at least one search result from the data source on the reputation of the user; and
calculating a reputation score for the user based on the determined effect of the at least one search result from the data source on the reputation of the user.
26. The method of claim 25, further comprising:
presenting the reputation score to a third party at the user's request to vouch that the user is reputable as indicated by the score.
27. The method of claim 25, further comprising:
presenting the reputation score to a third party at the request of the third party to vouch that the user is reputable as indicated by the score.
28. The method of claim 25, wherein the data source is one of a credit agency database, a criminal database, an insurance database, a social network database, or a news database.
29. The method of claim 25, further comprising:
classifying at least one search result as positive or negative; and
determining an effect of the at least one search result on the reputation of the user based on a positive classification or a negative classification of the at least one search result.
30. The method of claim 25, further comprising:
associating at least one search result according to a positive-to-negative criterion; and
determining an impact of the at least one search result on the reputation of the user based on a positive to negative association of the at least one search result.
31. The method of claim 25, further comprising:
calculating at least one reputation sub-score for the user based on an effect of the at least one search result from the data source on the reputation of the user.
32. The method of claim 25, further comprising:
determining a numerical value representing the relevance of the at least one search result.
33. The method of claim 32, further comprising:
associating a numerical weight with the numerical value for the relevance of the at least one search result; and
calculating the reputation score using the numerical value and the numerical weight of the relevance of the search result.
34. The method of claim 25, further comprising:
determining a numerical value representing a propensity of the at least one search result.
35. The method of claim 34, further comprising:
associating a numerical weight with the numerical value representing the propensity of the at least one search result; and
calculating the reputation score using the numerical value of the propensity of the search result and the numerical weight.
36. The method of claim 25, further comprising:
determining a numerical value representing the importance of the data source.
37. The method of claim 36, further comprising:
associating a numerical weight with the numerical value representing the importance of the data source; and
calculating the reputation score using the numerical value of the importance of the data source and the numerical weight.
38. The method of claim 36, wherein the importance of the data source is based on an evaluation of the data source.
39. An apparatus for analyzing information about a user, comprising:
at least one module programmed to:
obtaining at least one search result from a data source based on at least one search term describing the user;
presenting the at least one search result to the user;
receiving an indication of desirability of the at least one search result from the user; and
performing an action based on the desirability of the at least one search result.
40. An apparatus for determining a reputation of a user, comprising at least one module programmed to:
obtaining at least one search result from a data source based on at least one search term describing the user;
determining an effect of the at least one search result from the data source on the reputation of the user; and
calculating a reputation score for the user based on the determined effect of the at least one search result from the data source on the reputation of the user.
41. A system for analyzing information about a user, the system comprising at least one module configured to perform the steps of:
obtaining at least one search result from a data source based on at least one search term describing the user;
presenting the at least one search result to the user;
receiving an indication of desirability of the at least one search result; and
performing an action based on the desirability of the at least one search result.
42. A system for determining a reputation score representing a reputation of a user, the system comprising at least one module configured to perform the steps of:
obtaining at least one search result from a data source based on at least one search term describing the user;
determining an effect of the search results from the data source on the reputation of the user; and
calculating a reputation score for the user based on the determined effect of the search results from the data source on the reputation of the user.
HK10108818.1A 2007-01-31 2008-01-30 Identifying and changing personal information HK1142700A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US60/898,899 2007-01-31

Publications (1)

Publication Number Publication Date
HK1142700A true HK1142700A (en) 2010-12-10


Similar Documents

Publication Publication Date Title
US8027975B2 (en) Identifying and changing personal information
US12061655B2 (en) Graphical user interface for presentation of events
Olteanu et al. Web credibility: Features exploration and credibility prediction
US8024333B1 (en) System and method for providing information navigation and filtration
US9213961B2 (en) Systems and methods for generating social index scores for key term analysis and comparisons
US20140297403A1 (en) Social Analytics System and Method for Analyzing Conversations in Social Media
US20130246440A1 (en) Processing a content item with regard to an event and a location
US20130007014A1 (en) Systems and methods for determining visibility and reputation of a user on the internet
KR20100084510A (en) Identifying information related to a particular entity from electronic sources
WO2007140364A2 (en) Method for scoring changes to a webpage
EP3423927A1 (en) Domain-specific negative media search techniques
US10311072B2 (en) System and method for metadata transfer among search entities
US10380121B2 (en) System and method for query temporality analysis
Robertson et al. Engagement outweighs exposure to partisan and unreliable news within Google Search
Gezici Quantifying political bias in news articles
US11113299B2 (en) System and method for metadata transfer among search entities
KR101987301B1 (en) Sensibility level yielding system through web data Analysis associated with a stock and a social data and Controlling Method for the Same
HK1142700A (en) Identifying and changing personal information
CN115600074A (en) Event influence calculation method