HK1230300B - Systems and methods for analyzing and synthesizing complex knowledge representations
- Publication number
- HK1230300B, HK17103671.1A
- Authority
- HK
- Hong Kong
- Prior art keywords
- concept
- complex
- knowledge representation
- basic
- concepts
Description
Divisional Application Statement
This application is a divisional of Chinese invention patent application No. 201180031005.5, entitled "Systems and Methods for Analyzing and Synthesizing Complex Knowledge Representations", which has an international filing date of June 22, 2011 and entered the Chinese national phase on December 21, 2012.
Cross-Reference to Related Applications
This application claims priority to U.S. Provisional Application No. 61/357,266, filed June 22, 2010, entitled "Systems and Methods for Analyzing and Synthesizing Complex Knowledge Representations", which is hereby incorporated by reference in its entirety. This application also builds upon concepts disclosed in multiple prior applications filed by the same applicant and/or assignee, including the following, to which the reader is referred as background in addition to the background discussed below: U.S. patent application Ser. No. 13/161,165, filed June 15, 2011, entitled "Systems and Methods for Analyzing and Synthesizing Complex Knowledge Representations"; U.S. patent application Ser. No. 12/477,977, filed June 4, 2009, entitled "System, Method and Computer Program for Transforming an Existing Complex Data Structure to Another Complex Data Structure"; U.S. patent application Ser. No. 11/469,258, filed August 31, 2006, entitled "Complex-Adaptive System for Providing a Faceted Classification", now U.S. Patent No. 7,596,574; and U.S. patent application Ser. No. 11/392,937, filed March 30, 2006, entitled "System, Method, and Computer Program for Constructing and Managing Dimensional Information Structures".
Background Art
Broadly speaking, knowledge representation is the activity of making abstract knowledge explicit as concrete data structures, in order to support machine-based storage, management, and reasoning systems. Conventional methods and systems exist for utilizing knowledge representations (KRs) constructed according to various types of knowledge representation models, including structured controlled vocabularies such as taxonomies, thesauri, and faceted classifications; formal specifications such as semantic networks and ontologies; and unstructured forms such as natural language documents.
A taxonomy is a KR structure that organizes categories into a hierarchical tree and associates categories with related objects, such as physical items, documents, or other digital content. Categories or concepts in a taxonomy are typically organized in terms of inheritance relationships, also known as supertype-subtype relationships, generalization-specialization relationships, or parent-child relationships. In such a relationship, the child category or concept has the same properties, behaviors, and constraints as its parent, plus one or more additional properties, behaviors, or constraints. For example, the knowledge statement "a dog is a mammal" can be encoded in a taxonomy by concepts/categories labeled "mammal" and "dog" linked by a parent-child hierarchical relationship. Such a representation encodes the knowledge that a dog (child concept) is a type of mammal (parent concept), but that not every mammal is necessarily a dog.
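The inheritance behavior described above can be sketched in a few lines of Python. This is only an illustration, not a structure taken from the application: the `Category` class, its method names, and the example properties ("warm-blooded", "has fur", "barks") are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    """A taxonomy node: inherits its parent's properties and may add its own."""
    label: str
    parent: "Category | None" = None
    own_properties: set[str] = field(default_factory=set)

    def properties(self) -> set[str]:
        # A child has everything its ancestors have, plus its own additions.
        inherited = self.parent.properties() if self.parent else set()
        return inherited | self.own_properties

    def is_a(self, other: "Category") -> bool:
        # Walk up the parent chain looking for `other`.
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


# Illustrative properties only; they are not part of the source text.
mammal = Category("mammal", own_properties={"warm-blooded", "has fur"})
dog = Category("dog", parent=mammal, own_properties={"barks"})

print(dog.is_a(mammal))   # a dog is a mammal
print(mammal.is_a(dog))   # but not every mammal is a dog
print(sorted(dog.properties()))
```

The one-directional `is_a` check mirrors the asymmetry in the text: the child is a kind of the parent, while the reverse does not hold.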
A thesaurus is a KR that represents terms (such as search keywords for information retrieval), which are often encoded as single-word noun concepts. The links between terms/concepts in a thesaurus are generally divided into three types of relationships: hierarchical relationships, equivalence relationships, and associative relationships. Hierarchical relationships link terms that are narrower and broader in scope than each other, similar to the relationships between concepts in a taxonomy. Continuing the previous example, "dog" and "mammal" are terms linked by a hierarchical relationship. Equivalence relationships link terms that can be substituted for each other as search terms, such as synonyms or near-synonyms. For example, the terms "dog" and "canine" can be linked by an equivalence relationship in some contexts. Associative relationships link related terms whose relationship is neither hierarchical nor equivalent. For example, a user searching for the term "dog" may also want to see items returned from a search for "breeder", and an associative relationship can be encoded in the thesaurus data structure for that pair of terms.
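As a rough illustration of the three link types, the following Python sketch stores each kind of relationship separately and uses the equivalence and associative links to expand a search term. The `Thesaurus` class and its method names are hypothetical and chosen only for this example.

```python
from collections import defaultdict


class Thesaurus:
    """Terms linked by hierarchical, equivalence, and associative relationships."""

    def __init__(self):
        self.broader = defaultdict(set)     # narrower term -> broader terms
        self.equivalent = defaultdict(set)  # term -> interchangeable terms
        self.associated = defaultdict(set)  # term -> otherwise related terms

    def add_hierarchical(self, narrower, broader):
        self.broader[narrower].add(broader)

    def add_equivalence(self, a, b):
        # Equivalence is symmetric: either term can stand in for the other.
        self.equivalent[a].add(b)
        self.equivalent[b].add(a)

    def add_associative(self, a, b):
        # Association is also stored in both directions.
        self.associated[a].add(b)
        self.associated[b].add(a)

    def expand_query(self, term):
        """Return the term plus its equivalent and associated terms."""
        return ({term}
                | self.equivalent.get(term, set())
                | self.associated.get(term, set()))


thesaurus = Thesaurus()
thesaurus.add_hierarchical("dog", "mammal")
thesaurus.add_equivalence("dog", "canine")
thesaurus.add_associative("dog", "breeder")

print(sorted(thesaurus.expand_query("dog")))
```

Whether a real search system would also follow hierarchical links during expansion is a design choice; this sketch leaves them out so that a search for "dog" does not pull in every mammal.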
Faceted classification is based on the principle that information has a multi-dimensional quality and can be classified in many different ways. The subject matter of an information domain is subdivided into facets (or, more simply, categories) that represent this dimensionality. The attributes of the domain are related in facet hierarchies. Objects within the domain are then described and classified based on these attributes. For example, a collection of clothing offered for sale in a physical or web-based clothing store can be classified using a color facet, a material facet, a style facet, and so on, where each facet has multiple hierarchical attributes representing different types of colors, materials, styles, etc. Faceted classification is often used in faceted search systems, for example, to allow users to search the clothing collection in any desired ordering of facets (such as by color then style, by style then color, by material then color then style, or by any other desired prioritization of facets). Such faceted classification contrasts with classification by taxonomy, in which the category hierarchy is fixed.
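The clothing example can be made concrete with a short Python sketch in which the caller chooses the facet priority at query time. The item names, facet values, and the `browse` helper are invented for illustration, and the facets are kept flat (no attribute hierarchy) to keep the example short.

```python
# Each item is described by one value per facet (flat facets for brevity).
items = [
    {"name": "tee", "color": "blue", "material": "cotton", "style": "casual"},
    {"name": "blazer", "color": "black", "material": "wool", "style": "formal"},
    {"name": "polo", "color": "black", "material": "cotton", "style": "casual"},
    {"name": "suit", "color": "blue", "material": "wool", "style": "formal"},
]


def browse(collection, facet_order):
    """Order items by the facets in the caller's chosen priority."""
    return sorted(collection, key=lambda item: tuple(item[f] for f in facet_order))


by_color_then_style = [i["name"] for i in browse(items, ["color", "style"])]
by_style_then_color = [i["name"] for i in browse(items, ["style", "color"])]

print(by_color_then_style)
print(by_style_then_color)
```

The same collection yields different orderings depending on `facet_order`, which is the contrast with a taxonomy, where the hierarchy of categories is fixed in advance.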
A semantic network is a network structure that represents various types of semantic relationships between concepts, or a data structure that encodes or instantiates such a network structure. A semantic network is typically represented as a directed or undirected graph consisting of vertices representing concepts and edges labeled with the types of relationships linking pairs of concepts. An example of a semantic network is WordNet, a lexical database of the English language. Some common types of semantic relationships defined in WordNet are meronymy (A is part of B), hyponymy (A is a kind of B), synonymy (A denotes the same as B), and antonymy (A denotes the opposite of B).
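A labeled directed graph of this kind can be sketched as an adjacency map whose edges carry a relationship type. The `SemanticNetwork` class, the relation names, and the word pairs below are assumptions made for illustration; they are not drawn from the WordNet database or its API.

```python
from collections import defaultdict


class SemanticNetwork:
    """Concepts as vertices; edges labeled with a relationship type."""

    def __init__(self):
        self._edges = defaultdict(set)  # vertex -> {(relation, vertex)}

    def add(self, source, relation, target, symmetric=False):
        self._edges[source].add((relation, target))
        if symmetric:
            # Undirected relationships are stored as two directed edges.
            self._edges[target].add((relation, source))

    def related(self, source, relation):
        """Return every vertex reachable from `source` by one `relation` edge."""
        return {t for (r, t) in self._edges.get(source, set()) if r == relation}


net = SemanticNetwork()
net.add("wheel", "part_of", "car")                # meronymy
net.add("dog", "kind_of", "mammal")               # hyponymy
net.add("big", "synonym", "large", symmetric=True)
net.add("hot", "antonym", "cold", symmetric=True)

print(net.related("dog", "kind_of"))
print(net.related("large", "synonym"))
```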
An ontology is a KR structure that encodes concepts and the relationships between those concepts, and that is restricted to the particular domain of the real or virtual world that it is used to model. The concepts included in an ontology typically represent the particular meanings of terms as they apply to the domain being modeled or classified, and the included concept relationships typically represent the ways in which those concepts are related within the domain. For example, the concept corresponding to the word "card" can have different meanings in an ontology about the domain of card games and in an ontology about the domain of computer hardware.
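The "card" example can be sketched by keeping one small structure per domain, so that the same term resolves to a different meaning and different relationships in each. The `Ontology` class, the wording of the meanings, and the relationship names are illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Concepts and relationships restricted to a single domain."""
    domain: str
    meanings: dict = field(default_factory=dict)     # term -> meaning in this domain
    relationships: set = field(default_factory=set)  # (concept, relation, concept)


card_games = Ontology(
    domain="card games",
    meanings={
        "card": "one of a set of printed pieces used to play a game",
        "deck": "a complete set of cards",
    },
    relationships={("card", "part-of", "deck")},
)

hardware = Ontology(
    domain="computer hardware",
    meanings={
        "card": "a circuit board that plugs into an expansion slot",
        "motherboard": "the main circuit board of a computer",
    },
    relationships={("card", "plugs-into", "motherboard")},
)

for ontology in (card_games, hardware):
    print(f"{ontology.domain}: card = {ontology.meanings['card']}")
```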
Generally speaking, all of the KR types discussed above, as well as other conventional examples, are tools for modeling human knowledge in terms of abstract concepts and the relationships between those concepts, and for making that knowledge accessible to machines (such as computers) for performing various knowledge-requiring tasks. As such, human users and software developers conventionally construct KR data structures using their human knowledge, and manually encode the completed KR data structures into machine-readable form as data structures to be stored in machine memory and accessed by various machine-executed functions.
Summary of the Invention
One embodiment relates to a method for generating a complex knowledge representation, the method comprising: receiving input indicating a request context; applying, with a processor, one or more rules to a basic data structure representing at least one basic concept, at least one basic concept relationship, or at least one basic concept and at least one basic concept relationship; based on the application of the one or more rules, synthesizing, in accordance with the request context, one or more additional concepts, one or more additional concept relationships, or one or more additional concepts and one or more additional concept relationships; and using at least one of the additional concepts, at least one of the additional concept relationships, or at least one of the additional concepts and at least one of the additional concept relationships, generating a complex knowledge representation in accordance with the request context.
Another embodiment relates to a system for generating a complex knowledge representation, the system comprising at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one processor, perform: receiving input indicating a request context; applying one or more rules to a basic data structure representing at least one basic concept, at least one basic concept relationship, or at least one basic concept and at least one basic concept relationship; based on the application of the one or more rules, synthesizing, in accordance with the request context, one or more additional concepts, one or more additional concept relationships, or one or more additional concepts and one or more additional concept relationships; and using at least one of the additional concepts, at least one of the additional concept relationships, or at least one of the additional concepts and at least one of the additional concept relationships, generating a complex knowledge representation in accordance with the request context.
Another embodiment relates to at least one non-transitory computer-readable storage medium encoded with a plurality of computer-executable instructions for generating a complex knowledge representation, wherein the instructions, when executed, perform: receiving input indicating a request context; applying one or more rules to a basic data structure representing at least one basic concept, at least one basic concept relationship, or at least one basic concept and at least one basic concept relationship; based on the application of the one or more rules, synthesizing, in accordance with the request context, one or more additional concepts, one or more additional concept relationships, or one or more additional concepts and one or more additional concept relationships; and using at least one of the additional concepts, at least one of the additional concept relationships, or at least one of the additional concepts and at least one of the additional concept relationships, generating a complex knowledge representation in accordance with the request context.
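The synthesis steps recited in these three embodiments (receive a request context, apply rules to a basic data structure, synthesize additional concepts and relationships, assemble the complex KR) might be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the `KnowledgeStructure` class, the "associated-with" and "narrower-than" relationship names, and the single rule `qualify_with_associated` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeStructure:
    concepts: set = field(default_factory=set)
    relationships: set = field(default_factory=set)  # (source, kind, target)


def qualify_with_associated(basic, context):
    """Hypothetical rule: qualify the context concept with each associated concept."""
    additions = KnowledgeStructure()
    for source, kind, target in basic.relationships:
        if source == context and kind == "associated-with":
            compound = f"{target} {context}"  # an additional, synthesized concept
            additions.concepts.add(compound)
            additions.relationships.add((compound, "narrower-than", context))
    return additions


def synthesize(basic, context, rules):
    """Apply each rule for the request context and assemble the complex KR."""
    complex_kr = KnowledgeStructure(concepts={context})
    for rule in rules:
        additions = rule(basic, context)
        complex_kr.concepts |= additions.concepts
        complex_kr.relationships |= additions.relationships
    return complex_kr


basic = KnowledgeStructure(
    concepts={"dog", "mammal", "domestic", "guide"},
    relationships={
        ("dog", "is-a", "mammal"),
        ("dog", "associated-with", "domestic"),
        ("dog", "associated-with", "guide"),
    },
)

complex_kr = synthesize(basic, context="dog", rules=[qualify_with_associated])
print(sorted(complex_kr.concepts))
```

Passing the rules in as a list keeps the rule set separate from the data it operates on, so a different request context or rule set produces a different complex KR from the same basic data structure.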
Another embodiment relates to a method for deconstructing an original knowledge representation, the method comprising: receiving input corresponding to the original knowledge representation; applying, with a processor, one or more rules to deconstruct the original knowledge representation into one or more basic concepts, one or more basic concept relationships, or one or more basic concepts and one or more basic concept relationships; and including in a basic data structure a representation of at least one of the basic concepts, at least one of the basic concept relationships, or at least one of the basic concepts and at least one of the basic concept relationships.
Another embodiment relates to a system for deconstructing an original knowledge representation, the system comprising at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one processor, perform: receiving input corresponding to the original knowledge representation; applying one or more rules to deconstruct the original knowledge representation into one or more basic concepts, one or more basic concept relationships, or one or more basic concepts and one or more basic concept relationships; and including in a basic data structure a representation of at least one of the basic concepts, at least one of the basic concept relationships, or at least one of the basic concepts and at least one of the basic concept relationships.
Another embodiment relates to at least one non-transitory computer-readable storage medium encoded with a plurality of computer-executable instructions for deconstructing an original knowledge representation, wherein the instructions, when executed, perform: receiving input corresponding to the original knowledge representation; applying one or more rules to deconstruct the original knowledge representation into one or more basic concepts, one or more basic concept relationships, or one or more basic concepts and one or more basic concept relationships; and including in a basic data structure a representation of at least one of the basic concepts, at least one of the basic concept relationships, or at least one of the basic concepts and at least one of the basic concept relationships.
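The deconstruction embodiments can be illustrated with a deliberately simple rule. The sketch below assumes, purely for illustration, that a multi-word concept label is complex and that each of its words is a basic concept; the text does not state that the actual rules work this way, and the `deconstruct` function is hypothetical.

```python
def deconstruct(concepts, relationships):
    """Break an original KR into basic concepts and basic concept relationships.

    Assumed rule (illustrative only): every word of a concept label is a basic
    concept, and a relationship is basic when both of its ends are basic.
    """
    basic_concepts = set()
    for label in concepts:
        basic_concepts.update(label.split())

    basic_relationships = {
        (source, kind, target)
        for source, kind, target in relationships
        if source in basic_concepts and target in basic_concepts
    }
    # The returned dictionary plays the role of the basic data structure.
    return {"concepts": basic_concepts, "relationships": basic_relationships}


original_concepts = {"mammal", "dog", "domestic dog"}
original_relationships = {
    ("dog", "is-a", "mammal"),
    ("domestic dog", "is-a", "dog"),
}

basic = deconstruct(original_concepts, original_relationships)
print(sorted(basic["concepts"]))
print(sorted(basic["relationships"]))
```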
Another embodiment relates to a method for supporting semantic interoperability between knowledge representations, the method comprising: for each input knowledge representation of a plurality of input knowledge representations, applying, with a processor, one or more rules to deconstruct the input knowledge representation into one or more basic concepts, one or more basic concept relationships, or one or more basic concepts and one or more basic concept relationships; and including, with the processor, in a shared basic data structure, for each input knowledge representation of the plurality of input knowledge representations, a representation of at least one of the basic concepts, at least one of the basic concept relationships, or at least one of the basic concepts and at least one of the basic concept relationships.
Another embodiment relates to a system for supporting semantic interoperability between knowledge representations, the system comprising at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one processor, perform: for each input knowledge representation of a plurality of input knowledge representations, applying one or more rules to deconstruct the input knowledge representation into one or more basic concepts, one or more basic concept relationships, or one or more basic concepts and one or more basic concept relationships; and including, in a shared basic data structure, for each input knowledge representation of the plurality of input knowledge representations, a representation of at least one of the basic concepts, at least one of the basic concept relationships, or at least one of the basic concepts and at least one of the basic concept relationships.
Another embodiment relates to at least one non-transitory computer-readable storage medium encoded with a plurality of computer-executable instructions for supporting semantic interoperability between knowledge representations, wherein the instructions, when executed, perform: for each input knowledge representation of a plurality of input knowledge representations, applying one or more rules to deconstruct the input knowledge representation into one or more basic concepts, one or more basic concept relationships, or one or more basic concepts and one or more basic concept relationships; and including, in a shared basic data structure, for each input knowledge representation of the plurality of input knowledge representations, a representation of at least one of the basic concepts, at least one of the basic concept relationships, or at least one of the basic concepts and at least one of the basic concept relationships.
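For the interoperability embodiments, the key idea is that several input KRs are deconstructed into one shared basic data structure. The Python sketch below assumes the same word-splitting rule as a stand-in for the real rules and records which input KR contributed each basic concept; the function names and the two example KRs are hypothetical.

```python
def deconstruct(concept_labels):
    """Assumed rule (illustrative only): each lowercased word is a basic concept."""
    return {word.lower() for label in concept_labels for word in label.split()}


def build_shared_structure(input_krs):
    """Merge the basic concepts of every input KR into one shared structure.

    Maps each basic concept to the names of the input KRs it came from.
    """
    shared = {}
    for kr_name, labels in input_krs.items():
        for concept in deconstruct(labels):
            shared.setdefault(concept, set()).add(kr_name)
    return shared


input_krs = {
    "pet_taxonomy": {"Domestic dog", "Mammal"},
    "search_thesaurus": {"dog", "canine", "breeder"},
}

shared = build_shared_structure(input_krs)
for concept in sorted(shared):
    print(concept, sorted(shared[concept]))
```

A concept such as "dog" that appears in both inputs ends up as a single entry, which is what lets the two source KRs be related to each other through the shared structure.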
Another embodiment relates to a computer-implemented method for synthesizing a complex knowledge representation, the method comprising: receiving a context from a data consumer; identifying, in accordance with the context, one or more basic components in a basic knowledge representation, the one or more basic components including a first concept; and generating a complex knowledge representation by applying one or more rules to the one or more basic components through execution of stored instructions by at least one processor, wherein generating the complex knowledge representation comprises synthesizing a complex concept that is not present in the basic knowledge representation and including the synthesized complex concept in the complex knowledge representation, and wherein synthesizing the complex concept comprises joining the first concept with a basic concept that is not hierarchically related to the first concept to form the synthesized complex concept.
Wherein, in the method, generating the complex knowledge representation includes including intrinsic relationships in the complex knowledge representation, the intrinsic relationships joining concepts in a set to create complex concepts.
Wherein, in the method, generating the complex knowledge representation includes encoding, in the complex knowledge representation, an intrinsic relationship between the first concept and the complex concept synthesized by joining the first concept with the basic concept.
Wherein, in the method, the first concept and the basic concept joined with the first concept to synthesize the complex concept form a concept definition for the complex concept.
Wherein, in the method, generating the complex knowledge representation includes including extrinsic relationships in the complex knowledge representation, the extrinsic relationships describing features between pairs of concepts.
Wherein, in the method, generating the complex knowledge representation includes encoding, in the complex knowledge representation, both the intrinsic relationship between the first concept and the complex concept and an extrinsic relationship between the first concept and a second concept.
Wherein, in the method, the extrinsic relationship between the first concept and the second concept encodes a hierarchical relationship between the first concept and the second concept.
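The joining step and the two kinds of relationship recited above can be sketched as follows. This is an illustration under assumptions rather than the claimed implementation: the `ComplexKR` class, the `join` function, the `parent_of` hierarchy map, and the "rescue dog" example are all invented for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ComplexKR:
    concepts: set = field(default_factory=set)
    definitions: dict = field(default_factory=dict)  # complex concept -> joined concepts
    intrinsic: set = field(default_factory=set)      # (member concept, complex concept)
    extrinsic: set = field(default_factory=set)      # (concept, relation, concept)


def ancestors(concept, parent_of):
    """Collect every concept above `concept` in the hierarchy."""
    found = set()
    while concept in parent_of and parent_of[concept] not in found:
        concept = parent_of[concept]
        found.add(concept)
    return found


def hierarchically_related(a, b, parent_of):
    return a in ancestors(b, parent_of) or b in ancestors(a, parent_of)


def join(first, other, parent_of, kr):
    """Synthesize a complex concept from two concepts that are not hierarchically related."""
    if hierarchically_related(first, other, parent_of):
        raise ValueError(f"{first!r} and {other!r} are hierarchically related")
    label = f"{other} {first}"
    kr.concepts |= {first, label}
    # The joined concepts form the concept definition of the complex concept.
    kr.definitions[label] = frozenset({first, other})
    # Intrinsic relationships tie each joined concept to the complex concept.
    kr.intrinsic |= {(first, label), (other, label)}
    return label


parent_of = {"dog": "mammal", "mammal": "animal"}  # basic hierarchy (assumed)
kr = ComplexKR()
rescue_dog = join("dog", "rescue", parent_of, kr)
# An extrinsic relationship that encodes a hierarchical link to a second concept.
kr.concepts.add("mammal")
kr.extrinsic.add(("dog", "is-a", "mammal"))

print(rescue_dog, sorted(kr.definitions[rescue_dog]))
```

Rejecting hierarchically related pairs in `join` reflects the condition that the joined basic concept is not hierarchically related to the first concept; joining "dog" with "mammal" would add nothing that the hierarchy does not already express.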
Another embodiment relates to at least one non-transitory computer-readable storage medium storing computer-executable instructions that, when executed, perform a method for synthesizing a complex knowledge representation, the method comprising: receiving a context from a data consumer; identifying, in accordance with the context, one or more basic components in a basic knowledge representation, the one or more basic components including a first concept; and generating a complex knowledge representation by applying one or more rules to the one or more basic components through execution of stored instructions by at least one processor, wherein generating the complex knowledge representation comprises synthesizing a complex concept that is not present in the basic knowledge representation and including the synthesized complex concept in the complex knowledge representation, and wherein synthesizing the complex concept comprises joining the first concept with a basic concept that is not hierarchically related to the first concept to form the synthesized complex concept.
Wherein, in the at least one non-transitory computer-readable storage medium, generating the complex knowledge representation includes including intrinsic relationships in the complex knowledge representation, the intrinsic relationships joining concepts in a set to create complex concepts.
Wherein, in the at least one non-transitory computer-readable storage medium, generating the complex knowledge representation includes encoding, in the complex knowledge representation, an intrinsic relationship between the first concept and the complex concept synthesized by joining the first concept with the basic concept.
Wherein, in the at least one non-transitory computer-readable storage medium, the first concept and the basic concept joined with the first concept to synthesize the complex concept form a concept definition for the complex concept.
Wherein, in the at least one non-transitory computer-readable storage medium, generating the complex knowledge representation includes including extrinsic relationships in the complex knowledge representation, the extrinsic relationships describing features between pairs of concepts.
Wherein, in the at least one non-transitory computer-readable storage medium, generating the complex knowledge representation includes encoding, in the complex knowledge representation, both the intrinsic relationship between the first concept and the complex concept and an extrinsic relationship between the first concept and a second concept.
Wherein, in the at least one non-transitory computer-readable storage medium, the extrinsic relationship between the first concept and the second concept encodes a hierarchical relationship between the first concept and the second concept.
Another embodiment relates to an apparatus comprising: at least one processor; and at least one storage medium storing processor-executable instructions that, when executed by the at least one processor, perform a method for synthesizing a complex knowledge representation, the method comprising: receiving a context from a data consumer; identifying, in accordance with the context, one or more basic components in a basic knowledge representation, the one or more basic components including a first concept; and generating a complex knowledge representation by applying one or more rules to the one or more basic components, wherein generating the complex knowledge representation comprises synthesizing a complex concept that is not present in the basic knowledge representation and including the synthesized complex concept in the complex knowledge representation, and wherein synthesizing the complex concept comprises joining the first concept with a basic concept that is not hierarchically related to the first concept to form the synthesized complex concept.
Wherein, in the apparatus, generating the complex knowledge representation includes including intrinsic relationships in the complex knowledge representation, the intrinsic relationships joining concepts in a set to create complex concepts.
Wherein, in the apparatus, generating the complex knowledge representation includes encoding, in the complex knowledge representation, an intrinsic relationship between the first concept and the complex concept synthesized by joining the first concept with the basic concept.
Wherein, in the apparatus, the first concept and the basic concept joined with the first concept to synthesize the complex concept form a concept definition for the complex concept.
Wherein, in the apparatus, generating the complex knowledge representation includes including extrinsic relationships in the complex knowledge representation, the extrinsic relationships describing features between pairs of concepts.
Wherein, in the apparatus, generating the complex knowledge representation includes encoding, in the complex knowledge representation, both the intrinsic relationship between the first concept and the complex concept and an extrinsic relationship between the first concept and a second concept.
The foregoing is a non-limiting summary of the invention, which is defined by the appended claims.
Brief Description of the Drawings
The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings:
图1是图示了根据本发明的一些实施例的用于实施原子知识表示模型的示例性系统的框图;FIG. 1 is a block diagram illustrating an exemplary system for implementing an atomic knowledge representation model according to some embodiments of the present invention;
图2A图示了根据本发明的一些实施例的示例性复杂知识表示;FIG. 2A illustrates an exemplary complex knowledge representation according to some embodiments of the present invention;
图2B图示了根据本发明的一些实施例的原子知识表示模型的示例性基本数据结构;FIG. 2B illustrates an exemplary basic data structure of an atomic knowledge representation model according to some embodiments of the present invention;
图3图示了根据本发明的一些实施例的示例性数据规划;FIG. 3 illustrates an exemplary data schema according to some embodiments of the present invention;
图4图示了根据本发明的一些实施例的用于分析复杂知识表示的示例性方法;FIG. 4 illustrates an exemplary method for analyzing complex knowledge representations according to some embodiments of the present invention;
图5是图示了根据本发明的一些实施例的用于实施分析和合成复杂知识表示的示例性分布式系统的框图;FIG. 5 is a block diagram illustrating an exemplary distributed system for implementing analysis and synthesis of complex knowledge representations according to some embodiments of the present invention;
图6是图示了根据本发明的一些实施例的用于分析复杂知识表示以生成基本数据结构的示例性方法的流程图;FIG. 6 is a flowchart illustrating an exemplary method for analyzing a complex knowledge representation to generate a basic data structure according to some embodiments of the present invention;
图7是图示了根据本发明的一些实施例的用于从基本数据结构合成复杂知识表示的示例性方法;FIG. 7 illustrates an exemplary method for synthesizing complex knowledge representations from basic data structures according to some embodiments of the present invention;
图8是图示了根据本发明的一些实施例的示例性知识处理规则集的表;FIG. 8 is a table illustrating an exemplary knowledge processing rule set according to some embodiments of the present invention;
图9图示了可以从示例性自然语言文本推导的知识表示的示例;FIG. 9 illustrates an example of a knowledge representation that can be derived from an exemplary natural language text;
图10图示了可以从示例性辞典分析的基本数据结构的示例;并且FIG. 10 illustrates an example of a basic data structure that can be derived by analyzing an exemplary thesaurus; and
图11是图示了用于在实现本发明的一些实施例时使用的示例性计算系统的框图。FIG. 11 is a block diagram illustrating an exemplary computing system for use in implementing some embodiments of the present invention.
具体实施方式DETAILED DESCRIPTION
如上文讨论的那样,通过常规方法创建的知识表示(KR)数据结构对针对特定领域或者上下文建模的特定人类知识集进行编码和表示。由于KR通常由人类开发者构造并且以完整形式编程到机器存储器中,所以常规KR仅包含人类用户原先用来对它编程的人类知识子集。As discussed above, knowledge representation (KR) data structures created by conventional methods encode and represent a specific set of human knowledge modeled for a specific domain or context. Since KRs are typically constructed by human developers and programmed into machine memory in a complete form, conventional KRs contain only a subset of the human knowledge that the human user originally used to program it.
例如,KR可以对知识语句“狗是哺乳动物”编码,并且它也可以表达关于作为哺乳动物的动物的语句或者断言,比如“哺乳动物产生乳汁以喂养它们的幼崽”。发明人已经认识到这样的事实组合在与适当逻辑和语义规则组合时可以支持广泛人类推理从而使起初在KR内未作为事实而播种的显式推论(比如“狗产生乳汁用于喂养它们的幼崽”)成为显式。发明人已经领会到通过这样的推断扩展KR数据结构可以用来支持各种基于知识的活动和任务,比如推断/推理(如上文所示)、信息检索、数据挖掘和其它形式的分析。For example, a KR may encode the knowledge statement "Dogs are mammals," and it may also express statements or assertions about animals that are mammals, such as "Mammals produce milk to feed their pups." The inventors have recognized that such a combination of facts, when combined with appropriate logical and semantic rules, can support a broad range of human reasoning, making explicit various inferences (such as "Dogs produce milk to feed their pups") that were not initially seeded as facts within the KR. The inventors have appreciated that expanding KR data structures through such inferences can support a variety of knowledge-based activities and tasks, such as inference/reasoning (as illustrated above), information retrieval, data mining, and other forms of analysis.
然而,如上文讨论的那样,用于对KR进行构造和编码的方法已经常规上限于人工输入完整KR结构以用于由机器(比如计算机)访问和使用。沿用上例,虽然充当KR设计者的人类个人可以隐式理解为什么“狗产生乳汁用于喂养它们的幼崽”这一事实成立,但是为了使它成立而必须保持的性质(在这一情况下为比如传递性和继承性这样的性质)并不常规地是KR的显式部分。换言之,可以指导创建新知识的任何根本规则集并未常规地编码为KR的部分,而是实际上由人类设计者在构造KR时从系统以外应用。However, as discussed above, methods for constructing and encoding KRs have conventionally been limited to manually inputting complete KR structures for access and use by machines (e.g., computers). Continuing with the above example, while a human acting as a KR designer may implicitly understand why the fact that "dogs produce milk to feed their pups" holds true, the properties that must hold for it to hold true (in this case, properties such as transitivity and inheritance) are not conventionally an explicit part of the KR. In other words, any underlying set of rules that can guide the creation of new knowledge is not conventionally encoded as part of the KR, but is actually applied from outside the system by the human designer when constructing the KR.
常规方式的先前未认识到的结果是可以在KR中表达知识以用于由机器使用,但是KR本身不能由机器创建。人类被迫对知识领域建模以用于由机器消耗。遗憾的是,由于人类知识如此极为广泛并且在许多情况下是主观的,所以对所有知识领域建模在技术上不可行。A previously unrecognized consequence of conventional approaches is that knowledge can be expressed in KRs for machine consumption, but the KRs themselves cannot be created by machines. Humans are forced to model knowledge domains for machine consumption. Unfortunately, because human knowledge is so vast and, in many cases, subjective, modeling all knowledge domains is technically infeasible.
另外,由于如此多的知识必须被显式编码为数据,所以所得数据结构随着知识领域增长而快速变得无比庞大。由于常规KR并未用它们的用于知识创建的根本理论或者实践来编码为组成知识表示模型的数据的部分,所以它们的所得数据结构可能变得很复杂和不实用。换言之,由于知识表示无法由机器创建,所以它常规地必须被提供为显式数据或者以别的方式通过逻辑或者统计手段来演绎或者归纳。Furthermore, because so much knowledge must be explicitly encoded as data, the resulting data structures quickly become enormous as the knowledge domain grows. Because conventional KR does not encode its underlying theories or practices for knowledge creation as part of the data that makes up the knowledge representation model, its resulting data structures can become complex and impractical. In other words, because knowledge representation cannot be created by a machine, it must conventionally be provided as explicit data or otherwise deduced or inductively summarized through logical or statistical means.
因此,常规KR方式导致多个问题:Therefore, the conventional KR approach leads to multiple problems:
大而复杂的数据结构:常规地对知识表示进行编码的数据结构构建和维护起来复杂。甚至相对简单的机器可读知识领域(比如关于狗和哺乳动物的简单语句)也可能生成数量级比它的自然语言对应数据更大的数据量。Large and complex data structures: Data structures that encode knowledge representations are conventionally complex to construct and maintain. Even relatively simple machine-readable knowledge domains (such as simple statements about dogs and mammals) can generate data volumes orders of magnitude larger than their natural language counterparts.
依赖于领域专家:指引KR的实践的根本理论必须在KR数据结构的常规创建中由人类表达。这是在产生这些至关重要的数据资产时排除多数人和所有机器的耗时活动。作为结果,迄今为止的大多数人类知识已经保持隐式并且在计算范畴以外。Reliance on Domain Expertise: The underlying theories guiding the practice of KR must be expressed by humans in the routine creation of KR data structures. This is a time-consuming activity that excludes most humans and all machines from the production of these crucial data assets. As a result, most human knowledge to date has remained implicit and outside the realm of computation.
在使用之前创建数据:知识常规地在它被调用以用于特定使用时这样的时间之前被建模为数据,如果无需该知识,则这存在高成本并且潜在地存在浪费。因而,发明人已经认识到如果知识可以仅随需由机器创建,则可以大量减少数据产生和存储要求。Creating data before use: Knowledge is conventionally modeled as data before it is called upon for a specific use, which is costly and potentially wasteful if the knowledge is not needed. Therefore, the inventors have realized that if knowledge can be created by machines only on demand, data generation and storage requirements can be greatly reduced.
大规模数据和处理成本:常规KR系统必须在创建新事实或者答复查询的服务中在很大数据结构内推理。这一规模负担表示常规KR系统中的重大挑战,这是可以通过更多使用一种用于创建根本数据结构的及时方法而不是常规的在使用之前的数据方法来减少的负担。Large-scale data and processing costs: Conventional KR systems must reason over large data structures in order to create new facts or answer queries. This scale burden represents a significant challenge in conventional KR systems, a burden that can be reduced by using more of a just-in-time approach to creating the underlying data structures rather than the conventional data-before-use approach.
集成和互操作性挑战:语义互操作性(两个不同KR共享知识的能力)是在不同模型之下创建并且按照不同方式表达各种KR时的大规模困难挑战,该挑战经常应对主观和含糊主题。精确度和准确推理的能力经常跨越多个不同KR而丢失。就这一点而言,发明人已经领会到如果包括用于如何创建知识的根本理论作为KR的部分,则跨越不同KR的知识协调可能变成易处理的问题。Integration and interoperability challenges: Semantic interoperability (the ability of two different KRs to share knowledge) is a massively difficult challenge when various KRs are created under different models and expressed in different ways, often dealing with subjective and ambiguous subjects. Precision and the ability to reason accurately are often lost across multiple different KRs. In this regard, the inventors have appreciated that if the underlying theories for how the knowledge was created were included as part of the KR, reconciling knowledge across different KRs could become a tractable problem.
因而,根据本公开内容的一些实施例提供一种对知识创建规则进行编码以使创建知识表示的过程自动化的系统。一些实施例组合新的知识表示综合方式与用于创建和管理从这样的方式推导的所得数据结构的计算系统。Thus, some embodiments according to the present disclosure provide a system for encoding knowledge creation rules to automate the process of creating knowledge representations. Some embodiments combine new synthesis approaches to knowledge representation with computing systems for creating and managing the resulting data structures derived from such approaches.
一些实施例组合更压缩的(原子)数据集与对根本知识创建进行编码的生成规则集而不是将领域中的所有知识建模为显式数据。这样的规则可以在一些实施例中在需要或者希望创建新知识并且显式表达它为数据时由系统应用。从上文讨论中应当领会到,这样的技术的益处可以在至少一些情形中是明显减少系统中的数据量以及提供用于基于机器的新知识创建(合成)的新能力和应用。然而,应当领会到并非根据本发明的每个实施例可以解决常规方式的每个标识出的问题,并且一些实施例可以不解决这些问题中的任何问题。一些实施例也可以解决除了这里叙述的问题之外的问题。另外,并非每个实施例可以提供这里讨论的益处中的所有或者任何益处,并且一些实施例可以提供未叙述的其它益处。Rather than modeling all of the knowledge in a domain as explicit data, some embodiments combine a more compressed (atomic) data set with a set of generative rules that encode the underlying knowledge creation. In some embodiments, such rules can be applied by the system when it is necessary or desirable to create new knowledge and express it explicitly as data. It should be appreciated from the above discussion that a benefit of such techniques can be, in at least some cases, to substantially reduce the amount of data in the system and to provide new capabilities and applications for machine-based creation (synthesis) of new knowledge. However, it should be appreciated that not every embodiment according to the present invention addresses every identified problem of the conventional approaches, and some embodiments may not address any of these problems. Some embodiments may also address problems other than those described here. In addition, not every embodiment provides all or any of the benefits discussed here, and some embodiments may provide other benefits that are not described.
一些实施例也提供用于复杂知识表示,比如分类法、本体论和按面分类,以不仅在数据级而且在语义级互操作(含义互操作性)的技术。Some embodiments also provide techniques for complex knowledge representations, such as taxonomies, ontologies, and faceted classifications, to interoperate not only at the data level but also at the semantic level (meaning interoperability).
可以在一些实施例中赋予的并且可以跨越许多新的和现有应用领域而应用的其它益处包括:由更简单和更经济的数据结构所赋予的知识表示的产生和应用二者中的成本更低;用于新知识创建的可能性;由及时随需知识所赋予的更可缩放的系统;以及支持来自用户和数据消耗者的“上下文”作为输入变量。在及时基础上应用合成和分析知识处理规则以创建知识表示数据结构的、根据本公开内容的一些实施例的动态性质可以提供可以提供比预先对整个知识领域进行分析和建模的常规方法更经济的益处。Other benefits that may be conferred in some embodiments and applicable across many new and existing application domains include: lower costs in both the generation and application of knowledge representations enabled by simpler and more economical data structures; the potential for new knowledge creation; a more scalable system enabled by just-in-time, on-demand knowledge; and support for "context" from users and data consumers as input variables. The dynamic nature of some embodiments of the present disclosure, which apply synthetic and analytical knowledge processing rules on a just-in-time basis to create knowledge representation data structures, can provide benefits that are more economical than conventional approaches that analyze and model the entire knowledge domain in advance.
通过在KR内并入根本知识创建规则集,可以减少系统中的数据量,从而提供更经济的数据管理系统并且提供用于知识管理的全新应用。因此,在一些实施例中,可以通过减少数据可伸缩性的负担而除非需要数据则才创建它来降低KR系统的产生和维护成本。一旦被创建,对复杂知识建模的数据结构在一些实施例中比在常规系统中相较而言更小,因为它们仅包含与手头的任务相关的数据。这转而可以减少下游应用(比如在这些知识模型之上工作的推论引擎或者数据挖掘工具)的成本。By incorporating a fundamental knowledge creation rule set into KR, the amount of data in the system can be reduced, thereby providing a more economical data management system and providing new applications for knowledge management. Therefore, in some embodiments, the cost of creating and maintaining a KR system can be reduced by reducing the burden of data scalability and not creating data until it is needed. Once created, the data structures that model complex knowledge are, in some embodiments, smaller than in conventional systems because they only contain data relevant to the task at hand. This in turn can reduce the cost of downstream applications (such as inference engines or data mining tools that work on these knowledge models).
根据本公开内容的一些实施例的综合计算的方式也支持知识表示和数据管理中的全新能力。一些实施例可以提供改进的对“可能性”(即从现有数据之中创建全新知识的表示)的支持。例如,这样的可能性的能力可以对于创造活动(比如教育、新闻和艺术)有用。The integrated computing approach according to some embodiments of the present disclosure also supports entirely new capabilities in knowledge representation and data management. Some embodiments may provide improved support for "possibilities" (i.e., creating entirely new knowledge representations from existing data). For example, such possibilities may be useful for creative activities such as education, journalism, and art.
图1图示了根据本发明的一些实施例的示例性系统100,可以在一些实施例中运用该系统以用于实施在复杂知识表示(KR)的分析和合成中涉及到的原子知识表示模型(AKRM)。在示例性系统100中,AKRM可以被编码为计算机可读数据并且存储于一个或者多个有形非瞬态计算机可读存储介质上。例如,AKRM可以用如下数据规划存储于非易失性计算机存储器中的数据集110中,下文给出该数据集的一些示例,该数据规划被设计成支持基本和复杂知识表示数据结构。FIG. 1 illustrates an exemplary system 100 according to some embodiments of the present invention, which can be used in some embodiments to implement an atomic knowledge representation model (AKRM) involved in the analysis and synthesis of complex knowledge representations (KRs). In exemplary system 100, the AKRM can be encoded as computer-readable data and stored on one or more tangible, non-transitory computer-readable storage media. For example, the AKRM can be stored in a data set 110 in non-volatile computer memory, some examples of which are given below, using a data schema designed to support both basic and complex knowledge representation data structures.
在一些实施例中,AKRM可以包括一个或者多个基本数据结构120和一个或者多个知识处理规则130。在一些实施例中,规则130可以由系统100用来解构(分析)一个或者多个复杂KR以生成基本数据结构120。例如,系统100可以包括一个或者多个计算机处理器和一个或者多个计算机存储器硬件部件,并且可以用计算机可执行指令对存储器编码,这些计算机可执行指令在由一个或者多个处理器执行时,使系统100的一个或者多个处理器在分析一个或者多个复杂KR时使用规则130以生成AKRM的基本数据结构120。也可以用如下指令对存储器编码,这些指令对一个或者多个处理器编程以使用规则130以从基本数据结构120合成新复杂KR。在一些实施例中,计算机存储器可以被实施为用计算机可执行指令编码的一个或者多个有形非瞬态计算机可读存储介质,这些计算机可执行指令在被执行时使一个或者多个处理器执行这里描述的功能中的任何功能。In some embodiments, the AKRM may include one or more base data structures 120 and one or more knowledge processing rules 130. In some embodiments, the rules 130 may be used by the system 100 to deconstruct (analyze) one or more complex KRs to generate the base data structures 120. For example, the system 100 may include one or more computer processors and one or more computer memory hardware components, and the memory may be encoded with computer-executable instructions that, when executed by the one or more processors, cause the one or more processors of the system 100 to use the rules 130 when analyzing the one or more complex KRs to generate the base data structures 120 of the AKRM. The memory may also be encoded with instructions that program the one or more processors to use the rules 130 to synthesize new complex KRs from the base data structures 120. In some embodiments, the computer memory may be implemented as one or more tangible, non-transitory, computer-readable storage media encoded with computer-executable instructions that, when executed, cause the one or more processors to perform any of the functions described herein.
不同于先前知识表示系统,根据本发明的一些实施例的系统(比如系统100)可以组合数据结构和知识处理规则以创建被编码为数据的知识表示模型。在一些实施例中,规则可以未被编码为知识(例如,作为规则或者公理,这些规则或者公理描述特定领域内的知识边界或者约束),而实际上被编码为用于创建表示新知识的数据结构的构造和解构规则。除了用于生成如下隐式事实的“推论规则”之外(这些隐式事实是由原有KR给定的显式概念的逻辑结果),在一些实施例中还可以用“知识处理规则”对知识表示模型进行编码,可以应用这些“知识处理规则”以从原有KR数据结构创建可以并非隐式的新知识。Unlike previous knowledge representation systems, systems according to some embodiments of the present invention (such as system 100) can combine data structures and knowledge processing rules to create a knowledge representation model that is encoded as data. In some embodiments, the rules are not encoded as knowledge (for example, as rules or axioms that describe the boundaries or constraints of knowledge within a specific domain), but rather as construction and deconstruction rules for creating the data structures that represent new knowledge. In addition to "inference rules" for generating implicit facts that are logical consequences of the explicit concepts given by an original KR, in some embodiments the knowledge representation model can also be encoded with "knowledge processing rules" that can be applied to create, from the original KR data structure, new knowledge that may not be implicit in it.
例如,始于两个显式知识语句“玛丽是人”和“所有人是人类”,可以应用推论规则以确定显式知识语句“玛丽是人类”,这是先前两个语句的逻辑结果。在根据本发明的一些实施例的不同示例中,始于两个显式知识语句“玛丽是鲍勃的好友”和“鲍勃是查理的好友”,可以应用对好友关系的含义建模的示例性知识处理规则以确定新知识语句“玛丽是查理的好友”。显然,这样的知识处理规则的应用可以产生新知识,该新知识未必是在原有输入KR中给定的显式知识的逻辑结果。如上文描述的那样,根据本发明的一些实施例的知识表示模型(包括与对概念和概念关系进行编码的数据结构相关联地存储的知识处理规则(如有别于或者补充逻辑推理规则的))可以对如何可以创建和/或分解新的和潜在非隐式知识的框架建模。For example, starting with two explicit knowledge statements "Mary is a person" and "All people are human", inference rules can be applied to determine the explicit knowledge statement "Mary is a human", which is the logical result of the previous two statements. In a different example according to some embodiments of the present invention, starting with two explicit knowledge statements "Mary is Bob's friend" and "Bob is Charlie's friend", exemplary knowledge processing rules that model the meaning of friend relationships can be applied to determine the new knowledge statement "Mary is Charlie's friend". Obviously, the application of such knowledge processing rules can produce new knowledge, which is not necessarily a logical result of the explicit knowledge given in the original input KR. As described above, the knowledge representation model according to some embodiments of the present invention (including knowledge processing rules stored in association with data structures that encode concepts and concept relationships (such as those that are different from or supplement logical reasoning rules)) can model a framework for how new and potentially non-implicit knowledge can be created and/or decomposed.
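The contrast between the two kinds of rule in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triple encoding and the function names `inference_rule` and `friendship_rule` are hypothetical and do not come from the specification.

```python
# Facts are (subject, relation, object) triples; this encoding is an assumption.
facts = {
    ("Mary", "is_a", "person"),
    ("person", "is_a", "human"),
    ("Mary", "friend_of", "Bob"),
    ("Bob", "friend_of", "Charlie"),
}

def inference_rule(facts):
    """Inference rule: chain 'is_a' facts. The result is a logical
    consequence of the explicit input facts."""
    return {
        (a, "is_a", c)
        for (a, r1, b) in facts
        for (b2, r2, c) in facts
        if r1 == r2 == "is_a" and b == b2
    }

def friendship_rule(facts):
    """Hypothetical knowledge processing rule: treat a friend of a friend
    as a friend. The result is new knowledge, not a logical consequence."""
    return {
        (a, "friend_of", c)
        for (a, r1, b) in facts
        for (b2, r2, c) in facts
        if r1 == r2 == "friend_of" and b == b2 and a != c
    }

inferred = inference_rule(facts)
synthesized = friendship_rule(facts)
print(inferred)     # {('Mary', 'is_a', 'human')}
print(synthesized)  # {('Mary', 'friend_of', 'Charlie')}
```

Both functions have the same shape; what differs is the status of their output. The first only surfaces what the input already implies, while the second encodes a modeling choice about what friendship means.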
这样聚焦于知识的合成可以将系统(比如系统100)带入新应用领域中。尽管现有系统聚焦于演绎推理(即其中通过精确演绎现有事实和论证(argument)来收集见解),但是根据本发明的一些实施例的系统可以支持归纳推理以及其它类型的理论构建(即其中现有事实可以用来支持新知识的概率预测)。This focus on knowledge synthesis can bring systems such as system 100 into new application areas. While existing systems focus on deductive reasoning (i.e., where insights are gleaned by precisely deducing existing facts and arguments), systems according to some embodiments of the present invention can support inductive reasoning and other types of theory building (i.e., where existing facts can be used to support probabilistic predictions of new knowledge).
在根据本发明的一些实施例中,系统(比如系统100)可以宽松地基于基于概念语义的框架,从而对语义原语(例如“原子”或者“基本”概念)和规则(原则)进行编码,这些规则指导如何可以组合这样的原子结构以创建更复杂的知识。然而,应当领会到,根据本发明的实施例的系统可以在许多这样的框架内工作,因为本发明的方面不限于知识表示的任何特定理论、模型或者实践。在一些实施例中,系统(比如系统100)可以被设计成与对这些框架建模的广泛方法和技术(例如,实施为软件应用或者部件)对接。例如,对接分析部件(比如分析引擎150)可以将输入复杂KR 160解构成基本数据结构120。合成部件(比如合成引擎170)可以使用基本数据结构120来构造新输出复杂KR 190。In some embodiments according to the present invention, a system (such as system 100) can be loosely based on a framework of conceptual semantics, encoding semantic primitives (e.g., "atomic" or "basic" concepts) and rules (principles) that guide how such atomic structures can be combined to create more complex knowledge. It should be appreciated, however, that systems according to embodiments of the present invention can work within many such frameworks, as aspects of the present invention are not limited to any particular theory, model, or practice of knowledge representation. In some embodiments, a system (such as system 100) can be designed to interface with a wide range of methods and technologies (e.g., implemented as software applications or components) that model these frameworks. For example, an interfacing analysis component (such as analysis engine 150) can deconstruct an input complex KR 160 into basic data structures 120. A synthesis component (such as synthesis engine 170) can use the basic data structures 120 to construct a new output complex KR 190.
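The division of labor between the analysis and synthesis components can be sketched as follows. This is a minimal sketch, assuming a KR is reduced to a set of concept labels and rules are plain functions; the class `AKRM`, the two engine functions, and the toy rules `split_words` and `join_pairs` are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass, field
from itertools import permutations
from typing import Callable, List, Set

Rule = Callable[[Set[str]], Set[str]]  # a rule maps concept labels to concept labels

@dataclass
class AKRM:
    """Hypothetical container pairing basic data with the rules that act on it."""
    elemental: Set[str] = field(default_factory=set)
    analysis_rules: List[Rule] = field(default_factory=list)
    synthesis_rules: List[Rule] = field(default_factory=list)

def analysis_engine(model: AKRM, input_kr: Set[str]) -> None:
    """Deconstruct an input complex KR into basic concepts stored in the model."""
    for rule in model.analysis_rules:
        model.elemental |= rule(input_kr)

def synthesis_engine(model: AKRM) -> Set[str]:
    """Construct a new output complex KR from the stored basic concepts."""
    output_kr: Set[str] = set()
    for rule in model.synthesis_rules:
        output_kr |= rule(model.elemental)
    return output_kr

# Toy rules, standing in for real knowledge processing rules.
def split_words(concepts: Set[str]) -> Set[str]:
    return {word for label in concepts for word in label.split()}

def join_pairs(concepts: Set[str]) -> Set[str]:
    return {f"{a} {b}" for a, b in permutations(sorted(concepts), 2)}

model = AKRM(analysis_rules=[split_words], synthesis_rules=[join_pairs])
analysis_engine(model, {"mountain-dwelling animal", "domestic dog"})
output_kr = synthesis_engine(model)
print(sorted(model.elemental))
print("domestic animal" in output_kr)
```

The point of the sketch is that the stored data stays small (four basic concepts) while the output KR, which here contains combinations such as "domestic animal" that were never in the input, is produced only when the synthesis engine is invoked.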
在一些实施例中,分析引擎150可以例如通过由系统100的一个或者多个处理器执行适当计算机可读指令、通过应用知识处理规则130中的一个或者多个知识处理规则以将输入KR 160的数据结构解构成更基本构造来分析输入复杂KR 160。在一些实施例中,在AKRM 110的基本数据结构120内包括的最基本构造可以表示信息和信息关系的基础构建块的最小集合,这些信息和信息关系聚合地提供用来对输入数据结构进行分类的信息携带能力。可以从任何适当来源(包括来自与系统100交互的用户或者软件应用的直接输入)获得输入KR 160。在一些实施例中,可以通过与各种数据库技术(比如关系数据库系统或者基于图形的数据库系统)对接获得输入KR 160。应当领会到,可以按照任何适当形式以任何适当方式获得输入KR 160,因为本发明的方面就这一点而言不受限制。In some embodiments, the analysis engine 150 can analyze the input complex KR 160, for example, by executing appropriate computer-readable instructions by one or more processors of the system 100 and applying one or more knowledge processing rules 130 to deconstruct the data structure of the input KR 160 into more basic constructs. In some embodiments, the most basic constructs included within the basic data structure 120 of the AKRM 110 can represent a minimal set of fundamental building blocks of information and information relationships that, collectively, provide the information-carrying capacity used to classify the input data structure. The input KR 160 can be obtained from any suitable source, including direct input from a user or software application interacting with the system 100. In some embodiments, the input KR 160 can be obtained by interfacing with various database technologies, such as a relational database system or a graph-based database system. It should be appreciated that the input KR 160 can be obtained in any suitable form and in any suitable manner, as aspects of the present invention are not limited in this respect.
例如,图2A图示了例如使用系统100的用户或者软件应用可以向分析引擎150输入的小型复杂KR 200(在这一示例中为分类法)。复杂KR 200包括由各种分级关系链接的概念集。例如,被标注为“动物”的概念210在父-子关系中链接到被标注为“宠物”的概念220和被标注为“山居动物”的概念230。在分级的每级,概念实体表示含义单位,该含义单位可以被组合以创建更复杂语义或者可能被解构成更基本语义。例如,“山居动物”的复杂含义可以包括概念“山居”和“动物”。For example, FIG. 2A illustrates a small complex KR 200 (in this example, a taxonomy) that a user or software application using the system 100 may input into the analysis engine 150. The complex KR 200 includes a set of concepts linked by various hierarchical relationships. For example, a concept 210 labeled "animals" is linked in parent-child relationships to a concept 220 labeled "pets" and a concept 230 labeled "mountain-dwelling animals." At each level of the hierarchy, a concept entity represents a unit of meaning that can be combined to create more complex semantics or possibly deconstructed into more basic semantics. For example, the complex meaning of "mountain-dwelling animals" may include the concepts "mountain-dwelling" and "animals."
在一些实施例中,系统100可以例如通过分析引擎150解构复杂KR(比如复杂KR200)以发现基本概念,这些基本概念包括复杂KR的复杂概念。例如,图2B图示了可以从复杂KR 200的分析和解构中产生的基本数据结构300。在基本数据结构300中,已经发现了被标注为“山居动物”的复杂概念230包括被标注为“山居”的更基本概念235和被标注为“动物”的更基本概念240。在这一示例中,“山居”和“动物”表示比被标注为“山居动物”的更复杂概念更基本的概念,因为可以组合概念“山居”和“动物”以创建被标注为“山居动物”的概念。类似地,已经发现了被标注为“家养狗”的复杂概念250包括被标注为“家养”的更基本概念255和被标注为“狗”的更基本概念260,并且已经发现了被标注为“暹罗族猫”的复杂概念270包括被标注为“暹罗族”的更基本概念275和被标注为“猫”的更基本概念280。此外,每个新发现的基本概念已经从包括它的复杂概念继承了概念关系。因此,“家养”、“狗”、“暹罗族”和“猫”是“宠物”的子代;“山居”和“动物”(概念240)是“动物”(概念210)的子代;并且“山居”和“动物”(概念240)均为被标注为“狮子”的概念290和被标注为“山羊”的概念295二者的父代。In some embodiments, the system 100 can deconstruct a complex KR (such as complex KR 200) to discover basic concepts, which include the complex concepts of the complex KR, for example, through the analysis engine 150. For example, FIG2B illustrates a basic data structure 300 that can be generated from the analysis and deconstruction of the complex KR 200. In the basic data structure 300, it has been discovered that the complex concept 230 labeled "mountain-dwelling animals" includes a more basic concept 235 labeled "mountain-dwelling" and a more basic concept 240 labeled "animals". In this example, "mountain-dwelling" and "animals" represent more basic concepts than the more complex concept labeled "mountain-dwelling animals" because the concepts "mountain-dwelling" and "animals" can be combined to create the concept labeled "mountain-dwelling animals". Similarly, a complex concept 250 labeled "domestic dog" has been discovered to include a more basic concept 255 labeled "domestic" and a more basic concept 260 labeled "dog," and a complex concept 270 labeled "Siamese cat" has been discovered to include a more basic concept 275 labeled "Siamese" and a more basic concept 280 labeled "cat." Furthermore, each newly discovered basic concept has inherited concept relationships from the complex concept that includes it. 
Thus, "domestic," "dog," "Siamese," and "cat" are children of "pets"; "mountain-dwelling" and "animals" (concept 240) are children of "animals" (concept 210); and "mountain-dwelling" and "animals" (concept 240) are both parents of the concept 290 labeled "lion" and the concept 295 labeled "goat."
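The decomposition and the inheritance of relationships described above can be sketched as follows. This is a rough sketch under stated assumptions: word splitting stands in for the actual analysis rules, the function `decompose` is hypothetical, and the identifiers it generates for the new basic concepts are arbitrary rather than the reference numerals of FIG. 2B.

```python
# Input taxonomy of FIG. 2A: concept id -> label, plus (parent, child) pairs.
concepts = {
    210: "animals", 220: "pets", 230: "mountain-dwelling animals",
    250: "domestic dog", 270: "Siamese cat", 290: "lion", 295: "goat",
}
hierarchy = [(210, 220), (210, 230), (220, 250), (220, 270), (230, 290), (230, 295)]

def decompose(concepts, hierarchy):
    """Split multi-word concepts into basic concepts and let each part
    inherit the parent/child relationships of the complex concept."""
    parts = {}       # complex concept id -> ids of its basic concepts
    elemental = {}   # id -> label of every concept in the basic structure
    next_id = max(concepts) + 1
    for cid, label in concepts.items():
        words = label.split()
        if len(words) > 1:
            parts[cid] = []
            for word in words:
                elemental[next_id] = word   # same label may get a distinct id
                parts[cid].append(next_id)
                next_id += 1
        else:
            elemental[cid] = label

    def expand(cid):
        return parts.get(cid, [cid])

    relations = [
        (p, c)
        for parent, child in hierarchy
        for p in expand(parent)
        for c in expand(child)
    ]
    return elemental, relations

elemental, relations = decompose(concepts, hierarchy)
children_of_pets = {elemental[c] for p, c in relations if p == 220}
parents_of_lion = {elemental[p] for p, c in relations if c == 290}
print(children_of_pets)
print(parents_of_lion)
```

Keying concepts by identifier rather than by label is what allows the two "animals" concepts to coexist: the original concept 210 and the newly split-out one share a label but remain distinct nodes.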
注意,虽然标签“动物”归于基本数据结构300中的概念210和概念240二者,但是两个概念仍然可以表示在知识表示分级内作用不同的不同抽象含义。在一些实施例中,“标签”或者“符号”可以被接合成抽象概念以提供用于概念和关系的人类和/或机器可读术语或者标签以及提供用于各种基于符号的处理方法(比如文字分析)的基础。标签可以提供人类和/或机器可辨别的知识表示实体并且可以从源领域的唯一词汇表中推导。因此,由于可以从在领域中呈现的语言和术语抽取向每个概念元素分配的标签,标签本身可能并未完全描述它们用来命名的抽象概念和概念关系,因为在人类知识中理解那些抽象实体。Note that although the label "animal" is attributed to both concept 210 and concept 240 in the basic data structure 300, the two concepts may still represent different abstract meanings that function differently within the knowledge representation hierarchy. In some embodiments, "labels" or "symbols" can be combined into abstract concepts to provide human and/or machine readable terms or labels for concepts and relationships and to provide a basis for various symbol-based processing methods (such as text analysis). Labels can provide knowledge representation entities that are discernible to humans and/or machines and can be derived from a unique vocabulary in the source domain. Therefore, because the labels assigned to each concept element can be extracted from the language and terminology present in the domain, the labels themselves may not fully describe the abstract concepts and concept relationships that they are used to name, as those abstract entities are understood in human knowledge.
类似地,在一些实施例中,应当领会到在知识表示模型中的抽象概念与那些概念可以用来描述或者分类的对象之间的差异。对象可以是现实物理或者虚拟世界中的可以由概念描述的任何项(例如,对象的示例是文档、网页、人物等)。例如,现实世界中的个人可以由被标注为“鲍勃”的概念抽象表示。待描述、分类或者分析的领域中的信息可以涉及虚拟或者物理对象、过程和在这样的信息之间的关系。在一些示例性实施例中,可以在对驻留于网页内的内容分类时使用如这里描述的复杂KR。其它类型的领域在一些实施例中可以包括文档贮存库、用于音乐的推荐系统、软件代码贮存库、工作流程和业务过程的模型等。Similarly, in some embodiments, it should be appreciated that there is a difference between abstract concepts in a knowledge representation model and the objects that those concepts can be used to describe or classify. An object can be any item in the real physical or virtual world that can be described by a concept (e.g., examples of objects are documents, web pages, characters, etc.). For example, an individual in the real world can be represented by a conceptual abstraction labeled "Bob." The information in the domain to be described, classified, or analyzed can involve virtual or physical objects, processes, and the relationships between such information. In some exemplary embodiments, complex KRs as described herein can be used when classifying content residing within a web page. Other types of domains may, in some embodiments, include document repositories, recommendation systems for music, software code repositories, models of workflows and business processes, and the like.
在一些实施例中,待分类的领域的对象可以被称为内容节点。内容节点可以包括使用知识表示模型负责于分类、描述、分析等的任何对象。例如,内容节点可以是文件、文档、文档的组块(比如注解)、图像或者存储的字符串。内容节点可以引用物理对象或者虚拟对象。在一些实施例中,内容节点可以被包含于内容容器中,这些内容容器提供可以通过其取回内容节点的可寻址(或者可定位)信息。例如,通过URL可寻址的网页的内容容器可以包含形式为文本和图像的内容节点。内容可以与内容节点相关联以提炼一些含义(比如内容节点的描述、目的、用法或者意图)。例如,现实世界中的内容节点的方面可以由抽象知识表示中的概念描述。In some embodiments, the objects in the field to be classified can be referred to as content nodes. Content nodes can include any objects that are responsible for classification, description, analysis, etc. using a knowledge representation model. For example, a content node can be a file, a document, a block of a document (such as an annotation), an image, or a stored string. A content node can reference a physical object or a virtual object. In some embodiments, a content node can be contained in a content container that provides addressable (or locatable) information by which the content node can be retrieved. For example, the content container of a web page addressable by a URL can include content nodes in the form of text and images. Content can be associated with a content node to extract some meanings (such as the description, purpose, usage, or intention of the content node). For example, aspects of a content node in the real world can be described by concepts in an abstract knowledge representation.
在复合提炼水平方面通过概念与其它实体的关系并且在结构上在其它更基础知识表示实体(例如,关键词和词素)方面定义概念。这样的结构在这里被称为概念定义。在一些实施例中,可以通过如下两个基础类型的概念关系使概念相关:内在类型,该类型指代用于创建更复杂概念的在基本概念之间的接合(例如,在基本数据结构300中的“山居”、“动物”和“山居动物”之间的关系);以及外在类型,该类型指代在复杂关系之间的接合。外在关系可以描述在概念对之间的特征,比如等价、分级(例如,在“动物”与“宠物”之间的关系)和关联。另外,在一些实施例中,外在和内在概念关系本身也可以被描述为概念类型,并且它们可以被归类成更复杂关系。例如,关联关系“与……结婚”可以包括关系概念“结婚”和“与”。Concepts can be defined in terms of compound levels of abstraction through their relationships to other entities, and structurally in terms of other, more fundamental knowledge representation entities (e.g., keywords and morphemes). Such a structure is referred to here as a concept definition. In some embodiments, concepts can be related through two fundamental types of concept relationships: the intrinsic type, which refers to joins between basic concepts used to create more complex concepts (e.g., the relationships between "mountain-dwelling," "animals," and "mountain-dwelling animals" in the basic data structure 300); and the extrinsic type, which refers to joins between complex relationships. Extrinsic relationships can describe features between pairs of concepts, such as equivalence, hierarchy (e.g., the relationship between "animals" and "pets"), and association. In addition, in some embodiments, the extrinsic and intrinsic concept relationships themselves can also be described as types of concepts, and they can be typed into more complex relationships. For example, the associative relationship "married to" can include the relationship concepts "married" and "to."
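One way to picture a concept definition and the two relationship types is the following sketch. The classes `Concept` and `ExtrinsicRelation` and the helper `intrinsic_relations` are hypothetical and are not taken from the specification; they only illustrate that intrinsic relationships live inside a concept's definition while extrinsic relationships link pairs of concepts.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    label: str
    # Concept definition: the more basic concepts joined to form this one.
    definition: tuple[Concept, ...] = ()

@dataclass(frozen=True)
class ExtrinsicRelation:
    source: Concept
    target: Concept
    kind: str  # e.g. "equivalence", "hierarchy", "association"

def intrinsic_relations(concept: Concept) -> list[tuple[str, str]]:
    """List the intrinsic joins (part label, whole label) of a concept."""
    return [(part.label, concept.label) for part in concept.definition]

mountain = Concept("mountain-dwelling")
animals = Concept("animals")
mountain_animals = Concept("mountain-dwelling animals", (mountain, animals))
pets = Concept("pets")

# Extrinsic: a hierarchical relationship between a pair of concepts.
hierarchy = ExtrinsicRelation(animals, pets, "hierarchy")

# A relationship can itself be treated as a concept with its own definition.
married_to = Concept("married to", (Concept("married"), Concept("to")))

print(intrinsic_relations(mountain_animals))
print(intrinsic_relations(married_to))
```

Representing "married to" with the same `Concept` class reflects the statement above that relationships can themselves be described as types of concepts.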
在一些实施例中,被存储为系统100中的基本数据结构120的AKRM数据模型的总体组织可以被编码为按面数据结构,其中概念实体是在分级(外在关系)中显式相关以及在集合中接合成创建复杂概念(内在关系)。另外,如上文讨论的那样,可以使用概念对这些外在和内在关系本身归类。然而,应当领会到,可以在表示AKRM时利用任何适当类型的知识表示模型或者理论构造(包括任何适当类型的概念关系),因为本发明的方面就这一点而言不受限制。In some embodiments, the overall organization of the AKRM data model, stored as the basic data structures 120 in the system 100, can be encoded as a faceted data structure, in which conceptual entities are explicitly related in hierarchies (extrinsic relationships) and joined in sets to create complex concepts (intrinsic relationships). In addition, as discussed above, these extrinsic and intrinsic relationships can themselves be typed using concepts. However, it should be appreciated that any suitable type of knowledge representation model or theoretical construct (including any suitable type of concept relationship) can be used in representing the AKRM, as aspects of the present invention are not limited in this respect.
作为示例,图3提供根据本发明的一些实施例的可以在系统100的数据集110中运用的示例性数据规划350。这样的数据规划可以被设计成能够对复杂知识表示数据结构(复杂KR)(比如本体论和分类法)以及将复杂KR分解成的原子知识表示数据结构(例如,基本数据结构120)进行编码。在规划350中,可以接合概念以使用多对多关系来组成更复杂类型(具有类型)。以这一方式,模型中的核心概念实体可以根据由数据建模的复杂知识表示的性质来表示广泛多样化简单性或者复杂性。通过使用多对多关系将符号、规则和对象接合到这些概念,这样的规划可以管理用于对广泛知识表示建模的数据。As an example, FIG. 3 provides an exemplary data schema 350 that can be employed in the data set 110 of the system 100 according to some embodiments of the present invention. Such a data schema can be designed to encode both complex knowledge representation data structures (complex KRs), such as ontologies and taxonomies, and the atomic knowledge representation data structures (e.g., basic data structures 120) into which complex KRs are decomposed. In schema 350, concepts can be joined using many-to-many relationships to compose more complex types (has-type). In this way, the core concept entities in the model can represent a wide diversity of simplicity or complexity, depending on the nature of the complex knowledge representation being modeled by the data. By joining symbols, rules, and objects to these concepts using many-to-many relationships, such a schema can manage the data needed to model a wide range of knowledge representations.
In schema 350 as illustrated in FIG. 3, the rectangular boxes represent entity sets: for example, real-world objects that may be encoded as main objects in a database, as well as abstract concepts, human- and/or machine-readable symbols that reference concepts, and rules that apply to concepts in the knowledge representation. Each solid connector represents a relationship between two entity sets, with the relationship type indicated by a diamond. "N" denotes the participation cardinality of the relationship; here, the relationships are many-to-many, indicating that many entities of each entity set may participate in the relationship with an entity of the other entity set, and vice versa. By contrast, a relationship labeled "1" on both sides of the diamond would represent a one-to-one relationship; a relationship labeled "1" on one side and "N" on the other would represent a one-to-many relationship, in which one entity of the first type may participate in the relationship with many entities of the second type, while each entity of the second type may participate in that relationship with only one entity of the first type; and so on.
In some embodiments, the data structure of a knowledge representation may be encoded in accordance with schema 350 in one or more database tables, using any suitable database and/or other data encoding technique. For example, in some embodiments a data set for a KR data structure may be constructed as a computer-readable representation of a table in which each row represents a relationship between a pair of concepts. One example of such a table could have four attribute columns, a "concept 1" attribute, a "concept 2" attribute, a "relationship" attribute, and a "type" attribute, modeling a three-way relationship for each row of the table as: "concept 1 is related to concept 2 through a relationship concept of a type (e.g., extrinsic or intrinsic)". For example, a row of such a table with the attributes (column entries) {concept 1: "hammer"; concept 2: "nail"; relationship: "tool"; type: "extrinsic"} could represent the relationship: "hammer" is related to "nail" as a "tool", and the relationship is "extrinsic". In many exemplary data structures, each concept may appear in one or more rows of a database table, for example appearing in multiple rows to represent relationships with multiple other concepts. In addition, a particular pair of concepts may appear in more than one row, for example if that pair of concepts is related through more than one type of relationship. However, it should be appreciated that the foregoing description is by way of example only, and data structures may be implemented and/or encoded and stored in any suitable way, as aspects of the present invention are not limited in this respect.
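By way of illustration only, the four-column table described above may be sketched as follows. The Python identifiers (`Row`, `rows_for_pair`, `describe`) and every row other than the hammer/nail row are hypothetical and are not taken from the disclosure; this is a minimal sketch under those assumptions rather than a prescribed encoding.

```python
from collections import namedtuple

# The four attribute columns described above. The identifier names are
# illustrative assumptions, not part of the disclosure.
Row = namedtuple("Row", ["concept1", "concept2", "relationship", "type"])

table = [
    # The example row given in the text.
    Row("hammer", "nail", "tool", "extrinsic"),
    # Hypothetical rows: a concept may appear in several rows, and the
    # same pair may recur when related through another relationship.
    Row("hammer", "carpenter", "instrument", "extrinsic"),
    Row("hammer", "nail", "striker", "extrinsic"),
]


def rows_for_pair(rows, a, b):
    """Return every row that relates concept a to concept b."""
    return [r for r in rows if r.concept1 == a and r.concept2 == b]


def describe(row):
    """Render one row as the three-way relationship it models."""
    return (f'"{row.concept1}" is related to "{row.concept2}" as '
            f'"{row.relationship}" ({row.type})')
```

Calling `describe(table[0])` renders the hammer/nail row as a sentence, and `rows_for_pair(table, "hammer", "nail")` returns both rows in which that pair appears.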
In some embodiments, various metadata may be associated with each of the entities (e.g., concepts and concept relationships) within the AKRM to support rules-based programming. For example, since many rules would require a sorted set of concepts, a priority of concepts within concept relationships (intrinsic or extrinsic) could be added to this schema. These details are omitted here only to simplify the presentation of the data model.
Although the exemplary data schema of FIG. 3 may be relatively simple, when it is combined with processing rules for constructing and deconstructing knowledge representations, it may become capable of managing a very broad range of complex knowledge (as described in the various examples below). Benefits may include real-time knowledge engineering that improves data economy and reduces the need to build complexity into large knowledge representation data structures. Further, because the scope of the knowledge representation data structures is reduced, it may also have beneficial effects on integrated knowledge engineering processes, such as reasoning, analytics, data mining, and search.
Referring back to FIG. 1, in some embodiments knowledge processing rules 130 may be encoded and persisted in system 100, for example in data set 110, and may be joined to concepts within input KRs 160 and/or the basic data structure 120. Rules may be joined to concepts such that, given a specific concept, the rules may be applied through execution of programming code by one or more processors of system 100 to generate new semantic entities (concepts and relationships) from the basic data structure 120 and/or to deconstruct input KRs 160 into basic entities to be included in the basic data structure 120. Examples of such rules are described in more detail below.
Rules 130 may be introduced to data set 110 as input rules 140, for example by a developer of system 100 and/or by end users of system 100 in accordance with their individual knowledge processing needs or preferences. It should be appreciated that input rules 140 may be obtained from any suitable source at any suitable time, that rules 130 stored as part of the AKRM may be updated and/or changed by any suitable user at any suitable time before or during operation of system 100, and that different stored rules 130 may be maintained for different users or applications that interact with system 100, as aspects of the present invention are not limited in this respect. In addition, in some embodiments different subsets of the stored rules 130 may be applied to analysis of input KRs 160 than to synthesis of output KRs 190, while in other embodiments the same rules 130 may be applied in both analysis and synthesis operations, and different subsets of the stored rules 130 may be applied to different types of knowledge representation.
Rules 130, when applied to concepts in analysis and synthesis of KRs, may provide the constructive and deconstructive logic for a system such as system 100. Methods of how knowledge is created (synthesized) or deconstructed (analyzed) may be encoded in the set of rules 130. Rules 130 may be designed to work symmetrically (a single rule operating in both analysis and synthesis) or asymmetrically (a single rule designed to work only in synthesis or only in analysis). In some embodiments, rules 130 may not be encoded as entities within the concept data structure of a knowledge model, but rather as rules within the knowledge representation model that operate in a generative capacity upon the concept data structure. In some embodiments, rules 130 may be encoded as data in a machine-readable encoding of the AKRM that includes the rules, and stored along with the knowledge representation data structures, such as the basic data structure 120. Rules 130 may be applied using a rules engine software component, implemented for example by programming instructions encoded in one or more tangible, non-transitory computer-readable storage media included in or accessible by system 100, which instructions are executed by one or more processors of system 100 to provide the rules engine.
Given the probabilistic nature of a system such as system 100 in accordance with some embodiments of the present invention, methods may be performed for checking the semantic coherence of the knowledge representation data structures that result from application of rules 130. In some embodiments, system 100 may be programmed to gather evidence as to whether the resulting data structures are present in existing knowledge models. These existing knowledge models may be internal to the system (as complex knowledge representation data structures) or external (such as knowledge models encoded on the Semantic Web). In some embodiments, a search engine may be used to investigate whether the terms (symbols or labels) associated with the concepts of the resulting data structures are present in external knowledge representations, such as documents. Term-document frequency (e.g., the number of search engine hits) may provide one exemplary metric for the semantic coherence of the resulting knowledge representation data structures. However, it should be appreciated that any suitable metric for the semantic coherence of such data structures may be used, as aspects of the present invention are not limited in this respect.
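By way of illustration only, a term-document frequency check of the kind described above may be sketched as follows. A small in-memory list of documents stands in for an external search engine; the function name, the substring-matching rule, and the sample corpus are all assumptions made for the sketch and are not taken from the disclosure.

```python
def term_document_frequency(terms, documents):
    """Count the documents that contain every one of the given terms.

    Matching is a case-insensitive substring test, which is an assumed
    simplification of what a search engine hit count would measure.
    """
    lowered = [t.lower() for t in terms]
    return sum(all(t in doc.lower() for t in lowered) for doc in documents)


# Hypothetical stand-in for an external document collection.
corpus = [
    "The domestic dog is a common household pet.",
    "A dog can be trained to herd sheep.",
    "Domestic flights were delayed by fog.",
]
```

Under this sketch, a synthesized concept whose labels co-occur in at least one document (such as "domestic" and "dog") receives a non-zero count, while a combination that never co-occurs receives zero and might be treated as less coherent.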
Analysis engine 150 and synthesis engine 170 may use any of various methods of semantic analysis and synthesis to support the construction and deconstruction of knowledge representation data structures, as aspects of the present invention are not limited in this respect. Examples of analytical methods that may be used by analysis engine 150, together with application of rules 130, in deconstructing input complex KRs 160 include text analysis, entity and information extraction, information retrieval, data mining, classification, statistical clustering, linguistic analysis, facet analysis, natural language processing, and semantic knowledge bases (e.g., lexicons, ontologies, etc.). Examples of synthetic methods that may be used by synthesis engine 170, together with application of rules 130, in constructing complex KRs 190 include formal concept analysis, faceted classification synthesis, semantic synthesis, and dynamic taxonomies.
It should be appreciated that the exemplary methods of analysis and synthesis of complex KRs may be performed by analysis engine 150 and synthesis engine 170 operating individually and/or in conjunction with any suitable external software application that may interface with the engines and/or with system 100. Such external software applications may be implemented within the same physical device or set of devices as the other components of system 100, or parts or all of such software applications may be implemented in a distributed fashion in communication with other separate devices, as aspects of the present invention are not limited in this respect.
FIG. 4 illustrates one exemplary method 400 of semantic analysis that may be used by analysis engine 150 in deconstructing an input complex KR 160. It should be appreciated that the method illustrated in FIG. 4 is merely one example and that, as discussed above, many other methods of analysis are possible, as aspects of the present invention are not limited in this respect. Exemplary method 400 begins with extraction of a source concept 410 having a textual concept label that is explicitly presented in the source data structure. Multiple source concepts 410 may be extracted from the source data structure, along with any source concept relationships between the source concepts 410 that are explicitly present in the source data structure.
A series of keyword delineators may be identified in the concept label for source concept 410. Preliminary keyword ranges may be parsed from the concept label based on common structural textual delineators of keywords, such as parentheses, quotation marks, and commas. Whole words may then be parsed from the preliminary keyword ranges, again using common word delineators such as spaces and grammatical symbols. A check for single-word independence may then be performed to ensure that the parsed candidate keywords are valid. In some embodiments, the check for word independence may be based on a method of word-stem (or word-root) matching, hereafter referred to as "stemming". Once validated, a word may delineate a keyword if it is present in one concept label together with other words and is present in a related concept label without those other words.
Once a preliminary set of keyword labels has been generated in this way, all preliminary keyword labels may be examined in the aggregate to identify compound keywords, that is, multiple valid keyword labels present within a single concept label. In some embodiments, recursion may be used to exhaustively split the set of compound keywords into the most basic set of keywords supported by the source data. The process of candidate keyword extraction, validation, and splitting may be repeated until no more atomic keywords can be found.
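By way of illustration only, the parsing and independence check described in the two preceding paragraphs may be sketched as follows. The set of structural delineators, the function names, and the sample labels are assumptions; stemming and the recursive splitting of compound keywords are omitted, so this is a simplified sketch rather than the method of FIG. 4 itself.

```python
import re

# Assumed set of structural delineators: brackets, quotation marks,
# commas, and semicolons.
STRUCTURAL = r'[()\[\]",;]'


def preliminary_ranges(label):
    """Split a concept label into preliminary keyword ranges."""
    return [part.strip() for part in re.split(STRUCTURAL, label) if part.strip()]


def words(label):
    """Parse whole words out of the preliminary keyword ranges."""
    return [w.lower() for rng in preliminary_ranges(label) for w in rng.split()]


def extract_keywords(label, related_labels):
    """Keep a word if it appears in the label with other words and also
    appears in some related label without any of those other words."""
    own = words(label)
    keywords = []
    for word in own:
        others = set(own) - {word}
        for related in related_labels:
            related_words = set(words(related))
            if word in related_words and not (others & related_words):
                keywords.append(word)
                break
    return keywords
```

For the label "domestic dog" with hypothetical related labels "dog" and "domestic cat", both words pass the independence check; for "fire truck" with the single related label "fire truck (red)", neither word does, so the label would remain a single candidate in this sketch.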
In some embodiments, a final round of consolidation may be used to disambiguate keyword labels across the entire domain. Such disambiguation may be used to resolve ambiguities that arise when entities share the same labels. In some embodiments, disambiguation may be provided by consolidating keywords into single structural entities that share the same label. The result may be a set of keyword concepts, each contained in the source concept from which it was derived. For example, source concept 410 may be deconstructed into keywords 420, 440, and 460, parsed from its concept label, and keywords 420, 440, and 460 may make up a concept definition for source concept 410. For instance, in the example basic data structure 300 of FIG. 2B, the more basic concept 255 labeled "domestic" may be deconstructed from the more complex concept 250 labeled "domestic dog" as a keyword parsed from the concept label.
In some embodiments, concept definitions that include keyword concepts may be extended through further deconstruction to include morpheme concept entities in their structure, as a deeper and more fundamental level of abstraction. In some embodiments, morphemes may represent basic, irreducible attributes of more complex concepts and their relationships. At the morpheme level of abstraction, many of the attributes would not be recognizable to human classificationists as concepts. However, when combined into relational data structures across entire domains, morphemes may in some embodiments be able to carry the semantic meaning of the more complex concepts using less information.
In some embodiments, methods of morpheme extraction may have elements in common with the methods of keyword extraction discussed above. Patterns may be defined for use as criteria for identifying morpheme candidates. These patterns may establish the parameters for stemming, and may include patterns for whole-word as well as partial-word matching. As with keyword extraction, the sets of source concept relationships may provide the context for morpheme pattern matching. The patterns may be applied against the pool of keywords within the sets of source concept relationships in which the keywords occur. A set of shared roots based on the stemming patterns may be identified. The set of shared roots may comprise the set of candidate morpheme roots for each keyword.
In some embodiments, the candidate morpheme roots for each keyword may be compared to ensure that they are mutually consistent. It may be assumed that roots residing within the context of the same keyword, and within the sets of source concept relationships in which the keyword occurs, have overlapping roots. Further, it may be assumed that the basic roots derived from the intersection of those overlapping roots will remain within the parameters used to identify valid morphemes. Such validation may constrain excessive morpheme splitting and provide a contextually meaningful yet fundamental level of abstraction. In some embodiments, any inconsistent candidate morpheme roots may be removed from the keyword sets. The process of pattern matching to identify morpheme candidates may be repeated until all inconsistent candidates have been removed.
In some embodiments, one or more morpheme delineators may be identified for each keyword by examining the group of potential roots. Morphemes may be extracted based on the location of the delineators within each keyword label. Keyword concept definitions may then be constructed by relating (or mapping) the extracted morphemes to the keywords from which they were derived. For example, morpheme concepts 425 and 430 may be included in the concept definition for keyword concept 420, morpheme concepts 445 and 450 may be included in the concept definition for keyword concept 440, and morpheme concepts 465 and 470 may be included in the concept definition for keyword concept 460. Thus, the original source concept 410 may be deconstructed through semantic analysis to the level of keyword concepts, and further to the most basic level of morpheme concepts, for inclusion in the basic data structure of the AKRM.
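By way of illustration only, extracting morphemes at delineator positions and mapping them back to their keywords may be sketched as follows. The delineator positions are supplied by hand here, whereas the text derives them from the candidate roots; the function names and sample keywords are hypothetical.

```python
def split_at(label, positions):
    """Split a keyword label into morphemes at the given delineator
    positions (character offsets into the label)."""
    bounds = [0, *sorted(positions), len(label)]
    return [label[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


def build_definitions(keyword_delineators):
    """Map each keyword to the morphemes extracted from its label,
    giving a simple keyword concept definition."""
    return {keyword: split_at(keyword, positions)
            for keyword, positions in keyword_delineators.items()}
```

With the hypothetical input `{"unhappy": [2], "dog": []}`, the first keyword is defined by two morphemes and the second, having no delineator, by itself.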
However, it should be appreciated that any suitable level of abstraction may be employed in generating a basic data structure, and that any suitable method of analysis may be used, including methods not centered on keywords or morphemes, as aspects of the present invention are not limited in this respect. In some embodiments, a basic data structure included in an AKRM for use in analysis and/or synthesis of more complex KRs may include and encode concepts and relationships that are more basic than the concepts and relationships included in the complex KRs that are deconstructed to populate the basic data structure and/or synthesized from it. For example, the abstract meanings of complex concepts encoded in a complex KR may be formed by combinations of the abstract meanings of basic concepts encoded in the basic data structure of the AKRM.
In some embodiments, concepts stored in a basic data structure as part of a centralized AKRM may have been deconstructed from more complex concepts to the level of single whole words, such as keywords. The example of FIG. 2B illustrates such a basic data structure encoding single whole words. In some embodiments, concepts in the basic data structure may have been deconstructed to more basic levels representing portions of words. In some embodiments, concepts in the basic data structure may have been deconstructed to a more basic semantic level represented by morphemes, the smallest linguistic units that can still carry semantic meaning. For example, the whole-word concept "Siamese" may be deconstructed to create two morpheme concepts, "Siam" and "-ese", with "Siam" representing a free morpheme and "-ese" representing an affix. In some embodiments, the basic data structure of an AKRM may include only concepts at a specified level of basicness; for example, a basic data structure may in some embodiments be formed entirely of morphemes or entirely of single-word concepts. In other embodiments, a basic data structure may include concepts at various different levels of basicness (e.g., morpheme concepts, keyword concepts, and/or other concepts at other levels), with at least some of the concepts in the basic data structure being more basic than the complex concepts in the input KRs from which they were deconstructed and/or the complex concepts in the output KRs that they create in combination with other basic concepts. It should be appreciated that any suitable basis for deconstructing complex KRs into more basic data structures may be utilized, including bases tied to paradigms other than linguistics and semantics, as aspects of the present invention are not limited in this respect.
Referring back to FIG. 1, data consumer 195 may represent one or more human users of system 100 and/or one or more machine-implemented software applications interacting with system 100. In some embodiments, data consumer 195 may make requests of and/or receive output from system 100 through various forms of data. For example, a data consumer 195 may input a complex KR 160 to system 100 to be deconstructed into basic concepts and concept relationships in order to generate and/or update the basic data structure 120. A data consumer 195 (the same or a different data consumer) may also receive from system 100 an output complex KR 190 synthesized by application of one or more of the knowledge processing rules 130 to part or all of the basic data structure 120.
In some embodiments, data consumer 195 may also provide a context 180 for directing synthesis and analysis operations. For example, by inputting a particular context 180 along with a request for an output KR, data consumer 195 may direct system 100 to generate an output KR 190 with appropriate characteristics for the information required or for the current task being performed by the data consumer. For example, a particular context 180 may be input by data consumer 195 as a search term mappable to a particular concept about which data consumer 195 requires, or would like to receive, related information. Synthesis engine 170 may, for example, apply rules 130 only to those portions of the basic data structure 120 that are conceptually related (i.e., connected in the data structure) to the concept corresponding to context 180. In another example, an input context 180 may indicate a particular type of knowledge representation model, such as a taxonomy, with which data consumer 195 would like the output KR 190 to conform. Accordingly, synthesis engine 170 may apply only those rules of the set of rules 130 that are appropriate for synthesizing a taxonomy from the basic data structure 120.
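By way of illustration only, selecting the portion of a basic data structure that is connected to a context concept may be sketched as a breadth-first traversal. Treating relationships as undirected edges and following them to any depth are assumptions of this sketch, as are the function name and the sample relationships; the text itself does not prescribe how "connected" is to be computed.

```python
from collections import deque


def connected_portion(relations, context_concept):
    """Return the set of concepts reachable from context_concept,
    treating each (concept, concept) relationship as an undirected edge."""
    neighbours = {}
    for a, b in relations:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    seen = {context_concept}
    queue = deque([context_concept])
    while queue:
        current = queue.popleft()
        for nxt in neighbours.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical relationships in a basic data structure.
relations = [
    ("dog", "domestic dog"),
    ("domestic", "domestic dog"),
    ("hammer", "nail"),
]
```

A synthesis step restricted in this way would, for the context "dog", consider the "domestic dog" and "domestic" concepts but not the unrelated "hammer" and "nail" concepts.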
It should be appreciated that input context 180 may include any number of requests and/or limitations applying to the synthesis of output KR 190, and that the components of input context 180 may be of any suitable type, encoded in any suitable form of data or programming language, as aspects of the present invention are not limited in this respect. Examples of suitable input contexts include, but are not limited to, free-text queries and submissions, for example mediated by natural language processing (NLP) technology, and structural inputs such as sets of terms or tags consistent with various Web 2.0 systems. In some embodiments, generating output KR 190 in accordance with a particular context 180 may enable a more fluid and dynamic interchange of knowledge with data consumers. However, it should be appreciated that an input context 180 is not required, and that system 100 may in some embodiments produce output KRs 190 without an input context, as aspects of the present invention are not limited in this respect.
Data consumers 195 may also provide input KRs 160 of any suitable type to system 100 in any suitable form, using any suitable data encoding and/or programming language, as aspects of the present invention are not limited in this respect. Examples of suitable forms of input KRs include, but are not limited to, semi-structured or unstructured documents, again used with various forms of NLP and text analysis, and structured knowledge representations such as taxonomies, controlled vocabularies, faceted classifications, and ontologies.
In some embodiments in accordance with the present disclosure, a system for analysis and synthesis of complex KRs using an AKRM, such as system 100, may be implemented on the server side of a distributed computing system in network communication with one or more client devices, machines, and/or computers. FIG. 5 illustrates such a distributed computing environment 500, in which system 100 may operate as a server-side transformation engine for KR data structures. The transformation engine may take as input one or more source complex KR data structures 520 from one or more domains, provided by a client 510, for example through the actions of a human user or software application of client 510. In some embodiments, the input complex KR 520 may be encoded into one or more XML files 530 that may be distributed over a network, such as the Internet 550, via web services (or an API or other distribution channel) to the computing system on which system 100 is implemented. Similarly, system 100 may return requested output KRs to various clients 510 over the network as XML files 540. However, it should be appreciated that data may be communicated between server system 100 and client systems 510 in any suitable way and in any suitable form, as aspects of the present invention are not limited in this respect.
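By way of illustration only, encoding concept relationships as XML for transfer between a client and the server may be sketched with the Python standard library as follows. The element name `kr`, the element name `relationship`, and the attribute names are hypothetical; the disclosure states only that KRs may be encoded into XML files and does not specify their format.

```python
import xml.etree.ElementTree as ET


def kr_to_xml(rows):
    """Encode (concept1, concept2, relationship, type) rows as an XML string.

    Element and attribute names are assumptions made for this sketch.
    """
    root = ET.Element("kr")
    for concept1, concept2, relationship, rel_type in rows:
        ET.SubElement(root, "relationship", {
            "concept1": concept1,
            "concept2": concept2,
            "name": relationship,
            "type": rel_type,
        })
    return ET.tostring(root, encoding="unicode")


def xml_to_kr(text):
    """Decode the XML string produced by kr_to_xml back into rows."""
    return [
        (e.get("concept1"), e.get("concept2"), e.get("name"), e.get("type"))
        for e in ET.fromstring(text).iter("relationship")
    ]
```

A client could serialize its source KR with `kr_to_xml` before submitting it, and the server could recover the rows with `xml_to_kr`; the same pair of functions could serve for returning an output KR.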
Through this and/or other modes of distribution and decentralization, in some embodiments a wide range of developers and/or publishers may use analysis engine 150 and synthesis engine 170 to deconstruct and create complex KR data structures. Exemplary applications include, but are not limited to, web sites, knowledge bases, e-commerce stores, search services, client software, management information systems, analytics, and the like.
In some embodiments, an advantage of such a distributed system may be a clear separation between private domain data and the shared data used by the system to process domains. Such data separation may facilitate hosted processing models, such as a software-as-a-service (SaaS) model, whereby a third party may offer transformation engine services to domain owners. A domain owner's domain-specific data may be hosted securely by the SaaS provider, as it is separable from the shared data (e.g., AKRM data set 110) and from the private data of other domain owners. Alternatively, the domain-specific data may be hosted by the domain owners, physically removed from the shared data. In some embodiments, domain owners may build on the shared knowledge (e.g., the AKRM) of an entire community of users without compromising their own unique knowledge.
As should be appreciated from the foregoing discussion, some embodiments in accordance with the present disclosure are directed to techniques for analyzing an original complex knowledge representation to deconstruct the complex KR and generate or update a basic data structure of an atomic knowledge representation model. FIG. 6 illustrates one such technique as exemplary process 600. Process 600 begins at act 610, at which an input complex KR may be received, for example from a data consumer by an analysis/synthesis system such as system 100.
在动作620,可以应用作为AKRM的部分而在系统100中编码的一个或者多个知识处理规则以将输入复杂KR解构成一个或者多个基本概念和/或一个或者多个基本概念关系。下文提供可适用于各种类型的输入KR的知识处理规则的示例。然而,应当领会到,本发明的方面不限于知识处理规则的任何特定示例,并且可以利用与原子知识表示模型相关联地编码的任何适当规则。如上文讨论的那样,这样的规则可以在任何适当时间由分析系统的开发者和/或由分析系统的一个或者多个终端用户提供。In action 620, one or more knowledge processing rules encoded in the system 100 as part of the AKRM can be applied to deconstruct the input complex KR into one or more basic concepts and/or one or more basic concept relationships. Examples of knowledge processing rules applicable to various types of input KRs are provided below. However, it should be appreciated that aspects of the present invention are not limited to any particular example of knowledge processing rules and any appropriate rules encoded in association with the atomic knowledge representation model can be utilized. As discussed above, such rules can be provided at any appropriate time by the developer of the analysis system and/or by one or more end users of the analysis system.
在动作630,可以在作为系统的AKRM的部分而编码和存储的基本数据结构中包括在动作620中发现和/或推导的基本概念和/或基本概念关系中的一个或者多个基本概念和/或基本概念关系。在一些实施例中,从单个输入复杂KR推导的基本概念和关系中的一些或者所有基本概念和关系可以用来填充AKRM的新基本数据结构。在一些实施例中,当已经填充了存储的基本数据结构时,可以在存储的基本数据结构中包括从后续输入KR发现的新基本概念和/或关系以更新和/或扩展集中式AKRM。在一些实施例中,过程600可以继续循环回到开始以在新输入KR变得可用时进一步更新存储的基本数据结构和/或生成新基本数据结构。在其它实施例中,过程600可以在一次通过或者另一预定次数通过过程之后、在存储的基本数据结构已经达到预定大小或者复杂性之后或者在满足任何其它适当停止标准之后结束。In act 630, one or more of the basic concepts and/or basic concept relationships discovered and/or derived in act 620 may be included in a basic data structure encoded and stored as part of the system's AKRM. In some embodiments, some or all of the basic concepts and relationships derived from a single input complex KR may be used to populate a new basic data structure of the AKRM. In some embodiments, once the stored basic data structure has been populated, new basic concepts and/or relationships discovered from subsequent input KRs may be included in the stored basic data structure to update and/or expand the centralized AKRM. In some embodiments, process 600 may continue to loop back to the beginning to further update the stored basic data structure and/or generate new basic data structures as new input KRs become available. In other embodiments, process 600 may terminate after a single pass or another predetermined number of passes through the process, after the stored basic data structure has reached a predetermined size or complexity, or after any other suitable stopping criteria are met.
As should be appreciated from the foregoing discussion, further embodiments in accordance with the present disclosure are directed to techniques for generating (synthesizing) complex knowledge representations using an atomic knowledge representation model. FIG. 7 illustrates one such technique as exemplary process 700. Process 700 begins at act 710, at which an input context may be received, for example from a data consumer such as a human user or a software application. As discussed above, such a context may include a textual query or request, one or more search terms, an identification of one or more seed concepts, and the like. In addition, the context may indicate a request for a particular form of complex KR. In some embodiments, however, a request for a complex KR may be received without further context to limit the concepts and/or concept relationships to be included in the complex KR, as aspects of the present invention are not limited in this respect. Furthermore, in some embodiments, receipt of a context may be interpreted as a request for a complex KR, without an explicit request accompanying the context.
At act 720, in response to the input request and/or context, one or more appropriate knowledge processing rules encoded in the AKRM may be applied to the basic data structure of the AKRM to synthesize one or more additional concepts and/or concept relationships that are not explicitly encoded in the basic data structure. Examples of knowledge processing rules applicable to synthesizing various types of output KRs are provided below. As discussed above, in some embodiments rules may be applied bidirectionally, so that the same knowledge processing rules accomplish both the analysis and the synthesis of complex KRs, while in other embodiments one set of rules may be applied to analysis and a different set of rules to synthesis. It should be appreciated, however, that aspects of the present invention are not limited to any particular examples of knowledge processing rules, and any suitable rules encoded in association with the atomic knowledge representation model may be used. As discussed above, such rules may be provided at any suitable time by a developer of the analysis system and/or by one or more end users of the analysis system.
In some embodiments, appropriate rules may be applied to appropriate portions of the basic data structure in accordance with the received input request and/or context. For example, if the input request specifies a particular type of complex KR to be output, in some embodiments only those rules encoded in the AKRM that apply to synthesizing that type of complex KR may be applied to the basic data structure. In some embodiments, if no particular type of complex KR is specified, a default type of complex KR, such as a taxonomy, may be synthesized, or a random type of complex KR may be selected, and so on. As another example, if the input context specifies one or more particular seed concepts of interest, only those portions of the basic data structure related to those seed concepts (i.e., connected to them through concept relationships) may be selected, and the rules applied to them to synthesize the new complex KR. In some embodiments, a developer or end user of the synthesis system, for example, may set some predetermined limit on the size and/or complexity of the output complex KR, e.g., conditioned on the number of concepts included, the hierarchical distance between the seed concepts and the selected related concepts in the basic data structure, the encoded data size of the resulting output complex KR, processing requirements, and so on.
At act 730, a new complex KR may be synthesized from the additional concepts and relationships synthesized at act 720 and the selected appropriate portions of the basic data structure, and encoded in accordance with any specified type of KR indicated in the received input. At act 740, the resulting synthesized complex KR may be provided to the data consumer from which the request was received. As discussed above, this may be, for example, a software application or a human user who may view and/or use the provided complex KR through a software user interface. Process 700 may then end with the provision of the newly synthesized complex KR, which encodes new knowledge.
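By way of illustration only, the selection of the portions of the basic data structure related to the seed concepts, bounded by a limit on hierarchical distance, might be sketched in Python as follows. The function name select_related, the treatment of concept relationships as undirected pairs, and the sample concepts are assumptions made for this sketch and do not come from the disclosure.

```python
from collections import deque

# Hypothetical sample relationships, used only to exercise the sketch.
RELATIONS = [
    ("dog", "animal"),
    ("cat", "animal"),
    ("animal", "organism"),
    ("organism", "thing"),
]


def select_related(relations, seeds, max_distance):
    """Return the concepts reachable from any seed within max_distance hops.

    relations: iterable of (concept_a, concept_b) pairs, treated as undirected.
    """
    neighbours = {}
    for a, b in relations:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    distance = {seed: 0 for seed in seeds}
    queue = deque(distance)
    while queue:
        current = queue.popleft()
        if distance[current] == max_distance:
            continue  # the size limit has been reached on this branch
        for nxt in neighbours.get(current, ()):
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return set(distance)
```

With the sample relationships, a limit of 1 around the seed "dog" selects only "dog" and "animal", while a limit of 2 also reaches "cat" and "organism".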
The following pseudo-code may serve as a further illustration of the methods described above.
KnowledgeCreation(KR_in, RULES_in, CONTEXT, ANALYSIS, SYNTHESIS)
Input:
- CONTEXT: user/application context (e.g., requests, seed concepts, domain restrictions)
- KR_in: knowledge representation (e.g., a taxonomy)
- RULES: relevant knowledge processing rules
- ANALYSIS: a flag for enabling analysis events
- SYNTHESIS: a flag for enabling synthesis events
Output:
- Concepts and relationships to be stored in the AKRM
- Complex KR_out to present to the user/application
Procedure:
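The steps of the procedure are not reproduced in this text. By way of illustration only, one possible arrangement of the listed inputs and outputs is sketched below in Python. The sketch assumes separate lists of analysis and synthesis rules (a single bidirectional rule set is equally contemplated above), relationships encoded as (source, label, target) triples, and a toy rule named split_words; all of these names are hypothetical.

```python
from typing import Callable, Set, Tuple

Concept = str
Relation = Tuple[str, str, str]  # (source, label, target)
Rule = Callable[[Set[Concept], Set[Relation]], Tuple[Set[Concept], Set[Relation]]]


def apply_until_stable(rules, concepts, relations):
    """Apply every rule repeatedly until a full pass derives nothing new."""
    rules = list(rules)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new_c, new_r = rule(concepts, relations)
            if not (new_c <= concepts and new_r <= relations):
                concepts |= new_c      # in-place update of the caller's set
                relations |= new_r
                changed = True


def knowledge_creation(kr_in, analysis_rules, synthesis_rules, context,
                       analysis=True, synthesis=True):
    """Return (akrm_concepts, akrm_relations, kr_out)."""
    akrm_c, akrm_r = set(kr_in[0]), set(kr_in[1])
    if analysis:
        apply_until_stable(analysis_rules, akrm_c, akrm_r)
    kr_out = (set(), set())
    if synthesis:
        out_c, out_r = set(akrm_c), set(akrm_r)
        apply_until_stable(synthesis_rules, out_c, out_r)
        # Assumption: present only relationships touching a context concept.
        out_r = {r for r in out_r if r[0] in context or r[2] in context}
        out_c = {c for r in out_r for c in (r[0], r[2])} | (context & out_c)
        kr_out = (out_c, out_r)
    return akrm_c, akrm_r, kr_out


def split_words(concepts, relations):
    """Toy analysis rule: split a multi-word label into one-word concepts."""
    new_c, new_r = set(), set()
    for concept in concepts:
        words = concept.split()
        if len(words) > 1:
            new_c |= set(words)
            new_r |= {(concept, "comprises", word) for word in words}
    return new_c, new_r
```

The fixed-point loop corresponds to continuing analysis or synthesis for as long as some rule still yields something new; any other stopping criterion could be substituted.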
As should be appreciated from the foregoing discussion, some embodiments in accordance with the present disclosure are directed to techniques for supporting semantic interoperability between knowledge representations using an atomic knowledge representation model. As discussed above, maintaining a shared, centralized AKRM with a stored basic data structure may, in some embodiments, allow multiple different input complex KRs (in some cases of different types or knowledge representation models) to be deconstructed into basic concepts and/or concept relationships used to generate and/or update a single shared basic data structure that is semantically compatible with all types of complex KRs. In addition, through deconstruction into a basic data structure and subsequent synthesis into a new complex KR, an input KR of one type may, in some embodiments, be transformed into an output KR of a different type based on the same source data.
The following pseudo-code may serve as a further illustration of a method for integrating multiple different KRs under the AKRM described herein, to provide the benefits of semantic interoperability.
Input:
- KR_1, KR_2, ..., KR_n: /* n possibly different KRs */
- RULES_1, RULES_2, ..., RULES_n: /* relevant knowledge processing rules */
- User/application context
Output:
- Concepts and relationships to be stored in the AKRM
- Complex KR to present to the user/application
Procedure:
FIG. 8 provides a table illustrating six exemplary knowledge processing rules that may be used, in some embodiments in accordance with the present disclosure, in the analysis and/or synthesis of five exemplary types of complex knowledge representations (i.e., taxonomies, synonym rings, thesauri, faceted classifications and ontologies). As discussed above, however, it should be appreciated that these examples are provided merely for purposes of illustration, and aspects of the present invention are not limited to any particular set of rules or KR types or models. In addition, in some embodiments, a developer of the system, for example, may seed the analysis/synthesis system with an initial set of knowledge processing rules, which may be extended with additional rules and/or updated at a later time with changed and/or deleted rules, for example by end users of the system. Different sets of rules applicable to different types of KRs may also be stored for different end users or applications, for example in user accounts. Furthermore, in some embodiments, knowledge processing rules may be reused and combined in novel ways to address the requirements of specific KRs.
The exemplary rules presented in FIG. 8 are discussed below with reference to specific examples involving the exemplary KR types provided in the figure. It should be appreciated that any of the generalized methods described above may be applied to any of the following examples, with differing inputs, outputs and knowledge processing rules. It should also be appreciated that, although many different aspects of a theory of knowledge creation may be modeled by the exemplary rules discussed here, various other types of rules are possible. The examples that follow are driven largely by the topology of the knowledge representation data structures. Other bases for rules may include semantic morphology and syntax, phonology, metaphor, symbolism and sensory perception, among others.
In some embodiments, encoding a set of knowledge processing rules, such as the exemplary rules given in FIG. 8, within an atomic knowledge representation model may allow any complex KR within a set of supported KR types, such as those represented in FIG. 8, to be analyzed and/or synthesized. In the example of FIG. 8, "X" marks show which rules of the exemplary set of six apply to which KR types of the exemplary set of five. In these examples, each rule may be applied bidirectionally, in the analysis or the synthesis of complex KRs of the types to which it applies. For instance, given an input thesaurus KR, FIG. 8 shows that Rules 1, 2, 3 and 4 may be applied to the input thesaurus to deconstruct it into basic concepts and concept relationships to be included in the basic data structure. In another example, applying Rules 1, 2 and 3 to the basic data structure results in an output synonym ring KR. The use of each of these exemplary rules to perform analysis and/or synthesis of the appropriate complex KRs is described below with reference to examples.
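By way of illustration only, the correspondence between KR types and rules might be held in a simple lookup table, sketched here in Python. The entries are compiled from the statements made in the prose of this section rather than read from the FIG. 8 table itself, and the names RULES_BY_KR_TYPE, rules_for and kr_types_using are hypothetical.

```python
# Rule numbers per KR type, as stated in the prose of this section.
# Assumption: the table in FIG. 8 agrees with these statements.
RULES_BY_KR_TYPE = {
    "taxonomy": {1, 2},
    "synonym ring": {1, 2, 3},
    "thesaurus": {1, 2, 3, 4},
    "faceted classification": {1, 2, 4, 5, 6},
    "ontology": {1, 2, 4, 5, 6},
}


def rules_for(kr_type):
    """Return the sorted rule numbers applicable to the given KR type."""
    return sorted(RULES_BY_KR_TYPE[kr_type])


def kr_types_using(rule_number):
    """Return the sorted KR types to which the given rule applies."""
    return sorted(
        kr_type
        for kr_type, rules in RULES_BY_KR_TYPE.items()
        if rule_number in rules
    )
```

Because the rules are bidirectional in these examples, the same lookup could serve both when deconstructing an input KR of a given type and when synthesizing an output KR of that type.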
Taxonomy Rules
The following input/output and knowledge processing rules provide the characteristics of a taxonomy as a hierarchical classification of concepts.
Input/Output:
A set of concepts C
A set of hierarchical relationships (acyclic)
R = {r(c_i, c_j) : c_i, c_j ∈ C and c_i is-a c_j}
Definition 1 (Coherent Concepts): Two concepts c_i, c_j are considered coherent if M(c_i, c_j) < T according to some distance metric M, where T is a pre-chosen threshold. Possible metrics include the frequency of co-occurrence of the two concepts in an input corpus, or a tree distance function applied to the taxonomy hierarchy.
Rule 1 (Coherent Concept Synthesis): Create a new concept c = {c_i, c_j}. c is considered to comprise c_i and c_j if and only if c_i and c_j are coherent with respect to Definition 1.
Rule 2 (Hierarchical Concept Synthesis): Let c_1 = {c_11, c_12, ..., c_1n} be a concept comprising n concepts c_11 to c_1n. Similarly, let c_2 = {c_21, c_22, ..., c_2m} be a concept comprising m concepts c_21 to c_2m. Create a new hierarchical relationship r(c_1, c_2) if and only if, for each c_1i, there exists a relationship r(c_1i, c_2j) for some concept c_2j.
Note that the if-and-only-if part of each exemplary rule (e.g., Rule 1 and Rule 2) reflects the bidirectional analysis/synthesis nature of the rule. For example, analysis enforces the "if" part (forcing an explicit hierarchical relationship to be presented in the AKRM so as to satisfy the condition). Synthesis, on the other hand, discovers the "only if" part (discovering hierarchical relationships where the condition applies).
An example of applying these exemplary rules to analyze and deconstruct an input taxonomy 200 into a more basic data structure 300 has been given in FIGS. 2A and 2B. In that example, complex concepts 230, 250 and 270 are deconstructed to generate new, more basic concepts 235, 240, 255, 260, 275 and 280 by applying Rule 1, and their relationships by applying Rule 2. In addition, new complex concepts may be synthesized by applying Rule 1, using (for example) an external corpus as evidence: {domestic, lion}, {mountain, dog}, {mountain, cat}, {domestic, goat}, {domestic, pet}, {domestic, cat}. Applying Rule 2 in synthesis may generate new concept relationships; for example, because hierarchical relationships exist between "animal" and "dog" and between "animal" and "mountain", a new hierarchical relationship between "animal" and "mountain dog" may be synthesized.
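By way of illustration only, Rules 1 and 2 might be sketched in Python as follows. Complex concepts are modeled as sets of component concepts and the is-a relationships as (child, parent) pairs. The co-occurrence counts, the particular distance metric built from them, and all function names are assumptions made for this sketch; the rules do not prescribe a specific metric.

```python
from itertools import combinations


def coherent_concepts(concepts, distance, threshold):
    """Rule 1 sketch: pair up concepts whose distance M is below threshold T."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(concepts), 2)
        if distance(a, b) < threshold
    }


def hierarchical_relation(c1, c2, is_a):
    """Rule 2 sketch: r(c1, c2) holds iff every component of c1 is-a some
    component of c2. is_a is a set of (child, parent) pairs."""
    return all(any((x, y) in is_a for y in c2) for x in c1)


# Hypothetical co-occurrence counts standing in for an external corpus.
COOCCURRENCE = {
    frozenset(("domestic", "cat")): 9,
    frozenset(("mountain", "dog")): 4,
}


def cooccurrence_distance(a, b):
    """Toy metric: frequent co-occurrence gives a small distance."""
    return 1.0 / (1 + COOCCURRENCE.get(frozenset((a, b)), 0))


# Hierarchical relationships mirroring the "animal" example in the text.
IS_A = {("dog", "animal"), ("mountain", "animal")}
```

With these sample values and a threshold of 0.5, Rule 1 yields {domestic, cat} and {mountain, dog}, and Rule 2 confirms a hierarchical relationship between {mountain, dog} and {animal}, since each component of the former is-a "animal".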
Synonym Ring Rules
The following input/output and knowledge processing rules provide the characteristics of a synonym ring, as defined by the proximity of meaning across terms or concepts or, in logic, by the inner substitutability of terms that preserves truth value.
Input/Output:
A set of concepts C (possibly with "comprised of" relationships)
A list of synonyms: Synonym(c_i, c_j)
Definition 2 (Semantic Similarity): Let c_1 = {c_11, c_12, ..., c_1n} be a concept comprising n concepts c_11 to c_1n. Similarly, let c_2 = {c_21, c_22, ..., c_2m}. A similarity function S, S(c_1, c_2), describes the semantic similarity between the two concepts. An example function is as follows:
[Equation not reproduced; the surviving labels indicate a piecewise definition of the form "if Synonym ... otherwise".]
Definition 3 (Concept Intersection): Let c_1 = {c_11, c_12, ..., c_1n} be a concept comprising n concepts c_11 to c_1n. Similarly, let c_2 = {c_21, c_22, ..., c_2m}.
[Equation not reproduced; the surviving label indicates a conditional ("if") definition.]
Rule 3 (Synonym Concept Synthesis): Let c_1 = {c_11, c_12, ..., c_1n} and c_2 = {c_21, c_22, ..., c_2m} be two synonym concepts according to Definition 2. A concept c_3 = c_1 ∩ c_2 and the hierarchical relationships r(c_1, c_3) and r(c_2, c_3) exist if and only if S(c_1, c_2) > T_synonym, where T_synonym is the threshold of semantic similarity that warrants the assertion "synonym":
Synonym ::= c_3 = c_1 ∩ c_2 ≠ ∅ ∧ r(c_1, c_3) ∧ r(c_2, c_3)
S(c_1, c_2) > T_synonym
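By way of illustration only, the synonym concept synthesis rule might be sketched in Python as follows. Because the example similarity function and the concept intersection are not reproduced in this text, both are passed in as parameters; the Jaccard index and the plain set intersection used below are stand-ins assumed for this sketch, and all names are hypothetical.

```python
def synthesize_synonym(c1, c2, similarity, intersect, threshold):
    """Return (c3, relations) when the rule fires, otherwise None.

    relations holds (child, parent) pairs for r(c1, c3) and r(c2, c3).
    """
    c1, c2 = frozenset(c1), frozenset(c2)
    if similarity(c1, c2) <= threshold:
        return None  # similarity does not exceed T_synonym
    c3 = frozenset(intersect(c1, c2))
    if not c3:
        return None  # the rule requires a non-empty intersection
    return c3, {(c1, c3), (c2, c3)}


def jaccard(c1, c2):
    """Stand-in similarity: shared components over all components."""
    union = set(c1) | set(c2)
    return len(set(c1) & set(c2)) / len(union) if union else 0.0


def plain_intersection(c1, c2):
    """Stand-in concept intersection: ordinary set intersection."""
    return set(c1) & set(c2)
```

For instance, {domestic, animal} and {household, animal} have a stand-in similarity of 1/3; with a threshold of 0.3 the sketch yields c_3 = {animal} as the common parent, whereas with a threshold of 0.5 no synonym relationship is asserted.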
An example of a synonym ring is as follows:
Pet: Domestic Animal: Household Beast: Cat
Analysis according to Rule 3 may derive hierarchical relationships through which all four concepts are children of "Household Animal". Analysis according to Rule 1 may derive the following new concepts:
House, Domestic, Household, Animal, Beast, Mammal
Analysis according to Rule 2 may discover hierarchies in which "Domestic" and "Household" are children of "House", and "Pet", "Mammal", "Beast" and "Cat" are children of "Animal". These hierarchical relationships may be created based on the relationships between the complex concepts from which the simpler concepts were extracted. Accordingly, the following new synonym rings may be synthesized by applying Rule 3:
Cat: Pet: Mammal: Beast
Domestic: Household
Thesaurus Rules
The following input/output and knowledge processing rules provide the characteristics of a thesaurus, including the characteristics of the KRs described above as well as associative relationships (related terms).
Input/Output:
A set of concepts C (possibly with "comprised of" relationships)
A list of associative relationships, e.g., Synonym(c_i, c_j), RelatedTerm(c_i, c_j)
A set of hierarchical relationships (acyclic) R = {r(c_i, c_j) : c_i, c_j ∈ C and c_i NT c_j}
Rule 1 (Coherent Concept Synthesis) applies to thesauri.
Rule 2 (Hierarchical Concept Synthesis) applies to thesauri.
Rule 4 (Associative Relationship Synthesis): Let c_1 = {c_11, c_12, ..., c_1n} and c_2 = {c_21, c_22, ..., c_2m} be two related concepts according to some associative relationship AR. A concept c_3 = c_1 ∩ c_2, a concept c_4 = {AR} and the three hierarchical relationships r(c_1, c_3), r(c_2, c_3) and r(c_4, c_3) exist if and only if S(c_1, c_2) > T_AR, where T_AR is the threshold of semantic similarity that warrants the assertion of an "AR" relationship between the two concepts:
[Formula for the associative relationship not reproduced.]
Note that T_AR may be set to zero if no semantic similarity is required and association via c_3 is sufficient to capture the relationship.
An example thesaurus may include the associative relationship: {cat, diet} is associated with {fish, food}. Analysis according to Rule 1 may derive the following new concepts:
cat, diet, fish, food
Given the appropriate patterns in the hierarchical relationships presented, new associative relationships may be synthesized by applying Rule 4, e.g., "cat" is associated with "fish" and "diet" is associated with "food". Again, the associative relationships may be created based on the relationships between the complex concepts from which the simpler concepts were extracted.
Faceted Classification Rules
The following input/output and knowledge processing rules provide the characteristics of a faceted classification, including facets and facet attributes as concepts, and facets as categories of concepts organized in class hierarchies. In addition, the following examples add the characteristics of mutually exclusive facet hierarchies (facet attributes constrained to strict/mono hierarchies with single inheritance) and the assignment of facet attributes to the objects (or nodes) to be classified as sets of concepts. Furthermore, facets are identified topologically as the root nodes in the facet hierarchies.
Input/Output:
Facet hierarchies (a hierarchy of value nodes for each root facet)
Labeled terms/concepts with respect to the facet values
Definition 4 (Mutually Exclusive Facet Hierarchies): Any concept can be classified by picking one and only one node label/value/attribute from each facet hierarchy. That is, the semantics of the concepts representing the nodes in any facet hierarchy do not overlap.
Rules 1, 2 and 4 apply to faceted classifications.
Rule 5 (Facet Attribute Assignment): Each node/value/attribute in a facet hierarchy corresponds to a concept c. A relationship r(c_i, c_j) exists if and only if c_i appears as a child of only one parent c_j in some facet hierarchy, and c_1 ∩ c_2 = {} for any two concepts c_1, c_2 in a facet hierarchy.
Rule 6 (Labeled Concept Assignment): Each labeled term in the faceted classification corresponds to a concept c_i = {c_i1, c_i2, ..., c_in}, where each c_ij is a label concept according to Rule 5.
An example input faceted classification is as follows:
Objects with facet attribute/node/value assignments
"Domestic dog" {North America, Domesticated, Dog}
"Mountain lion" {Americas, Wild, Cat, Mountain}
"Siamese cat" {World, Domesticated, Cat}
"Lion" {Africa, Wild, Lion, Savanna}
As shown in the examples above, analysis according to Rules 2 and 5 may be used to decompose the input faceted classification into a broader facet hierarchy (e.g., using facet analysis or statistical clustering methods).
Facet: "Pets" /* synthesized label */
- "Common pets" /* derived from the cluster {domesticated, animal} */
- "Rare pets" /* derived from the cluster {wild, animal} */
Since both "dog" and "cat" are "animals" (derived from the facet hierarchy "animal"), the new concept "domesticated, animal" may be found coherent, as is evident in sets such as "domesticated, dog" and "domesticated, cat".
Similarly, new objects with facet attribute/node/value assignments may be created according to Rules 1 and 6. For example, using the rules for concept synthesis described above, new concepts such as "lion pet" {man-made, lion, domesticated} may also be synthesized. Although this may not exist in real life, it can be justified as possible new knowledge given the evidence in the input KR, and evaluated later, for example through user interaction with the data.
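By way of illustration only, the single-parent condition of Rule 5 and the labeled concept assignment of Rule 6 might be sketched in Python as follows. The facet names in FACET_OF ("Region", "Domestication", "Animal") are hypothetical, since the example above lists attribute values without naming their facets; the function names are likewise assumptions.

```python
# Hypothetical mapping from attribute value to the facet it belongs to.
FACET_OF = {
    "North America": "Region",
    "Africa": "Region",
    "Domesticated": "Domestication",
    "Wild": "Domestication",
    "Dog": "Animal",
    "Cat": "Animal",
    "Lion": "Animal",
}


def single_parent_relations(edges):
    """Rule 5 sketch: keep (child, parent) only for children that appear
    under exactly one parent in the facet hierarchy."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)
    return {
        (child, next(iter(ps)))
        for child, ps in parents.items()
        if len(ps) == 1
    }


def assign_label(label, attributes, facet_of):
    """Rule 6 sketch: map a labeled term to its set of attribute concepts,
    rejecting two values drawn from the same facet (cf. Definition 4)."""
    chosen = {}
    for attribute in attributes:
        facet = facet_of[attribute]
        if facet in chosen:
            raise ValueError(
                f"{label!r}: {chosen[facet]!r} and {attribute!r} "
                f"both come from facet {facet!r}"
            )
        chosen[facet] = attribute
    return label, frozenset(attributes)
```

Under these assumptions, "Domestic dog" is accepted with the values {North America, Domesticated, Dog}, whereas a term labeled with both "Dog" and "Cat" would be rejected because both values fall under the same hypothetical facet.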
Ontology Rules
Rules 1, 2, 4, 5 and 6 apply to provide the characteristics of an ontology, including facets and facet attributes as concepts, and facets as categories of concepts organized in class hierarchies.
Consider the example complex relationship Cohabitate (COH):
Wild cat ← COH → Lion
Domestic dog ← COH → Domestic cat
Analyzing the COH relationships may break them down into more atomic relationships and concepts. The following atomic constructs are possibilities:
Wild cat, Lion, Domestic dog, Domestic cat, Cohabitate
The rules for knowledge creation described above may be applied in complex ways to represent richer relationships, e.g., c_1 Relation c_2, where Relation is a general associative relationship. For complex relationships that are associative (bidirectional), the property of intersection of meaning between the concepts paired in the relationship may be exploited. For relationships that are hierarchical (unidirectional), the property of subsumption of meaning between the paired concepts may be exploited. Labels derived for synthesized complex relationships may conform to a conventional presentation, e.g., "C1 and C2 are related because they have C3 in common".
Applying Rule 1 (Coherent Concept Synthesis) and Rule 4 (Associative Relationship Synthesis) may yield the following more atomic concepts:
Wild, Cat, Dog, Domestic, Habitat, Wild habitat, Domestic habitat, "Wild habitat" is-a Habitat, "Domestic habitat" is-a Habitat
If coherence is found, synthesis may construct the following concepts and relationships:
"Wild dog" comprises {Wild, Dog, Wild habitat}
Accordingly, the following higher-order relationships may be deduced:
Wild dog ← COH → Lion
Wild dog ← COH → Wild cat
Thus, both "Wild dog" and its relationships with "Lion" and "Wild cat" are newly synthesized constructs.
Free Text (Natural Language) Example
The following is an example of natural language text that may be transformed into a structured semantic representation using approaches such as natural language processing, entity extraction and statistical clustering. Once transformed, the exemplary rules described above may be applied to process the data.
The cat (Felis silvestris catus), also known as the domestic cat or house cat to distinguish it from other felines and felids, is a small carnivorous mammal that is valued by humans for its companionship and its ability to hunt vermin and household pests. Cats have been associated with humans for at least 9,500 years and are currently the most popular pet in the world. Due to their close association with humans, cats are now found almost everywhere on Earth.
A structured knowledge representation as shown in FIG. 9 may be derived from this natural language text. This knowledge representation may be processed using the rules described under each exemplary knowledge representation type, as follows:
Taxonomy: C1 is-a C5 (hierarchy)
Synonym ring: C1: C2: C3
Thesaurus: C1 is associated with C7
Ontology: C1 hunts C6; C1 is found on C7
Applying synthesis to this example, additional structured data may be derived. For example, applying Rule 1 (Coherent Concept Synthesis), additional concepts may be derived:
C8: Domestic
C9: House
New relationships may then be synthesized, for example by applying Rule 3 (Synonym Concept Synthesis):
C8::C9 ("Domestic" is a synonym of "House")
Semantic Interoperability Example
The following example illustrates semantic interoperability, in which an input in one KR is transformed into a different KR as output. The exemplary processing described below may be implemented, for example, according to the general data flow of the pseudocode presented above for semantic interoperability processing.
Input (the input KR is a thesaurus; :: denotes synonym-of; |- denotes narrower.)
FIG. 10 illustrates the elemental data structure that may be derived by analyzing the input KR above. In the figure, solid arrows denote "is-a" relationships and dashed arrows denote "includes" relationships.
Output (the output KR is a facet hierarchy for the concept "red-headed woodpecker".)
Labeling
"Red-headed woodpecker" is-a {bird: woodpecker, coloration: red, eponymous anatomy: head}
Note that in the above example, the atomic semantics in the AKRM representation can be used to explore the intersection of meaning across each KR (semantic interoperability). For example, the atomic concepts "crown" and "head" can provide a connection of meaning across the formally disjoint concepts "sparrow" and "woodpecker".
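The labeling and the intersection of meaning described above can be sketched as follows. The input thesaurus is not reproduced in this section, so everything in the tables below is an assumption made for illustration: the concept "golden-crowned sparrow", the decomposition of each complex concept into atomic concepts, the facet assigned to each atom, and the synonym ring joining "crown" and "head".

```python
# ASSUMPTION: hypothetical decomposition of complex concepts into atoms.
ATOMS = {
    "red-headed woodpecker": {"woodpecker", "red", "head"},
    "golden-crowned sparrow": {"sparrow", "golden", "crown"},
}

# ASSUMPTION: hypothetical facet for each atomic concept.
FACETS = {
    "woodpecker": "bird", "sparrow": "bird",
    "red": "coloration", "golden": "coloration",
    "head": "eponymous anatomy", "crown": "eponymous anatomy",
}

# ASSUMPTION: "crown" and "head" are treated as one synonym ring.
SYNONYM_RINGS = [{"crown", "head"}]


def facet_hierarchy(concept):
    """Label a complex concept with one atomic concept per facet."""
    return {FACETS[atom]: atom for atom in ATOMS[concept]}


def canonical(atom):
    """Map an atom to a fixed representative of its synonym ring."""
    for ring in SYNONYM_RINGS:
        if atom in ring:
            return min(ring)
    return atom


def shared_meaning(a, b):
    """Atomic concepts that two complex concepts share, up to synonymy."""
    return ({canonical(x) for x in ATOMS[a]}
            & {canonical(x) for x in ATOMS[b]})


print(facet_hierarchy("red-headed woodpecker"))
print(shared_meaning("red-headed woodpecker", "golden-crowned sparrow"))
```

Although the two complex concepts have no atom in common literally, normalizing each atom to its synonym ring exposes the shared anatomy concept, which is the kind of connection across formally disjoint concepts that the paragraph above describes.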
It should be appreciated from the foregoing discussion and examples that aspects of the present invention can be directed to some of the most pressing and challenging applications in knowledge representation, including tools for brainstorming and cognitive augmentation, for supporting dynamic and emergent knowledge, and for providing semantic interoperability by converting between various complex knowledge representations into a common semantic vocabulary.
The various inventive aspects described herein may be used with any of one or more computers and/or devices, each having one or more processors that may be programmed to perform any of the actions described above for using an atomic knowledge representation model in analyzing and synthesizing complex knowledge representations. For example, both the server and client computing systems may be implemented as one or more computers, as described above. FIG. 11 schematically shows an illustrative computer 1100 on which various inventive aspects of the present disclosure may be implemented. The computer 1100 includes a processor or processing unit 1101 and a memory 1102, which may include volatile and/or non-volatile memory. The computer 1100 may also include storage 1105 (e.g., one or more disk drives) in addition to the system memory 1102.
The memory 1102 and/or storage 1105 may store one or more computer-executable instructions for programming the processing unit 1101 to perform any of the functions described herein. The storage 1105 may optionally also store one or more data sets as needed. For example, a computer used to implement server system 100 may, in some embodiments, store AKRM data set 110 in storage 1105. Alternatively, such a data set may be implemented separately from the computer used to implement server system 100.
References herein to a computer may include any device having a programmed processor, including a rack-mounted computer, a desktop computer, a laptop computer, a tablet computer, or any of numerous devices that may not generally be regarded as a computer but that include a programmed processor (e.g., a PDA, an MP3 player, a mobile telephone, a wireless headset).
The exemplary computer 1100 may have one or more input devices and/or output devices, such as devices 1106 and 1107 shown in FIG. 11. These devices may be used, among other things, to present a user interface. Examples of output devices that may be used to provide a user interface include printers or display screens for visual presentation of output, and speakers or other sound-generating devices for audible presentation of output. Examples of input devices that may be used for a user interface include keyboards and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in another audible format.
As shown in FIG. 11, the computer 1100 may also include one or more network interfaces (e.g., network interface 1110) to enable communication via various networks (e.g., network 1120). Examples of networks include a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology, may operate according to any suitable protocol, and may include wireless networks, wired networks, or fiber optic networks.
Having thus described several aspects of at least one embodiment of the present invention, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Accordingly, the foregoing description and drawings are by way of example only.
The above-described embodiments of the present invention may be implemented in any of numerous ways. For example, the embodiments may be implemented using hardware, software, or a combination thereof. When implemented in software, the software code may be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. Such processors may be implemented as integrated circuits, with one or more processors in an integrated circuit component. However, a processor may be implemented using circuitry in any suitable format.
Further, it should be appreciated that a computer may be embodied in any of a number of forms, such as a rack-mounted computer, a desktop computer, a laptop computer, or a tablet computer. Additionally, a computer may be embedded in a device not generally regarded as a computer but with suitable processing capabilities, including a personal digital assistant (PDA), a smart phone, or any other suitable portable or fixed electronic device.
A computer may also have one or more input and output devices. These devices may be used, among other things, to present a user interface. Examples of output devices that may be used to provide a user interface include printers or display screens for visual presentation of output, and speakers or other sound-generating devices for audible presentation of output. Examples of input devices that may be used for a user interface include keyboards and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in another audible format.
Such computers may be interconnected by one or more networks in any suitable form, including as a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology, may operate according to any suitable protocol, and may include wireless networks, wired networks, or fiber optic networks.
The various methods or processes outlined herein may also be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and may also be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.
In this respect, the invention may be embodied as a tangible, non-transitory computer-readable storage medium (or multiple computer-readable media) (e.g., a computer memory, one or more floppy disks, compact discs (CDs), optical discs, digital versatile discs (DVDs), magnetic tapes, flash memories, circuit configurations in field programmable gate arrays or other semiconductor devices, or other non-transitory, tangible computer-readable storage media) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement the various embodiments of the invention discussed above. The computer-readable medium or media may be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various aspects of the present invention as discussed above. As used herein, the term "non-transitory computer-readable storage medium" encompasses only a computer-readable medium that can be considered to be an article of manufacture or a machine.
The terms "program" and "software" are used herein in a generic sense to refer to any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects of the present invention as discussed above. Additionally, it should be appreciated that, according to one aspect of the invention, one or more computer programs that, when executed, perform methods of the present invention need not reside on a single computer or processor, but may be distributed in a modular fashion among a number of different computers or processors to implement various aspects of the present invention.
Computer-executable instructions may be in many forms, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
Data structures may also be stored in computer-readable media in any suitable form. For simplicity of illustration, data structures may be shown to have fields that are related through their location in the data structure. Such relationships may likewise be achieved by assigning storage for the fields to locations in a computer-readable medium that convey the relationship between the fields. However, any suitable mechanism may be used to establish a relationship between information in the fields of a data structure, including the use of pointers, tags, or other mechanisms that establish relationships between data elements.
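The two ways of relating fields mentioned above, by location and by pointer, can be contrasted in a small sketch. The record layout, the 16-byte label width, and the names `RECORD`, `pack_concept`, `unpack_concept`, and `Concept` are all hypothetical choices made for illustration and are not taken from the description.

```python
import struct
from dataclasses import dataclass
from typing import Optional

# Relationship by location: the label belongs to the id simply because it
# occupies the 16 bytes that follow the 4-byte id in the same record.
RECORD = struct.Struct("<I16s")  # hypothetical layout: uint32 id + label


def pack_concept(cid, label):
    # struct pads a short byte string with null bytes up to 16 bytes.
    return RECORD.pack(cid, label.encode("utf-8"))


def unpack_concept(buf):
    cid, raw = RECORD.unpack(buf)
    return cid, raw.rstrip(b"\x00").decode("utf-8")


# Relationship by pointer: the link to the broader concept is an explicit
# reference, independent of where either object is stored.
@dataclass
class Concept:
    label: str
    broader: Optional["Concept"] = None


record = pack_concept(1, "cat")
print(unpack_concept(record))

cat = Concept("cat", broader=Concept("mammal"))
print(cat.broader.label)
```

The positional form is compact and needs no extra bookkeeping, while the pointer form lets relationships be added or changed without moving the stored fields; either is consistent with the paragraph above.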
Various aspects of the present invention may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing, and the invention is therefore not limited in its application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.
The invention may also be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different from that illustrated, which may include performing some acts simultaneously, even though they are shown as sequential acts in the illustrative embodiments.
Use of ordinal terms such as "first," "second," "third," etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another, or the temporal order in which acts of a method are performed. Such terms are used merely as labels to distinguish one claim element having a certain name from another element having the same name (but for use of the ordinal term).
All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.
The indefinite articles "a" and "an," as used herein, unless clearly indicated to the contrary, should be understood to mean "at least one."
As used herein, the phrase "at least one," in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements, and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase "at least one" refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, "at least one of A and B" (or, equivalently, "at least one of A or B," or, equivalently, "at least one of A and/or B") can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); and so on.
The phrase "and/or," as used herein, should be understood to mean "either or both" of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases. Multiple elements listed with "and/or" should be construed in the same fashion, i.e., "one or more" of the elements so conjoined. Other elements may optionally be present other than the elements specifically identified by the "and/or" clause, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, a reference to "A and/or B," when used in conjunction with open-ended language such as "comprising," can refer, in one embodiment, to A only (optionally including elements other than B); in another embodiment, to B only (optionally including elements other than A); in yet another embodiment, to both A and B (optionally including other elements); and so on.
As used herein, "or" should be understood to have the same meaning as "and/or" as defined above. For example, when separating items in a list, "or" or "and/or" shall be interpreted as being inclusive, i.e., the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items.
Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," "having," "containing," "involving," and variations thereof herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.
Having described several embodiments of the invention in detail, various modifications and improvements will readily occur to those skilled in the art. Accordingly, the foregoing description is by way of example only, and is not intended as limiting.
Claims (20)
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US61/357,266 | 2010-06-22 | | |
Related Parent Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| HK13108963.1A Addition HK1181895B (en) | 2010-06-22 | 2011-06-22 | Systems and methods for analyzing and synthesizing complex knowledge representations |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| HK13108963.1A Division HK1181895B (en) | 2010-06-22 | 2011-06-22 | Systems and methods for analyzing and synthesizing complex knowledge representations |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| HK1230300A1 (en) | 2017-12-01 |
| HK1230300B (en) | 2021-09-17 |