CN118657140A - A test question knowledge point annotation method and system based on artificial intelligence
- Publication number
- CN118657140A (CN202411133598.1A)
- Authority
- CN
- China
- Prior art keywords
- knowledge point
- data
- contribution
- knowledge
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Description
Technical Field
The present invention relates to the technical field of exercise annotation, and in particular to an artificial intelligence-based method and system for annotating the knowledge points of test questions.
Background Art
At present, in the field of educational technology, knowledge point annotation of test questions mainly focuses on the analysis of a single solution. Although this approach can provide fairly in-depth knowledge point annotation for one specific solution path, it ignores the diversity and complexity of problem solving. In real educational settings, many problems have multiple solutions, and each solution draws on its own knowledge points and ways of thinking. Traditional single-solution annotation, however, tends to consider only the most common or most direct solution path, so other potential knowledge points are overlooked, which limits the broadening of students' thinking and the cultivation of their creativity.
Summary of the Invention
To solve at least one of the technical problems set out above, the present invention provides an artificial intelligence-based method for annotating the knowledge points of test questions, which significantly improves the comprehensiveness of knowledge point annotation by introducing a multi-solution annotation strategy.
In a first aspect, the present invention provides an artificial intelligence-based method for annotating the knowledge points of test questions, the method comprising:
collecting test question data to be annotated, the test question data comprising a question text and a plurality of answer texts, each answer text corresponding to a different problem-solving strategy;
performing knowledge point contribution analysis on the plurality of answer texts in sequence to obtain a plurality of knowledge point contribution scores;
based on the question text, integrating the plurality of knowledge point contribution scores corresponding to each answer text to obtain multi-solution knowledge point annotation data.
Preferably, performing knowledge point contribution analysis on the plurality of answer texts in sequence to obtain a plurality of knowledge point contribution scores comprises:
performing knowledge point type word segmentation on the answer text to obtain a plurality of knowledge point type text information sets and a plurality of argument connectives;
obtaining knowledge point base weight data from a preset knowledge point type base contribution weight table and, based on the plurality of argument connectives, matching the corresponding contribution weights from a preset argument connective contribution weight table to obtain logical contribution weight data;
performing knowledge point contribution analysis based on the knowledge point base weight data and the positional relationship between the knowledge points and the argument connectives, and according to the logical contribution weight data and the knowledge point type text information sets, to obtain the plurality of knowledge point contribution scores.
Preferably, performing knowledge point contribution analysis based on the knowledge point base weight data and the positional relationship between the knowledge points and the argument connectives, and according to the logical contribution weight data and the knowledge point type text information sets, to obtain the plurality of knowledge point contribution scores comprises:
Let the logical contribution weight data, the knowledge point base weight data, and the knowledge point type text information set be given, together with the number of knowledge point types. For each knowledge point type, form the set of its distances to all argument connectives; the size of this set is the number of argument connectives, and each element is the distance between that knowledge point type and one argument connective. The knowledge point contribution score of each solution is then expressed as follows:
[Two equations; the formulas and their inline symbols did not survive extraction.]
where the terms denote the base weight value of each knowledge point type, the contribution score of each knowledge point type, and β, the adjustment factor that balances the base weight against the influence weight of the argument connectives, which defaults to 0.5.
Preferably, performing knowledge point type word segmentation on the answer text to obtain a plurality of knowledge point type text information sets and a plurality of argument connectives comprises:
segmenting the answer text with a word segmentation tool to obtain a plurality of answer segmentation data;
collecting reference knowledge point segmentation data covering L different knowledge point types, and obtaining a plurality of reference argument connectives;
integrating the plurality of answer segmentation data, the plurality of reference knowledge point segmentation data, and the plurality of reference argument connectives to obtain a mixed text data set;
classifying the mixed text data set into L+1 clusters using the K-means algorithm to obtain L+1 classification cluster data;
analyzing in turn the distribution of the answer segmentation data across the L+1 classification cluster data: the answer segmentation data in the classification cluster that contains the most reference argument connectives are determined to be argument connectives, and the knowledge point type of an answer segmentation datum is determined to be the knowledge point type corresponding to the reference knowledge point segmentation data with the largest data volume in the classification cluster in which it lies.
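The cluster-interpretation step above can be sketched as follows. This is a minimal sketch, not the patent's implementation: it assumes a K-means run has already assigned every token to one of the L+1 clusters, and the function name `label_answer_tokens`, its argument layout, and the choice to leave a token unlabelled when its cluster holds no reference knowledge point data are all hypothetical.

```python
from collections import Counter, defaultdict


def label_answer_tokens(answer_clusters, ref_kp, ref_conn_clusters):
    """Interpret K-means cluster ids for the answer tokens.

    answer_clusters:   dict mapping answer token -> cluster id
    ref_kp:            list of (cluster id, knowledge point type) pairs,
                       one per reference knowledge point token
    ref_conn_clusters: list of cluster ids, one per reference connective
    (All three layouts are assumptions made for this sketch.)
    """
    # The cluster holding the most reference connectives is the connective cluster.
    conn_counts = Counter(ref_conn_clusters)
    connective_cluster = conn_counts.most_common(1)[0][0] if conn_counts else None

    # Count reference knowledge point types per cluster.
    type_counts = defaultdict(Counter)
    for cluster_id, kp_type in ref_kp:
        type_counts[cluster_id][kp_type] += 1

    connectives, token_types = [], {}
    for token, cluster_id in answer_clusters.items():
        if cluster_id == connective_cluster:
            connectives.append(token)
        elif type_counts[cluster_id]:
            # Majority reference type in the token's own cluster.
            token_types[token] = type_counts[cluster_id].most_common(1)[0][0]
        # Otherwise the token stays unlabelled (assumption).
    return connectives, token_types
```

Keeping the clustering itself outside the function means any K-means implementation and any token embedding can feed it; only the resulting cluster ids matter here.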
Preferably, after collecting the test question data to be annotated, the method further comprises:
preprocessing the text in the collected test question data, including text normalization, mathematical formula conversion, and mathematical entity synonymization.
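The three preprocessing operations named above could look roughly like this. The sketch is illustrative only: the source does not specify how each operation is carried out, so the two lookup tables, the function name `preprocess`, and the decision to lowercase the text are assumptions.

```python
import re
import unicodedata

# Hypothetical lookup tables; a real system would load much larger ones.
SYMBOL_MAP = {"×": "*", "÷": "/", "²": "^2", "√": "sqrt"}
SYNONYM_MAP = {"pythagoras theorem": "pythagorean theorem"}


def preprocess(text: str) -> str:
    # Formula conversion first, so NFKC does not flatten "²" into a plain "2".
    for symbol, replacement in SYMBOL_MAP.items():
        text = text.replace(symbol, replacement)
    # Text normalization: fold full-width forms, collapse whitespace, lowercase.
    text = unicodedata.normalize("NFKC", text)
    text = re.sub(r"\s+", " ", text).strip().lower()
    # Entity synonymization: map variant names onto one canonical name.
    for variant, canonical in SYNONYM_MAP.items():
        text = text.replace(variant, canonical)
    return text
```

The order matters in this sketch: symbol replacement runs before Unicode normalization because NFKC would otherwise turn a superscript two into an ordinary digit and lose the exponent.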
Preferably, before obtaining the knowledge point base weight data from the preset knowledge point type base contribution weight table, the method further comprises:
constructing the knowledge point type base contribution weight table, which specifically comprises:
collecting historical examination questions and their solutions to obtain historical test question data, and obtaining a teaching syllabus that contains all knowledge point types;
calculating the contribution weight of each knowledge point type with a statistical method, based on the frequency with which the knowledge point type occurs in the historical test question data.
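One simple way to turn occurrence frequencies into a weight table is relative frequency, sketched below. The source says only that "a statistical method" is used, so relative frequency is an assumption, as are the function name `build_base_weight_table` and the input layout (one list of knowledge point types per historical item).

```python
from collections import Counter


def build_base_weight_table(historical_items, syllabus_types):
    """Return {knowledge point type: weight}, with weights summing to 1.

    historical_items: iterable of lists, each listing the knowledge point
                      types that appear in one historical question/solution
    syllabus_types:   all knowledge point types named in the syllabus
    """
    syllabus = set(syllabus_types)
    counts = Counter()
    for types in historical_items:
        # Count each type once per item and ignore types outside the syllabus.
        counts.update(set(types) & syllabus)
    total = sum(counts.values())
    return {t: (counts[t] / total if total else 0.0) for t in syllabus_types}
```

Types that never occur in the historical data receive a weight of 0.0 here; a smoothing term could be added if zero weights are undesirable.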
Preferably, after obtaining the plurality of knowledge point contribution scores, the method further comprises:
visualizing the integrated knowledge point annotation data to generate a knowledge graph.
In a second aspect, the present invention provides an artificial intelligence-based system for annotating the knowledge points of test questions, the system comprising:
a data collection module, configured to collect test question data to be annotated, the test question data comprising a question text and a plurality of answer texts, each answer text corresponding to a different problem-solving strategy;
a data processing module, configured to perform knowledge point contribution analysis on the plurality of answer texts in sequence to obtain a plurality of knowledge point contribution scores;
a data integration module, configured to integrate, based on the question text, the plurality of knowledge point contribution scores corresponding to each answer text to obtain multi-solution knowledge point annotation data.
Preferably, the data processing module comprises:
an answer text segmentation unit, configured to perform knowledge point type word segmentation on the answer text to obtain a plurality of knowledge point type text information sets and a plurality of argument connectives;
a static data acquisition unit, configured to obtain knowledge point base weight data from a preset knowledge point type base contribution weight table and, based on the plurality of argument connectives, to match the corresponding contribution weights from a preset argument connective contribution weight table to obtain logical contribution weight data;
a knowledge point contribution analysis unit, configured to perform knowledge point contribution analysis based on the knowledge point base weight data and the positional relationship between the knowledge points and the argument connectives, and according to the logical contribution weight data and the knowledge point type text information sets, to obtain the plurality of knowledge point contribution scores.
Preferably, the system further comprises:
a visualization module, configured to visualize the integrated knowledge point annotation data and generate a knowledge graph.
The beneficial effects of the present invention are as follows:
(1) By introducing a multi-solution knowledge point annotation strategy, the present invention significantly improves the comprehensiveness of knowledge point annotation. Traditional single-solution annotation is often confined to one mode of thinking and tends to overlook the multiple solutions to a problem and the underlying network of knowledge points. By analyzing multiple problem-solving strategies, the present invention can capture the distinct knowledge points behind different solutions and reveal the internal connections between knowledge points, giving students several perspectives for learning and fostering critical thinking and the ability to innovate.
(2) Based on the knowledge point base weight data and the positional relationship between knowledge points and argument connectives, combined with the logical contribution weight data and the knowledge point type text information sets, the present invention makes knowledge point contribution analysis fine-grained and intelligent. By quantifying the base weights of knowledge points and the logical contributions of argument connectives, it can assess the role a knowledge point actually plays in the answer. It considers both the static importance of a knowledge point and its dynamic contribution to the logical reasoning, and so offers a more comprehensive and in-depth view for evaluating knowledge points.
(3) By introducing the influence weights of argument connectives and a distance factor, the present invention can assess the role of knowledge points in the problem-solving process more accurately on a quantitative basis, and provides data-based decision support for educational decision makers and teachers. Decision makers can optimize curriculum design and allocate educational resources sensibly according to the contribution scores and relatedness of knowledge points. Teachers can adjust their teaching strategies according to the evaluation results, so that core knowledge points are fully explained and auxiliary knowledge points are adequately covered, improving teaching quality and efficiency.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the background art more clearly, the drawings used in the embodiments or the background art are described below.
The drawings are incorporated into and form part of this specification. They show embodiments consistent with the present disclosure and, together with the specification, serve to explain the technical solutions disclosed by the present invention.
FIG. 1 is a schematic flowchart of an artificial intelligence-based method for annotating the knowledge points of test questions, provided by an embodiment of the present invention;
FIG. 2 is a schematic flowchart of the knowledge point contribution score analysis for a single solution, provided by an embodiment of the present invention;
FIG. 3 is a schematic flowchart of the knowledge point type word segmentation performed on an answer text, provided by an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an artificial intelligence-based system for annotating the knowledge points of test questions, provided by an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
The terms "first", "second", and the like in the specification, the claims, and the above drawings are used to distinguish different objects, not to describe a particular order. The terms "comprising" and "having", and any variations of them, are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.
The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean A alone, both A and B, or B alone. The term "at least one" herein denotes any one of a plurality, or any combination of at least two of a plurality. For example, "including at least one of A, B, and C" can mean including any one or more elements selected from the set consisting of A, B, and C.
A reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. Occurrences of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
In addition, numerous specific details are given in the detailed description below to better explain the present invention. Those skilled in the art will understand that the present invention can also be practiced without some of these details. In some instances, methods, means, elements, and circuits well known to those skilled in the art are not described in detail, so as to highlight the gist of the present invention.
At present, existing methods for annotating the knowledge points of test questions rely too heavily on a single solution, which makes the annotation one-sided and limited.
Embodiment 1
The present invention provides an artificial intelligence-based method for annotating the knowledge points of test questions. Referring to FIG. 1, the method comprises:
S100, collecting test question data to be annotated, the test question data comprising a question text and a plurality of answer texts, each answer text corresponding to a different problem-solving strategy;
S200, performing knowledge point contribution analysis on the plurality of answer texts in sequence to obtain a plurality of knowledge point contribution scores;
S300, based on the question text, integrating the plurality of knowledge point contribution scores corresponding to each answer text to obtain multi-solution knowledge point annotation data.
In this embodiment, the test question data to be annotated are collected from a question bank. Besides the question text, each test question should come with at least two different answer texts, each representing an independent problem-solving strategy. Natural language processing (NLP) techniques and machine learning algorithms are then used to analyze each answer text in depth, identify and quantify the knowledge points involved, and calculate the contribution of each knowledge point to that answer text (the knowledge point contribution score). This process may involve semantic understanding, keyword extraction, concept mapping, and similar techniques. Finally, the knowledge point contribution scores of all answer texts for the same test question are aggregated, by weighted averaging or another statistical method, to generate the comprehensive knowledge point annotation data for the question. In this way, even a single question reflects the multi-dimensional knowledge point coverage that different problem-solving strategies bring.
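The input described for S100 and the per-answer loop of S200 can be sketched as a small data structure. This is a hedged illustration: the class name `ExamItem`, the helper `score_all_solutions`, and the idea of passing the scoring routine in as a callable are assumptions, not names from the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ExamItem:
    """One test question with its alternative solutions (S100)."""
    question_text: str
    answer_texts: List[str]

    def __post_init__(self):
        # The embodiment asks for at least two different answer texts.
        if len(self.answer_texts) < 2:
            raise ValueError("each item needs at least two answer texts")


def score_all_solutions(
    item: ExamItem,
    score_solution: Callable[[str], Dict[str, float]],
) -> List[Dict[str, float]]:
    """Apply a scoring routine to every answer text in order (S200)."""
    return [score_solution(text) for text in item.answer_texts]
```

The result is one dictionary of knowledge point scores per solution, which is the shape the integration step S300 needs as input.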
In one possible embodiment, test question data are collected from an education database, an online education platform, or materials submitted by teachers. Each test question contains at least one question text and multiple answer texts, and each answer text must represent a different problem-solving strategy. Noise in the text, such as irrelevant symbols and extra spaces, is removed to keep the subsequent analysis accurate. NLP techniques are then used to parse each answer text semantically and identify the knowledge points involved. This step may include part-of-speech tagging, named entity recognition, dependency parsing, and similar techniques, in order to understand the core concepts and logical relationships in the answer text. The identified knowledge points are matched against a predefined knowledge point library, and a contribution score is calculated for each knowledge point. The score can be based on a combined assessment of several factors, such as how often the knowledge point occurs in the answer text, how critical it is, and how relevant it is to the question. Because different problem-solving strategies may cover knowledge points to different degrees, each answer text is assigned a weight that reflects its importance in the overall annotation. Based on the question text, the knowledge point contribution scores corresponding to each answer text are then combined by weighted averaging to generate a comprehensive annotation result. This step can use statistical methods such as weighted averaging or maximum selection to make the annotation result comprehensive and accurate.
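The two integration options named above, weighted averaging and maximum selection, can be sketched as follows. The function name `integrate_scores`, the default of equal weights, and the choice to treat a knowledge point missing from a solution as scoring 0 are assumptions made for the example.

```python
def integrate_scores(per_solution, weights=None, method="weighted_average"):
    """Merge per-solution knowledge point scores into one annotation.

    per_solution: list of {knowledge point: score} dicts, one per answer text
    weights:      one weight per answer text (equal weights if omitted)
    method:       "weighted_average" or "max"
    """
    if weights is None:
        weights = [1.0] * len(per_solution)
    if len(weights) != len(per_solution):
        raise ValueError("need exactly one weight per solution")

    points = sorted({kp for scores in per_solution for kp in scores})
    if method == "max":
        # Keep the highest score any solution gives to each knowledge point.
        return {kp: max(s.get(kp, 0.0) for s in per_solution) for kp in points}
    if method != "weighted_average":
        raise ValueError(f"unknown method: {method}")

    total = sum(weights)
    # A knowledge point absent from a solution counts as 0 (assumption).
    return {
        kp: sum(w * s.get(kp, 0.0) for w, s in zip(weights, per_solution)) / total
        for kp in points
    }
```

Weighted averaging rewards knowledge points that recur across solutions, while maximum selection keeps a point prominent even if only one solution relies on it.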
Compared with traditional knowledge point annotation that considers only a single solution, the present invention captures knowledge points along more solution paths and so provides more comprehensive learning resources, helping students understand and master different approaches to a problem. Multi-solution annotation also gives students several angles from which to solve a problem, which helps stimulate innovative and critical thinking, meets the needs of students with different learning styles and levels, and promotes personalized learning. Teachers can use these multi-angle annotations to design more targeted teaching activities, such as discussions and case analyses, to improve teaching effectiveness and student engagement.
Preferably, referring to FIG. 2, performing knowledge point contribution analysis on the plurality of answer texts in sequence to obtain a plurality of knowledge point contribution scores comprises:
S210, performing knowledge point type word segmentation on the answer text to obtain a plurality of knowledge point type text information sets and a plurality of argument connectives;
S220, obtaining knowledge point base weight data from a preset knowledge point type base contribution weight table and, based on the plurality of argument connectives, matching the corresponding contribution weights from a preset argument connective contribution weight table to obtain logical contribution weight data;
S230, performing knowledge point contribution analysis based on the knowledge point base weight data and the positional relationship between the knowledge points and the argument connectives, and according to the logical contribution weight data and the knowledge point type text information sets, to obtain the plurality of knowledge point contribution scores.
In this embodiment, a deep learning model (such as BiLSTM-CRF or BERT) is used for semantic understanding and word segmentation. It identifies the key terms in the answer text and classifies them into preset knowledge point types, such as mathematical formulas, theorem citations, and experimental steps. At the same time, the argument connectives in the text, such as "therefore", "however", and "and", are extracted for analyzing the logic of the solution. The base weight of each identified knowledge point type is looked up in the pre-built knowledge point type base contribution weight table. This table is based on the experience of domain experts and on big data analysis, and reflects how important different knowledge points are to solving the problem. For the extracted argument connectives, the corresponding logical contribution weights are matched from the preset argument connective contribution weight table, which reflects the logical value of different argument connectives in the problem-solving process. This step takes the logical coherence of the solution into account, so that the contribution of each knowledge point within the chain of argument is quantified accurately. Then, combining the knowledge point base weight data with the logical contribution weight data, and using the positional relationship between knowledge points and argument connectives in the knowledge point type text information sets, an algorithm (such as weighted summation or the analytic hierarchy process) performs a combined assessment to obtain the contribution score of each knowledge point.
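The table lookups of S220 and the positional relationship used in S230 can be sketched as below. This is an illustration under stated assumptions: the source does not define how distance is measured, so token-index distance is assumed, and the function name, the `"CONN"` label, and the example table values are hypothetical.

```python
def weights_and_distances(labelled_tokens, base_table, connective_table):
    """Collect the inputs for contribution scoring from one answer text.

    labelled_tokens:  list of (token, label); label is a knowledge point type,
                      "CONN" for an argument connective, or None
    base_table:       {knowledge point type: base weight}
    connective_table: {connective token: logical contribution weight}
    Returns (logic_weights, base_weights, distances), where distances maps
    each knowledge point type to its distance from every connective.
    """
    # Positions of connectives that have an entry in the weight table.
    connectives = [
        (i, tok)
        for i, (tok, label) in enumerate(labelled_tokens)
        if label == "CONN" and tok in connective_table
    ]
    logic_weights = [connective_table[tok] for _, tok in connectives]

    base_weights, distances = {}, {}
    for i, (_, label) in enumerate(labelled_tokens):
        if label not in base_table:
            continue
        base_weights[label] = base_table[label]
        # Token-index distance to each connective (assumption).
        d = [abs(i - j) for j, _ in connectives]
        prev = distances.get(label)
        # If a type occurs several times, keep its closest occurrence.
        distances[label] = d if prev is None else [min(a, b) for a, b in zip(prev, d)]
    return logic_weights, base_weights, distances
```

For example, with the tokens `pythagorean / therefore / a^2+b^2 / and / area`, the connectives sit at positions 1 and 3, so a theorem token at position 0 has distances `[1, 3]`.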
In this embodiment, deep learning-based word segmentation together with the preset weight tables identifies and quantifies the contribution of knowledge points more precisely, avoids the subjectivity and inconsistency of manual annotation, and improves annotation accuracy. Analyzing the argument connectives and their contribution weights strengthens the logical coherence of the knowledge point contribution analysis, so that the annotation result covers not only the knowledge points themselves but also the logical relationships between them, which helps students understand the internal logic of the solution. In addition, automated knowledge point contribution analysis greatly reduces the need for manual intervention and improves efficiency. It also offers a workable solution for annotating the knowledge points of large question banks and promotes the intelligent management of educational resources.
优选地,基于知识点基础权重数据和知识点与论证连接词的位置关系,并根据逻辑贡献权重数据和知识点类型文本信息集合进行知识点贡献分析,得到多个知识点贡献分数,包括:Preferably, based on the knowledge point basic weight data and the positional relationship between the knowledge point and the argument connective, and according to the logical contribution weight data and the knowledge point type text information set, a knowledge point contribution analysis is performed to obtain multiple knowledge point contribution scores, including:
Let the logical contribution weight data, the knowledge point basic weight data, and the knowledge point type text information set be given, where the set covers a fixed number of knowledge point types. For each knowledge point type, take the set of its distances to all argumentation connectives; the size of this set equals the number of argumentation connectives, and each element is the distance between that knowledge point type and one argumentation connective. The knowledge point contribution score for a given solution is then expressed as follows:
, ,
, ,
Here the terms are the basic weight value of each knowledge point type and the contribution score of each knowledge point type, and β is the adjustment factor that balances the basic weight against the influence weight of the argumentation connectives; its default value is 0.5.
In this embodiment, the logical contribution weight data gives the weight that each argumentation connective contributes to a knowledge point, reflecting the influence of the argumentation structure on the value of the knowledge point. The knowledge point basic weight data is the preset basic importance index of each knowledge point type and is not affected by the argumentation structure. β is the adjustment factor that balances the contribution of the basic weight against the influence of the argumentation connectives. By default β is 0.5, meaning the basic weight and the connective influence weight are equally important; it can be adjusted to suit the actual situation.
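The role of the adjustment factor can be illustrated with a small Python sketch. The exact scoring formula is not recoverable from this text, so the blend below is only an assumption: the function name `contribution_score`, the `1 / (1 + distance)` discount, and the linear mix of the two terms are all hypothetical choices made for illustration, not the formula of the invention.

```python
from typing import List


def contribution_score(base_weight: float,
                       connective_weights: List[float],
                       distances: List[float],
                       beta: float = 0.5) -> float:
    """Hypothetical blend of a base weight and connective influence.

    beta weights the base weight; (1 - beta) weights the connective term.
    """
    if len(connective_weights) != len(distances):
        raise ValueError("one distance is needed per connective")
    # Assumed discount: a connective closer to the knowledge point counts more.
    influence = sum(w / (1.0 + d)
                    for w, d in zip(connective_weights, distances))
    return beta * base_weight + (1.0 - beta) * influence
```

With the default `beta` of 0.5 the two terms carry equal weight; setting `beta` to 1.0 reduces the score to the base weight alone, which matches the description of the base weight as independent of the argumentation structure.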
By introducing the influence weights of the argumentation connectives and a distance factor, the present invention evaluates the role of each knowledge point in the answering process more comprehensively, avoids the limitation of relying on basic weights alone, and makes the knowledge point contribution scores more objective. Furthermore, the adjustment factor β lets the user tune the balance between the basic weight and the influence of the argumentation connectives, which makes the model more flexible and suitable for test questions of different subjects and levels of complexity. The introduction of the distance function enables the model to analyze the positional relationship between knowledge points and argumentation connectives at a fine grain, so the scores reflect the actual contribution of each knowledge point to the solution logic more accurately. By quantifying the contribution of knowledge points in different solutions, teachers and students gain insight into problem-solving strategies and a better understanding of how the knowledge points are applied, which improves teaching quality and learning outcomes.
Preferably, referring to FIG. 3, performing knowledge point type word segmentation on the answer text to obtain multiple knowledge point type text information sets and multiple argumentation connectives includes:
S211, segmenting the answer text with a word segmentation tool to obtain multiple items of answer segmentation data;
S212, collecting reference knowledge point segmentation data covering L different knowledge point types, and obtaining multiple reference argumentation connectives;
S213, merging the answer segmentation data, the reference knowledge point segmentation data, and the reference argumentation connectives to obtain a mixed text data set;
S214, clustering the mixed text data set into L+1 clusters with the K-means algorithm to obtain L+1 classification clusters;
S215, analyzing in turn how the answer segmentation data is distributed across the L+1 classification clusters: the answer segmentation data in the classification cluster that contains the most reference argumentation connectives is determined to be argumentation connectives, and the knowledge point type of any other answer segmentation data is determined to be the knowledge point type of the reference knowledge point segmentation data that is most numerous in its classification cluster.
In this embodiment, a natural language processing (NLP) word segmentation tool, such as Jieba or StanfordNLP, performs an initial segmentation of the answer text to produce the answer segmentation data. Reference texts covering L different knowledge point types are collected and segmented to obtain the reference knowledge point segmentation data, which represents typical wording for the known knowledge point types. A list of common argumentation connectives, such as "therefore", "however", and "in addition", is compiled to identify the logical structure of the answer text. The answer segmentation data, the reference knowledge point segmentation data, and the reference argumentation connectives are then merged into a single mixed text data set. Next, the K-means algorithm clusters the mixed text data set into L+1 clusters, where L is the number of reference knowledge point types and the extra cluster accommodates argumentation connectives or knowledge points that are not clearly attributed. K-means iteratively groups similar words into the same cluster, producing L+1 classification clusters. Finally, the distribution of the answer segmentation data across the L+1 classification clusters is analyzed cluster by cluster: the cluster containing the most reference argumentation connectives is marked as the argumentation connective cluster, and for every other cluster the knowledge point type of the most numerous reference knowledge point segmentation data in that cluster is assigned to the answer segmentation data in the cluster.
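The cluster labeling rule of step S215 can be sketched in Python as follows. This is a minimal illustration that assumes the K-means step has already produced the clusters; the `(token, source, type)` tuple layout, the source tags `"answer"`, `"ref_kp"` and `"ref_conn"`, and the function name `label_answer_tokens` are hypothetical and not taken from the source.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Each item is (token, source, knowledge_point_type). The source is "answer",
# "ref_kp" or "ref_conn"; the type is set only for "ref_kp" items.
Item = Tuple[str, str, Optional[str]]
CONNECTIVE = "connective"


def label_answer_tokens(clusters: List[List[Item]]) -> Dict[str, str]:
    """Label answer tokens by the reference data sharing their cluster.

    Assumes at least one cluster is supplied.
    """
    # The cluster holding the most reference connectives is the connective cluster.
    conn_counts = [sum(1 for _, src, _ in c if src == "ref_conn")
                   for c in clusters]
    conn_idx = conn_counts.index(max(conn_counts))

    labels: Dict[str, str] = {}
    for idx, cluster in enumerate(clusters):
        if idx == conn_idx:
            label = CONNECTIVE
        else:
            types = Counter(t for _, src, t in cluster if src == "ref_kp")
            if not types:
                continue  # no reference data here: leave answer tokens unlabeled
            label = types.most_common(1)[0][0]  # majority knowledge point type
        for token, src, _ in cluster:
            if src == "answer":
                labels[token] = label
    return labels
```

Leaving tokens unlabeled when a cluster holds no reference data is a design choice of this sketch; the source does not say how such a cluster should be handled.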
In this embodiment, the K-means clustering algorithm allows the method to distinguish different types of knowledge points from argumentation connectives, which improves recognition accuracy and avoids the subjectivity and errors of manual judgment. Furthermore, because the reference knowledge point segmentation data and reference argumentation connectives serve as the classification reference, the method adapts readily to different subjects, such as mathematics, physics, and chemistry; only the reference knowledge points need to be adjusted.
Preferably, after collecting the test question data to be annotated, the method further includes:
preprocessing the text in the collected test question data, including text normalization, mathematical formula conversion, and mathematical entity synonym unification.
In this embodiment, the text is converted to a unified encoding, such as UTF-8, so that it is displayed and processed correctly across systems and devices. Useless characters, extra whitespace, stray punctuation, and similar noise are removed with regular expressions or NLP tools to keep the text clean and reduce interference in later analysis. Spelling errors are corrected and the written forms of abbreviations and terms are unified so that the text is consistent. Dedicated formula recognition tools, such as Mathpix or LaTeXML, identify and extract the mathematical formulas in the text. The extracted formulas are converted to a standard format, such as MathML or LaTeX, so that they can be processed by computer; this step may involve parsing and reconstructing the formula structure. The converted formulas are then re-embedded in the original text at their original positions so that their context is preserved. NLP techniques identify the mathematical entities in the text, such as variable names, function names, and constants. A synonym library of mathematical entities is built to record expressions that differ in wording but have the same or similar meaning, such as "π" and "pi", or "sin" and "sine function". Using this library, each mathematical entity in the text is replaced with its standard form, which removes the ambiguity caused by differences in wording.
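Two of the preprocessing steps above, text normalization and synonym unification, can be sketched with the Python standard library. This is an illustrative sketch only: the `SYNONYMS` table and the function names are hypothetical, and formula conversion with tools such as Mathpix or LaTeXML is deliberately left out.

```python
import re
import unicodedata
from typing import Dict

# Hypothetical synonym table mapping variant wordings to one canonical form.
SYNONYMS: Dict[str, str] = {
    "圆周率": "π", "pi": "π",
    "正弦函数": "sin", "sine": "sin",
}


def normalize_text(text: str) -> str:
    # NFKC folds full-width letters, digits and spaces into their plain forms.
    text = unicodedata.normalize("NFKC", text)
    # Collapse every run of whitespace into a single space.
    return re.sub(r"\s+", " ", text).strip()


def canonicalize_entities(text: str,
                          synonyms: Dict[str, str] = SYNONYMS) -> str:
    # Replace longer variants first so shorter ones cannot clobber them.
    for variant in sorted(synonyms, key=len, reverse=True):
        canonical = synonyms[variant]
        pattern = re.escape(variant)
        if variant.isascii():
            pattern = rf"\b{pattern}\b"  # whole-word match for Latin variants
        text = re.sub(pattern, lambda _m, c=canonical: c, text,
                      flags=re.IGNORECASE)
    return text
```

The whole-word match keeps a short variant such as `pi` from rewriting the inside of an unrelated word; Chinese variants are matched as plain substrings because they have no word boundaries to anchor on.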
In this embodiment, text normalization ensures that the data is consistent and readable, formula conversion ensures that formulas are accurate and processable, and synonym unification removes differences in wording; together these steps raise the overall quality of the data. High-quality preprocessing gives the later knowledge point identification and contribution analysis a solid foundation, reduces analysis errors caused by poor data quality, and improves the accuracy of the final annotation. The standardized data format also simplifies later processing, removes unnecessary data cleaning work, and speeds up the whole analysis.
Preferably, before obtaining the knowledge point basic weight data from the preset knowledge point type basic contribution weight table, the method further includes:
constructing the knowledge point type basic contribution weight table, which specifically includes:
collecting historical examination questions and their solutions to obtain historical test question data, and obtaining the teaching syllabus, which contains all knowledge point types;
calculating the contribution weight of each knowledge point type by statistical methods, based on how often each knowledge point type occurs in the historical test question data.
In this embodiment, examination questions from previous years and their official or generally accepted solutions are collected from education departments, schools, examination bodies, and similar sources to ensure that the data is authoritative and reliable. The collected historical test question data is classified and organized by subject, grade, examination type, and similar dimensions to ease later analysis. The latest teaching syllabus is obtained from the education authority or the official textbook publisher to ensure that all necessary knowledge point types are covered; the syllabus is parsed and all knowledge point types are extracted into a list that serves as the reference for the weight calculation. Text analysis and data mining techniques then count how often each knowledge point type occurs in the historical test questions, which reflects how important and how common the knowledge point is in examinations. Statistical methods, such as frequency analysis, principal component analysis, or regression analysis, combine these occurrence frequencies to calculate the contribution weight of each knowledge point type; this step may also take into account factors such as the difficulty and relatedness of the knowledge points. Finally, the calculated contribution weights are calibrated manually according to the requirements of the teaching syllabus and the opinions of education experts to ensure that the weights are reasonable and suitable for teaching.
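The simplest of the statistical options named above, frequency analysis, can be sketched in Python. This is a minimal sketch under stated assumptions: each historical question is assumed to arrive already tagged with knowledge point types, the function name `build_base_weights` is hypothetical, and the manual expert calibration step is not modeled.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_base_weights(syllabus_types: Iterable[str],
                       question_tags: Iterable[List[str]]) -> Dict[str, float]:
    """Turn occurrence counts in historical questions into relative weights."""
    types = list(syllabus_types)
    allowed = set(types)
    counts: Counter = Counter()
    for tags in question_tags:
        # Count a type once per question and ignore types outside the syllabus.
        counts.update(set(tags) & allowed)
    total = sum(counts.values())
    if total == 0:
        return {t: 0.0 for t in types}
    # Every syllabus type gets an entry, even if it never appeared.
    return {t: counts[t] / total for t in types}
```

Normalizing by the total count makes the weights sum to 1, which keeps them comparable across subjects with question banks of different sizes; whether the source normalizes this way is not stated.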
In this embodiment, the contribution weights are calculated by statistical methods from the historical test question data and the teaching syllabus, which keeps the weight setting objective and limits the influence of human bias. Furthermore, because the weight table is built to follow the teaching syllabus closely, the knowledge point annotation stays consistent with the educational objectives, which helps students and teachers focus on the key and difficult points of the examination and improves teaching efficiency and learning outcomes. In addition, by regularly updating the historical test question data and the teaching syllabus, the knowledge point type basic contribution weight table can be adjusted in time to reflect current educational trends and examination requirements, so the table stays current and valid.
Preferably, after obtaining the multiple knowledge point contribution scores, the method further includes:
visualizing the integrated knowledge point annotation data to generate a knowledge graph.
In this embodiment, the integrated knowledge point annotation data is first checked for quality: duplicate, missing, or abnormal data points are removed to ensure accuracy and consistency. The knowledge point contribution scores are then standardized, for example by Z-score standardization or Min-Max normalization, so that data on different scales can be compared. Data mining techniques, such as the Apriori or FP-growth algorithm, discover association rules between knowledge points and identify which knowledge points often appear together in the same test question, revealing potential links between them. Algorithms such as K-means, hierarchical clustering, or DBSCAN then group the knowledge points into subject areas according to their contribution scores and association rules, exposing the structure of the knowledge points. Each knowledge point becomes a node of the knowledge graph, with attributes such as its name, description, and contribution score. Edges between nodes are defined from the association rules and the subject-area clustering results and represent the degree of association and the logical relationship between knowledge points. A suitable graph layout algorithm, such as the Fruchterman-Reingold algorithm or a spring-electrical model, positions the nodes in two- or three-dimensional space so that the structure of the knowledge graph is clear and easy to understand. Nodes and edges are given different colors, shapes, and sizes to distinguish knowledge point types and association strengths. Interactive functions, such as showing details on mouse hover, zooming in on click, and rearranging the layout by dragging, improve the user's experience of exploring the knowledge graph.
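The first steps of this pipeline, score normalization and edge discovery, can be sketched in Python. This is a simplified illustration under stated assumptions: plain pairwise co-occurrence counting stands in for Apriori or FP-growth, the `min_count` threshold and both function names are hypothetical, and layout and rendering are left to a graph library.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale contribution scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}  # all scores equal: nothing to spread
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def cooccurrence_edges(questions: Iterable[List[str]],
                       min_count: int = 2) -> Dict[Tuple[str, str], int]:
    """Count how often two knowledge points are tagged on the same question."""
    pairs: Counter = Counter()
    for tags in questions:
        # Sorting gives each unordered pair a single canonical key.
        pairs.update(combinations(sorted(set(tags)), 2))
    return {pair: n for pair, n in pairs.items() if n >= min_count}
```

The normalized scores can serve as node sizes and the co-occurrence counts as edge weights when the graph is drawn.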
In this embodiment, the knowledge graph presents the complex relationships between knowledge points graphically, helping users grasp the overall structure of the knowledge points and how they influence one another, and supporting understanding and retention of the knowledge system. Furthermore, teachers can use the knowledge graph to identify key and difficult teaching points and adjust course content and teaching strategies, while students can use it to find gaps in their own knowledge and study them in a targeted way.
Embodiment 2
The present invention provides a test question knowledge point annotation system based on artificial intelligence. Referring to FIG. 4, the system includes:
a data collection module 100, configured to collect the test question data to be annotated, where the test question data includes a question text and multiple answer texts, and each answer text corresponds to a different problem-solving strategy;
a data processing module 200, configured to perform knowledge point contribution analysis on the multiple answer texts in turn to obtain multiple knowledge point contribution scores;
a data integration module 300, configured to integrate, based on the question text, the multiple knowledge point contribution scores corresponding to each answer text to obtain multi-solution knowledge point annotation data.
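The data flow between the three modules can be sketched in Python. This is only one possible reading of the module description: the `ExamQuestion` class, the `annotate` function, and the shape of the returned dictionary are hypothetical, and the contribution analysis itself is passed in as a callable rather than implemented here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Scores = Dict[str, float]  # knowledge point type -> contribution score


@dataclass
class ExamQuestion:
    question_text: str
    answer_texts: List[str]  # one answer text per problem-solving strategy


def annotate(question: ExamQuestion,
             analyze: Callable[[str], Scores]) -> Dict[str, object]:
    # Data processing: score each answer text in turn.
    per_solution = [analyze(text) for text in question.answer_texts]
    # Data integration: group the scores by solution under the question text.
    return {
        "question": question.question_text,
        "solutions": [
            {"solution_index": i, "scores": scores}
            for i, scores in enumerate(per_solution)
        ],
    }
```

Keeping the scores separate per solution, rather than merging them, preserves the distinction between problem-solving strategies that the multi-solution annotation is meant to capture.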
Preferably, the data processing module includes:
an answer text segmentation unit, configured to perform knowledge point type word segmentation on the answer text to obtain multiple knowledge point type text information sets and multiple argumentation connectives;
a static data acquisition unit, configured to obtain the knowledge point basic weight data from the preset knowledge point type basic contribution weight table and, based on the multiple argumentation connectives, to match the corresponding contribution weights from the preset argumentation connective contribution weight table to obtain the logical contribution weight data;
a knowledge point contribution analysis unit, configured to perform knowledge point contribution analysis based on the knowledge point basic weight data and the positional relationship between the knowledge points and the argumentation connectives, and according to the logical contribution weight data and the knowledge point type text information set, to obtain multiple knowledge point contribution scores.
Preferably, with continued reference to FIG. 4, the system further includes:
a visualization module 400, configured to visualize the integrated knowledge point annotation data and generate a knowledge graph.
The foregoing is merely a specific embodiment of the present invention, provided so that those skilled in the art can understand or implement the invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. The present invention is therefore not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411133598.1A CN118657140B (en) | 2024-08-19 | 2024-08-19 | Test question knowledge point labeling method and system based on artificial intelligence |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202411133598.1A CN118657140B (en) | 2024-08-19 | 2024-08-19 | Test question knowledge point labeling method and system based on artificial intelligence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118657140A true CN118657140A (en) | 2024-09-17 |
CN118657140B CN118657140B (en) | 2024-11-08 |
Family
ID=92708154
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202411133598.1A Active CN118657140B (en) | 2024-08-19 | 2024-08-19 | Test question knowledge point labeling method and system based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118657140B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN118657140B (en) | 2024-11-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||