CN113742496B - A power knowledge learning system and method based on heterogeneous resource fusion - Google Patents
- Publication number: CN113742496B (application CN202111064267.3A)
- Authority: CN (China)
- Prior art keywords
- power
- knowledge
- information
- employee
- electric power
- Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/06—Energy or water supply
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02E—REDUCTION OF GREENHOUSE GAS [GHG] EMISSIONS, RELATED TO ENERGY GENERATION, TRANSMISSION OR DISTRIBUTION
- Y02E40/00—Technologies for an efficient electrical power generation, transmission or distribution
- Y02E40/70—Smart grids as climate change mitigation technology in the energy generation sector
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y04—INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
- Y04S—SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
- Y04S10/00—Systems supporting electrical power generation, transmission or distribution
- Y04S10/50—Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications
Abstract
The present application discloses an electric power knowledge learning system and method based on heterogeneous resource fusion. The system comprises a resource layer, a business layer and an application layer. The resource layer obtains electric power resource information. The business layer constructs an electric power knowledge base and an electric power knowledge graph from the electric power resource information, and constructs employee portraits for different positions from employee information. The application layer provides personalized services for each employee, including electric power knowledge recommendation and electric power knowledge test paper evaluation. The application can thus train each employee in a targeted way: it recommends electric power knowledge according to each employee's job requirements and preference characteristics, and provides and evaluates personalized test questions for each employee. It also organizes otherwise disordered electric power knowledge so that the relationships between knowledge items are clearly visible, improving the learning efficiency of power grid staff.
Description
Technical Field
The present application relates to the technical field of power system construction, and in particular to an electric power knowledge learning system and method based on heterogeneous resource fusion.
Background
Experience from countries around the world shows that electricity production must grow faster than other sectors in order to support the coordinated development of the national economy, which is why the power industry is called the "pioneer" of the national economy. Because the development of the power industry places high demands on its workforce, improving the skills and competence of every power grid worker is also crucial, so each power grid worker needs training in electric power knowledge.
At present, most training for power grid workers takes the form of lectures organized by the enterprise, in which a professional teacher trains many workers at once, or of self-study, in which workers search for material online. However, knowledge resources in the power industry are broad and voluminous, and a worker searching through them cannot filter out the resources he or she needs from the large amount of data. In addition, grid workers in different positions have different needs and different topics to focus on, so one-to-many training cannot deliver targeted improvement based on each worker's characteristics. Assigning a professional teacher to every grid worker, on the other hand, would consume too many resources and cost too much.
Summary of the Invention
Because the prior art cannot provide targeted electric power knowledge training for each power grid worker, the present application discloses an electric power knowledge learning system and method based on heterogeneous resource fusion.
A first aspect of the present application discloses an electric power knowledge learning system based on heterogeneous resource fusion, comprising a resource layer, a business layer and an application layer. The resource layer is used to obtain electric power resource information. The business layer is used to construct an electric power knowledge base and an electric power knowledge graph from the electric power resource information, and to construct employee portraits for different positions from employee information. The application layer is used to provide personalized services for each employee, the personalized services including electric power knowledge recommendation and electric power knowledge test paper evaluation.
A second aspect of the present application discloses an electric power knowledge learning method based on heterogeneous resource fusion, the method being applied to the electric power knowledge learning system based on heterogeneous resource fusion and comprising:
obtaining electric power resource information;
constructing an electric power knowledge base;
constructing an electric power knowledge graph;
constructing employee portraits for different positions;
providing personalized services for each employee, the personalized services including electric power knowledge recommendation and electric power knowledge test paper evaluation.
Optionally, obtaining the electric power resource information comprises:
obtaining original electric power resource information, which includes original electric power audio information, original electric power video information and original electric power document information;
obtaining the electric power resource information, which includes electric power audio information, electric power video information and electric power document information; the electric power audio information is the original electric power audio information converted to a unified format; the electric power video information is the original electric power video information converted to a unified format; the electric power document information is the original electric power document information converted to a unified format, together with the text information extracted from the electric power audio information and the electric power video information, likewise converted to a unified format.
Optionally, constructing the electric power knowledge base comprises:
extracting keyword information from the electric power resource information;
obtaining classified electric power resource information by classifying the keyword information and collecting the electric power resource information corresponding to each class of keyword information;
processing each class of electric power resource information to construct the electric power knowledge base, the processing including content analysis, disambiguation of identical words, removal of duplicate content, and fusion of similar content.
Optionally, constructing the electric power knowledge graph comprises:
extracting electric power technical terms from the electric power knowledge base and building an electric power technical vocabulary;
obtaining the relationships between the electric power technical terms based on the electric power technical vocabulary and on each class of electric power resource information;
constructing the electric power knowledge graph by taking each electric power technical term as a label and each relationship between electric power technical terms as an edge.
Optionally, constructing the employee portraits for different positions comprises:
obtaining employee information for different positions;
extracting features from the employee information of each position to build the employee portrait of that position;
obtaining the employee portraits for the different positions.
Optionally, providing personalized services for each employee comprises:
analyzing the needs of employees in different positions according to the employee portraits for the different positions;
obtaining each employee's preference characteristics from that employee's browsing history in the electric power knowledge graph;
recommending electric power knowledge to each employee according to the information the employee is currently searching for, the employee's preference characteristics, and the requirements of the employee's position.
Optionally, after recommending electric power knowledge to each employee according to the information the employee is currently searching for, the employee's preference characteristics, and the requirements of the employee's position, the method further comprises:
obtaining each employee's feedback on the electric power knowledge recommendation;
optimizing each employee's next electric power knowledge recommendation according to the feedback.
Optionally, providing personalized services for each employee further comprises:
the electric power knowledge base including electric power knowledge test questions;
selecting electric power knowledge test questions from the electric power knowledge base for each employee according to the employee's preference characteristics and the requirements of the employee's position, and assembling the selected questions into an electric power knowledge test paper;
scoring the electric power knowledge test paper completed by each employee to obtain each employee's score.
Optionally, providing personalized services for each employee further comprises:
obtaining each employee's feedback on the electric power knowledge test paper;
optimizing the content of each employee's next electric power knowledge test paper according to the feedback and the employee's score.
The present application discloses an electric power knowledge learning system and method based on heterogeneous resource fusion. The system comprises a resource layer, a business layer and an application layer. The resource layer is used to obtain electric power resource information. The business layer is used to construct an electric power knowledge base and an electric power knowledge graph from the electric power resource information, and to construct employee portraits for different positions from employee information. The application layer is used to provide personalized services for each employee, the personalized services including electric power knowledge recommendation and electric power knowledge test paper evaluation.
The present application can train each employee in a targeted way: it recommends electric power knowledge according to each employee's job requirements and preference characteristics, and provides and evaluates personalized electric power test questions for each employee. It can also build the large and complex body of knowledge in the electric power field into an electric power knowledge base and an electric power knowledge graph, organizing disordered electric power knowledge so that the relationships between knowledge items are clearly visible and improving the learning efficiency of power grid staff.
Brief Description of the Drawings
To explain the technical solution of the present application more clearly, the drawings used in the embodiments are briefly introduced below. A person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic structural diagram of an electric power knowledge learning system based on heterogeneous resource fusion disclosed in an embodiment of the present application;
FIG. 2 is a schematic flowchart of an electric power knowledge learning method based on heterogeneous resource fusion disclosed in an embodiment of the present application.
Detailed Description
Because the prior art cannot provide targeted electric power knowledge training for each power grid worker, the present application discloses an electric power knowledge learning system and method based on heterogeneous resource fusion.
A first embodiment of the present application discloses an electric power knowledge learning system based on heterogeneous resource fusion. Referring to the schematic structural diagram in FIG. 1, the system comprises a resource layer, a business layer and an application layer. The resource layer is used to obtain electric power resource information. The business layer is used to construct an electric power knowledge base and an electric power knowledge graph from the electric power resource information, and to construct employee portraits for different positions from employee information. The application layer is used to provide personalized services for each employee, the personalized services including electric power knowledge recommendation and electric power knowledge test paper evaluation. Specifically, technologies such as Spark or Hadoop are used to build the basic service platform, and the obtained electric power resource information is stored in the underlying hardware environment and devices.
A second embodiment of the present application discloses an electric power knowledge learning method based on heterogeneous resource fusion, the method being applied to the electric power knowledge learning system based on heterogeneous resource fusion. Referring to the schematic flowchart in FIG. 2, the method comprises:
Obtain original electric power resource information, which includes original electric power audio information, original electric power video information and original electric power document information. The original electric power document information can be obtained through many channels, which fall into internal learning resources and external learning resources: the online college, the document library center and the digital library constitute the internal learning resources, while the palmtop college and information websites constitute the external learning resources. The content and categories of the resources are divided, and resources are collected and organized from the different carriers. For intranet resources, learning resources are connected from the human resources system, online college, palmtop college, digital library and document library center systems; taking into account how each system stores its data, an intranet resource integration platform is built and an intranet connection interface is developed. Crawler technology is used to fetch external resources from information websites on the external network and copy them into the intranet system, after which resource repositories are established by category.
Obtain the electric power resource information, which includes electric power audio information, electric power video information and electric power document information. The electric power audio information is the original electric power audio information converted to a unified format; the electric power video information is the original electric power video information converted to a unified format; the electric power document information is the original electric power document information converted to a unified format, together with the text information extracted from the electric power audio information and the electric power video information, likewise converted to a unified format. The obtained electric power knowledge resources come in many different formats and must be parsed and unified for convenient use. Because audio and video files are among the main forms of electric power knowledge, the text information in them can be extracted. Audio and video, however, often contain a large amount of obscure professional content and technical terms, which lowers recognition accuracy; a speech training model specific to the electric power domain can therefore be built to improve the accuracy of speech recognition.
The original electric power document information obtained differs greatly in structure. Mainstream formats include txt, word, ppt, pdf and epub; the storage format and content structure differ completely between resource types, so no single universal method can read and process content in every format. An adaptive framework therefore needs to be designed and provided to solve the problem of reading (recognizing) diverse, heterogeneous files. The adapter pattern is an extensible, highly customizable solution. The present application includes multiple parsers, providing a dedicated parser for each document structure, such as an epub parser, a pdf parser and an mp4 video parsing processor; during processing, an adapter dispatches each file to the processing unit it requires, so that the whole system is both highly integrated and unified and highly adaptable and extensible.
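The parser dispatch described above can be illustrated with a minimal Python sketch. It assumes a common `DocumentParser` interface and a `ParserRegistry` keyed by file extension; both names, and the stub pdf parser, are hypothetical and are not taken from the application.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class DocumentParser(ABC):
    """Common interface that every format-specific adapter exposes."""

    @abstractmethod
    def parse(self, raw: bytes) -> str:
        """Return the plain text contained in the raw file bytes."""


class TxtParser(DocumentParser):
    def parse(self, raw: bytes) -> str:
        return raw.decode("utf-8", errors="replace")


class StubPdfParser(DocumentParser):
    # Placeholder only: a real adapter would wrap a PDF library here.
    def parse(self, raw: bytes) -> str:
        return "[pdf text placeholder]"


class ParserRegistry:
    """Dispatches each file to the parser registered for its extension."""

    def __init__(self) -> None:
        self._parsers: dict[str, DocumentParser] = {}

    def register(self, extension: str, parser: DocumentParser) -> None:
        self._parsers[extension.lower()] = parser

    def parse(self, filename: str, raw: bytes) -> str:
        extension = Path(filename).suffix.lower()
        parser = self._parsers.get(extension)
        if parser is None:
            raise ValueError(f"no parser registered for {extension!r}")
        return parser.parse(raw)


registry = ParserRegistry()
registry.register(".txt", TxtParser())
registry.register(".pdf", StubPdfParser())
```

Supporting a new format in this sketch only requires registering one more adapter; the calling code that hands files to `registry.parse` stays unchanged.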
The original document information undergoes structured logical processing, which extracts content from heterogeneous, noisy data. Text content usually consists of multiple sentences with certain logical structural relationships between them, such as the chapter structure, paragraph structure and sentence order of a book. Files in some storage formats carry structural information themselves: an epub file, for example, stores the chapter information of its content in the toc file it contains, and in a pdf file the Outline Tree likewise stores the document outline. Because this depends on the skill and ability of whoever produced the document, the embedded structure is often neither accurate nor complete. Many other document formats, such as txt and doc files, have no concept of structure at all, and structure can only be conveyed through marker words or formatting. Logical processing of the content therefore gives it structure, which on the one hand allows content (sentences) to be located quickly and on the other hand provides a basis for subsequent semantic processing.
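As an illustration of recovering structure from marker words in a format such as txt, the following Python sketch splits plain text into sections and sentences. The heading pattern ("Chapter N" or "第N章") and the function name `structure` are assumptions made for this example; the application does not specify which marker words are used.

```python
import re

# Assumed marker words: a line starting with "Chapter 3" or "第3章" opens a section.
HEADING = re.compile(r"^(?:Chapter\s+\d+|第\s*\d+\s*章)")
# A sentence is a run of text ending at Western or Chinese sentence punctuation.
SENTENCE = re.compile(r"[^.!?。!?]+[.!?。!?]?")


def structure(text: str) -> list[dict]:
    """Split plain text into sections, each holding an ordered sentence list."""
    sections = [{"heading": None, "sentences": []}]
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if HEADING.match(line):
            sections.append({"heading": line, "sentences": []})
        else:
            found = (s.strip() for s in SENTENCE.findall(line))
            sections[-1]["sentences"].extend(s for s in found if s)
    # Drop the implicit leading section when the text starts with a heading.
    return [s for s in sections if s["heading"] or s["sentences"]]
```

Each sentence can then be addressed by its section index and position, which gives the fast locating of content that the paragraph above describes.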
Extract keyword information from the electric power resource information;
Obtain classified electric power resource information by classifying the keyword information and collecting the electric power resource information corresponding to each class of keyword information;
Process each class of electric power resource information to construct the electric power knowledge base, the processing including content analysis, disambiguation of identical words, removal of duplicate content, and fusion of similar content. Specifically, implicit knowledge inference technology is used to establish associations between knowledge points across the multi-category knowledge systems in the knowledge center, linking the scattered content of the knowledge center into a network. Once the network is established, knowledge discovery technology is used to update and expand the knowledge system in a timely manner. Knowledge disambiguation technology is then applied to remove redundant knowledge items, and finally knowledge fusion technology is used to form a knowledge center with a clear structure and distinct levels.
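The keyword extraction, classification and duplicate-removal steps can be sketched as follows. This is a simplified illustration, not the application's method: the keyword-to-category map `CATEGORY_OF` is hypothetical, keywords are matched by exact word lookup, and duplicates are detected only by normalized case and whitespace. Disambiguation and fusion of similar content are not covered.

```python
from collections import defaultdict

# Hypothetical keyword-to-category map; the source does not define one.
CATEGORY_OF = {
    "transformer": "equipment",
    "breaker": "equipment",
    "dispatch": "operation",
}


def extract_keywords(text: str) -> set[str]:
    """Return the known keywords that occur in the text."""
    words = {w.strip(".,;:").lower() for w in text.split()}
    return {w for w in words if w in CATEGORY_OF}


def build_knowledge_base(documents: list[str]) -> dict[str, list[str]]:
    """Group documents by keyword category, then drop duplicates per category."""
    classified = defaultdict(list)
    for doc in documents:
        for category in {CATEGORY_OF[k] for k in extract_keywords(doc)}:
            classified[category].append(doc)

    knowledge_base = {}
    for category, docs in classified.items():
        seen, unique = set(), []
        for doc in docs:
            key = " ".join(doc.lower().split())  # normalise case and whitespace
            if key not in seen:
                seen.add(key)
                unique.append(doc)
        knowledge_base[category] = unique
    return knowledge_base
```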
Extract electric power technical terms from the electric power knowledge base and build an electric power technical vocabulary;
Obtain the relationships between the electric power technical terms based on the electric power technical vocabulary and on each class of electric power resource information;
Construct the electric power knowledge graph by taking each electric power technical term as a label and each relationship between electric power technical terms as an edge.
After the knowledge resources have been parsed and recognized, unstructured, polymorphic documents or content are converted into objects with a certain structure. Each object is assigned a unique identifier, and relationships between objects are established, laying the foundation for subsequent processing, computation, and content association and fusion. For example, within a single training resource, sequential relationships and chapter relationships between training content or topics can be established. Across different training resources, relationships can be built around topics and knowledge points extracted from the resources, or around external attributes such as department, author and event: for instance, author relationship chains for journal papers, or associations between question banks, training materials and case events. In this way, scattered training content and materials are fused into a knowledge system.
Different types of knowledge correspond to points of different colors or points with different labels. Similarly, different types of relationships correspond to edges of different colors or edges with different labels. If a relationship is bidirectional, the corresponding edge is bidirectional; otherwise the edge is unidirectional. Edges corresponding to relationships of different strengths have different weights.
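The graph representation described above (typed points, typed edges, unidirectional or bidirectional edges, and weights for relationship strength) can be sketched in Python as follows. The class `KnowledgeGraph`, its method names, and the example terms are hypothetical; a bidirectional relationship is stored here as two directed edges.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    # term -> knowledge type (the "color" or label of the point)
    nodes: dict[str, str] = field(default_factory=dict)
    # (source, target) -> (relationship type, weight)
    edges: dict[tuple[str, str], tuple[str, float]] = field(default_factory=dict)

    def add_term(self, term: str, kind: str) -> None:
        self.nodes[term] = kind

    def add_relation(self, source: str, target: str, relation: str,
                     weight: float, bidirectional: bool = False) -> None:
        for term in (source, target):
            if term not in self.nodes:
                raise KeyError(f"unknown term: {term}")
        self.edges[(source, target)] = (relation, weight)
        if bidirectional:
            # A bidirectional relationship is stored as two directed edges.
            self.edges[(target, source)] = (relation, weight)

    def neighbors(self, term: str) -> list[tuple[str, str, float]]:
        """Outgoing relations of a term, strongest first."""
        found = [(t, rel, w) for (s, t), (rel, w) in self.edges.items() if s == term]
        return sorted(found, key=lambda item: -item[2])
```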
Obtain employee information for different positions;
Extract features from the employee information of each position to build the employee portrait of that position;
Obtain the employee portraits for the different positions.
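One simple way to picture the feature-extraction step is to aggregate a feature across all employees in a position and keep the most frequent values. The sketch below assumes each employee record is a dictionary with hypothetical `position` and `skills` fields; the application does not specify which employee features are used.

```python
from collections import Counter, defaultdict


def build_portraits(employees: list[dict], top_n: int = 3) -> dict[str, list[str]]:
    """Summarise each position by the skills most common among its employees."""
    skills_by_position = defaultdict(Counter)
    for employee in employees:
        skills_by_position[employee["position"]].update(employee["skills"])
    return {
        position: [skill for skill, _ in counts.most_common(top_n)]
        for position, counts in skills_by_position.items()
    }
```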
Construct employee portraits for different positions;
Analyze the needs of employees in different positions according to the employee portraits for the different positions;
Obtain each employee's preference characteristics from that employee's browsing history in the electric power knowledge graph;
Recommend electric power knowledge to each employee according to the information the employee is currently searching for, the employee's preference characteristics, and the requirements of the employee's position. The electric power knowledge with the greatest similarity is recommended to the employee for learning.
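The source does not state how similarity is computed. As one possible illustration, the sketch below merges the current query, the preference characteristics and the job requirements into a weighted bag-of-words profile and returns the item with the highest cosine similarity. The weights, the function names and the use of cosine similarity are all assumptions.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(query: str, preferences: str, job_needs: str,
              items: dict[str, str],
              weights: tuple[float, float, float] = (0.5, 0.25, 0.25)) -> str:
    """Return the title of the item most similar to the combined profile."""
    profile = Counter()
    for text, weight in zip((query, preferences, job_needs), weights):
        for term, count in vectorize(text).items():
            profile[term] += weight * count
    return max(items, key=lambda title: cosine(profile, vectorize(items[title])))
```

Here `items` maps a title to a short description; in a fuller system the descriptions would come from the knowledge base and the terms from the electric power technical vocabulary.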
Obtain each employee's feedback on the electric power knowledge recommendation;
Optimize each employee's next electric power knowledge recommendation according to the feedback.
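How feedback is used to optimize the next recommendation is left open in the source. A minimal sketch, assuming feedback is a like or dislike on a topic and preferences are per-topic scores between 0 and 1, moves the score a fraction of the way toward the feedback value. The function name, the neutral default of 0.5 and the update rate are hypothetical.

```python
def update_preferences(preferences: dict[str, float], topic: str,
                       liked: bool, rate: float = 0.2) -> dict[str, float]:
    """Return a copy of the preferences nudged toward the feedback."""
    updated = dict(preferences)
    target = 1.0 if liked else 0.0
    current = updated.get(topic, 0.5)  # unseen topics start neutral
    updated[topic] = current + rate * (target - current)
    return updated
```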
The electric power knowledge base includes electric power knowledge test questions;
Select electric power knowledge test questions from the electric power knowledge base for each employee according to the employee's preference characteristics and the requirements of the employee's position, and assemble the selected questions into an electric power knowledge test paper;
Score the electric power knowledge test paper completed by each employee to obtain each employee's score.
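The selection and scoring steps can be sketched as follows. The `Question` fields, the ranking rule (job-required topics count double, preferred topics count once) and the percentage score are assumptions for illustration; the application does not define a selection or scoring formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    qid: str
    topic: str
    answer: str


def assemble_paper(bank: list[Question], job_topics: set[str],
                   preferred_topics: set[str], size: int) -> list[Question]:
    """Pick the questions that best match job requirements and preferences."""
    def priority(question: Question) -> int:
        return 2 * (question.topic in job_topics) + (question.topic in preferred_topics)

    ranked = sorted(bank, key=priority, reverse=True)  # stable: ties keep bank order
    return ranked[:size]


def score_paper(paper: list[Question], answers: dict[str, str]) -> float:
    """Percentage of questions on the paper answered correctly."""
    if not paper:
        return 0.0
    correct = sum(1 for q in paper if answers.get(q.qid) == q.answer)
    return 100.0 * correct / len(paper)
```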
Obtain each employee's feedback on the electric power knowledge test paper;
Optimize the content of each employee's next electric power knowledge test paper according to the feedback and the employee's score.
The power knowledge test papers in this application are obtained using web crawler technology. The specific operations are as follows:
1) First, select a set of seed URLs;
2) Put these URLs into the queue of URLs to be crawled;
3) Take a URL from the queue of URLs to be crawled, resolve its host with an online DNS resolver to obtain the host's IP address, and download the web page corresponding to the URL. The page is saved in a designated file area, or directly in the corresponding cloud storage, and is subsequently stored in the downloaded web page library. The URL is then put into the queue of crawled URLs;
4) Analyze the URLs in the queue of crawled URLs, extract the other URLs they contain, and put these URLs into the queue of URLs to be crawled, thus entering the next cycle.
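The four crawler steps above can be sketched as a breadth-first loop. This is an illustrative sketch rather than the application's implementation: `fetch` and `resolve` are hypothetical injected helpers so the loop can run without network access, and the `seen` set and `max_pages` limit are additions, not part of the description, that keep the loop from revisiting pages or running forever.

```python
import re
from collections import deque
from urllib.parse import urljoin, urlparse

# Simplified link extraction; a real crawler would use an HTML parser.
LINK_RE = re.compile(r'href="([^"]+)"')


def crawl(seed_urls, fetch, resolve, max_pages=100):
    """Breadth-first crawl following the four steps described above.

    fetch(url) returns the page HTML and resolve(hostname) returns an IP
    string; both are supplied by the caller.
    """
    to_crawl = deque(seed_urls)   # steps 1-2: seeds enter the to-crawl queue
    crawled = []                  # queue of URLs already crawled
    seen = set(seed_urls)         # assumed de-duplication of discovered URLs
    page_library = {}             # downloaded web page library: url -> html
    host_ips = {}                 # DNS results cached per host

    while to_crawl and len(crawled) < max_pages:
        url = to_crawl.popleft()  # step 3: take the next URL to crawl
        host = urlparse(url).hostname
        if host not in host_ips:
            host_ips[host] = resolve(host)
        html = fetch(url)
        page_library[url] = html
        crawled.append(url)

        # step 4: extract links and queue the ones not seen before
        for href in LINK_RE.findall(html):
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                to_crawl.append(link)

    return crawled, page_library, host_ips
```

In a real deployment, `resolve` could be `socket.gethostbyname` and `fetch` could wrap `urllib.request.urlopen`; here they are passed in so the loop can be exercised against an in-memory dictionary of pages.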
The present application discloses a power knowledge learning system and method based on heterogeneous resource fusion. The system comprises a resource layer, a business layer, and an application layer. The resource layer obtains power resource information. The business layer constructs a power knowledge base and a power knowledge graph from the power resource information, and constructs portraits of employees in different positions from employee information. The application layer provides personalized services for each employee, including power knowledge recommendation and power knowledge test paper evaluation.
The present application enables targeted training for each employee: it recommends power knowledge according to each employee's position requirements and preference characteristics, and provides each employee with personalized power knowledge test questions and evaluation. It also organizes the large and complex body of power-domain knowledge into a power knowledge base and a power knowledge graph, so that the relationships between pieces of power knowledge can be seen clearly, which improves the learning efficiency of power grid staff.
The present application has been described in detail above with reference to specific embodiments and illustrative examples, but these descriptions should not be construed as limiting the present application. Those skilled in the art will understand that various equivalent substitutions, modifications, or improvements can be made to the technical solutions and embodiments of the present application without departing from its spirit and scope, and all of these fall within the scope of the present application. The scope of protection of the present application is defined by the appended claims.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111064267.3A CN113742496B (en) | 2021-09-10 | 2021-09-10 | A power knowledge learning system and method based on heterogeneous resource fusion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113742496A (en) | 2021-12-03 |
CN113742496B (en) | 2024-05-28 |
Family
ID=78738106
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111064267.3A Active CN113742496B (en) | 2021-09-10 | 2021-09-10 | A power knowledge learning system and method based on heterogeneous resource fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113742496B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114491059B (en) * | 2021-12-23 | 2024-08-06 | 北京科东电力控制系统有限责任公司 | Construction method and system of power grid skill training knowledge base |
CN114519131B (en) * | 2021-12-29 | 2023-07-25 | 航天科工网络信息发展有限公司 | Knowledge fusion processing method and device for heterogeneous resources |
CN114611002A (en) * | 2022-03-18 | 2022-06-10 | 南方电网科学研究院有限责任公司 | Method and device for generating power grid operation and maintenance post knowledge recommendation system |
CN116737954A (en) * | 2023-06-09 | 2023-09-12 | 南方电网科学研究院有限责任公司 | Generation method, device and electronic device of power system teaching aid tool |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107066599A (en) * | 2017-04-20 | 2017-08-18 | 北京文因互联科技有限公司 | A kind of similar enterprise of the listed company searching classification method and system of knowledge based storehouse reasoning |
CN109359215A (en) * | 2018-12-03 | 2019-02-19 | 江苏曲速教育科技有限公司 | Video intelligent push method and system |
CN110543574A (en) * | 2019-08-30 | 2019-12-06 | 北京百度网讯科技有限公司 | A method, device, equipment and medium for constructing a knowledge graph |
CN110825881A (en) * | 2019-09-26 | 2020-02-21 | 中国电力科学研究院有限公司 | Method for establishing electric power knowledge graph |
CN110866850A (en) * | 2019-11-29 | 2020-03-06 | 北京仿真中心 | Accurate online education system of enterprise |
CN111753098A (en) * | 2020-06-23 | 2020-10-09 | 陕西师范大学 | A teaching method and system based on cross-media dynamic knowledge graph |
CN112860908A (en) * | 2021-01-27 | 2021-05-28 | 云南电网有限责任公司电力科学研究院 | Knowledge graph automatic construction method based on multi-source heterogeneous power equipment data |
KR20210063882A (en) * | 2019-11-25 | 2021-06-02 | 주식회사 데이터마케팅코리아 | A method and an apparatus for analyzing marketing information based on knowledge graphs supporting efficient classifying documents processing |
Also Published As
Publication number | Publication date |
---|---|
CN113742496A (en) | 2021-12-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113742496B (en) | A power knowledge learning system and method based on heterogeneous resource fusion | |
CN112131449B (en) | Method for realizing cultural resource cascade query interface based on ElasticSearch | |
CN107766371B (en) | Text information classification method and device | |
CN112749284A (en) | Knowledge graph construction method, device, equipment and storage medium | |
US11615241B2 (en) | Method and system for determining sentiment of natural language text content | |
CN110516067A (en) | Public opinion monitoring method, system and storage medium based on topic detection | |
CN117952209A (en) | Knowledge graph construction method and system | |
CN108376160A (en) | A kind of Chinese knowledge mapping construction method and system | |
CN111723256A (en) | A method and system for constructing government user portrait based on information resource database | |
CN113918725A (en) | A construction method of knowledge graph in water affairs field | |
CN115757689A (en) | Information query system, method and equipment | |
CN103514234A (en) | Method and device for extracting page information | |
US10740406B2 (en) | Matching of an input document to documents in a document collection | |
US8572118B2 (en) | Computer method and apparatus of information management and navigation | |
CN109947952A (en) | Search method, device, equipment and storage medium based on english knowledge map | |
CN113282955A (en) | Method, system, terminal and medium for extracting privacy information in privacy policy | |
CN113239111A (en) | Network public opinion visual analysis method and system based on knowledge graph | |
Sabou et al. | Towards semantically enhanced web service repositories | |
CN115713118A (en) | Power grid operation and maintenance post knowledge pushing method and system | |
CN118445406A (en) | Integration system based on massive polymorphic circuit heritage information | |
CN117171355A (en) | Cultural gene knowledge graph construction method and device | |
CN112035723A (en) | Resource library determination method and device, storage medium and electronic device | |
US8046360B2 (en) | Reduction of annotations to extract structured web data | |
CN118733713B (en) | Data processing method, data processing device and storage medium | |
CN108153754A (en) | A kind of data processing method and its device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |