
CN120185930B - Threat information-based network threat rule generation method - Google Patents

Threat information-based network threat rule generation method

Info

Publication number
CN120185930B
Authority
CN
China
Prior art keywords
threat intelligence
network threat
language model
api
sigma
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202510648405.4A
Other languages
Chinese (zh)
Other versions
CN120185930A (en
Inventor
汉京宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Ruining Xinchuang Technology Co ltd
Original Assignee
Jiangsu Ruining Xinchuang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Ruining Xinchuang Technology Co ltd filed Critical Jiangsu Ruining Xinchuang Technology Co ltd
Priority to CN202510648405.4A priority Critical patent/CN120185930B/en
Publication of CN120185930A publication Critical patent/CN120185930A/en
Application granted granted Critical
Publication of CN120185930B publication Critical patent/CN120185930B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/30Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
    • H04L63/302Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information gathering intelligence information for situation awareness or reconnaissance
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/02Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227Filtering policies
    • H04L63/0245Filtering by information in the payload
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425Traffic logging, e.g. anomaly detection
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40Network security protocols

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Technology Law (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a method for generating network threat rules based on threat intelligence, in the field of network security. The method crawls open source network threat intelligence; uses an image analysis prompt to guide a multimodal language model to convert image-class open source network threat intelligence into text; unifies the format of the text content to obtain initial network threat intelligence; filters the initial network threat intelligence with the assistance of a language model; has agents use voting to identify first-type entities and second-type entities in the filtered network threat intelligence and establish associations between them; uses a sigma rule creation prompt to control the agent to create sigma rules based on the filtered network threat intelligence blocks and the associated first-type and second-type entities extracted from those blocks; and uses a sigma rule optimization prompt and a sigma rule verification prompt to control the language model used by the agent to optimize and verify the generated sigma rules.

Description

Threat information-based network threat rule generation method
Technical Field
The invention relates to the technical field of network threat rule construction, in particular to a method for generating a network threat rule based on threat information.
Background
A sigma rule is a generic signature format for security event detection. Log analysis tools use sigma rule sets to help security analysts analyze and identify abnormal network behavior. A rule set contains rules customized for different attack patterns, and each sigma rule raises an alert when its condition is met. The structure of a sigma rule can be divided into three main parts: a header, an options part and a detection query part. The header contains basic information about the rule, such as its id, title, description, author and date; this information is critical to understanding the context, purpose and source of the rule. The options part defines the context requirements of the rule, such as process creation time or the use of a specific process ID. The detection query part is the core of the rule definition and describes the specific conditions to be detected, usually as fields in a log source and their values, which are matched exactly. The log source may be an operating system, an application or any other source that generates logs. These fields, combined with the corresponding values, define the conditions under which a security threat is detected. Sigma rules also provide condition combinations and logical operators for building richer rule expressions: the logical operators and, or and not connect different selections of a rule to describe complex detection scenarios. In addition, rule sets may build hierarchies and dependencies, so that related rules are organized into a hierarchical structure. Such a structure helps manage a large number of rules and improves their readability and maintainability; the relationships between rule sets can be inclusion and dependency.
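To make the structure described above concrete, the following is a minimal sketch, not taken from the patent, of a sigma-style rule held as a Python dictionary together with a toy evaluator for its detection query. The rule content, the `matches` and `evaluate` helpers and the sample event are all hypothetical; only the `contains` and `endswith` modifiers and conditions without parentheses are handled.

```python
# Hypothetical rule: header fields, a log source, and a detection query.
RULE = {
    "title": "Suspicious download via certutil",
    "id": "00000000-0000-0000-0000-000000000000",
    "logsource": {"category": "process_creation", "product": "windows"},
    "detection": {
        "selection": {"Image|endswith": "\\certutil.exe",
                      "CommandLine|contains": "urlcache"},
        "filter": {"User": "SYSTEM"},
        "condition": "selection and not filter",
    },
    "level": "medium",
}


def matches(event, selection):
    """Return True when every field/value pair of a selection matches the event."""
    for key, expected in selection.items():
        field, _, modifier = key.partition("|")
        value = str(event.get(field, ""))
        if modifier == "endswith":
            ok = value.endswith(expected)
        elif modifier == "contains":
            ok = expected in value
        else:
            ok = value == expected
        if not ok:
            return False
    return True


def evaluate(rule, event):
    """Combine the named selections with the and/or/not operators of the condition."""
    detection = rule["detection"]
    tokens = []
    for token in detection["condition"].split():
        if token in ("and", "or", "not"):
            tokens.append(token)
        else:
            tokens.append(str(matches(event, detection[token])))
    # Only the literals True/False and the three operators reach eval.
    return eval(" ".join(tokens), {"__builtins__": {}}, {})


event = {"Image": "C:\\Windows\\System32\\certutil.exe",
         "CommandLine": "certutil -urlcache -f http://example.invalid/a",
         "User": "alice"}
print(evaluate(RULE, event))
```

The same event with `User` set to `SYSTEM` is excluded by the `not filter` part of the condition.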
Open source cyber-threat intelligence is an important source for forming cyber-threat sigma rules. However, open source cyber-threat intelligence is typically published in unstructured form and often contains image information, so further human analysis is needed to turn it into formatted sigma rules. Because of its unstructured and image-based form, conventional automatic analysis of open source cyber-threat intelligence with regular-expression-style techniques is of limited use for forming sigma rules. With the development of natural language technology, natural language processing has been applied to the analysis of open source cyber-threat intelligence, using advanced natural language models to extract the content relevant to sigma rules from the intelligence text. However, adapting these models to the semantics of the cyber-threat intelligence domain requires a great deal of pre-processing and fine-tuning.
Disclosure of Invention
In order to solve the technical problems or at least partially solve the technical problems, the invention provides a network threat rule generation method based on threat intelligence.
In a first aspect, the present invention provides a method for generating a network threat rule based on threat intelligence, including:
crawling an open source network threat information webpage from an open source network threat information source through a network crawling tool;
guiding a multimodal language model, through an image analysis prompt, to convert the image-class open source network threat intelligence in the crawled webpage elements related to open source network threat intelligence into text;
converting the text content in the webpage element related to the open source network threat information into a unified text format to obtain initial network threat information;
analyzing, through a language model, keywords that indicate redundant content in the titles of the initial network threat intelligence and, for each target title containing such a keyword, removing the redundant content from the initial network threat intelligence according to the target title's position in the text hierarchy defined by all titles, to obtain filtered network threat intelligence;
providing the filtered network threat intelligence to at least one language-model-based agent, wherein the agent uses the semantic analysis capability of the language model and multi-agent voting to identify first-type entities and second-type entities in the filtered network threat intelligence and to establish associations between them; the first-type entities are those necessary to form the detection query part of a sigma rule and comprise API or process calls, request parameters of the API or process calls, indicators of compromise, log sources and event sources; the second-type entities provide the context of the network threat intelligence and comprise the title and description in the sigma rule, threat techniques, false positives and threat level;
using a sigma rule creation prompt to control the language model used by the agent to create sigma rules based on the filtered network threat intelligence blocks and on the associated first-type and second-type entities extracted from those blocks;
using a sigma rule optimization prompt to control the language model used by the agent to optimize the generated sigma rules;
and using a sigma rule verification prompt to control the language model used by the agent to verify the generated and optimized sigma rules.
Furthermore, the image analysis prompt defines the role of the multimodal language model as a network security analysis expert dedicated to text analysis of images from threat intelligence sources, defines its task as providing a text analysis of the supplied image, and defines the task requirements as: giving the image description in text form; presenting the description in a format that closely matches the appearance of the image; ensuring that the output description is relevant to the image content and reflects it accurately; ensuring that the output description is complete; and not adding any information about the potential use of the data, nor any interpretation, suggestion or opinion.
Further, the text content in the webpage element related to the open-source network threat information is converted into a unified text format, and in the process of obtaining the initial network threat information, the unified text format keeps the space, paragraph and code segmentation of the open-source network threat information content in the webpage element so as to keep the original layout of the text content, the title of the webpage element is marked by the unified text format so as to construct a corresponding hierarchical structure according to the content in the original HTML page by using the marked title, and the structural attribute of the HTML code containing the form and the nested list is kept by the unified text format.
Further, the providing the filtered cyber-threat intelligence to at least one language model-based agent, the agent identifying a first type of entity and a second type of entity from the filtered cyber-threat intelligence by utilizing semantic analysis capabilities of the language model and a manner of voting by multiple agents, and establishing a connection includes:
dividing the filtered network threat intelligence into semantically complete blocks through a text segmentation tool, wherein the content in each filtered network threat intelligence block is semantically uniform;
guiding the agent, through a first-type entity extraction prompt, to use the language model to extract the corresponding first-type entities from a filtered network threat intelligence block, wherein the agents vote on the extracted first-type entity results, and if the number of votes for any first-type entity exceeds a set vote threshold, the filtered network threat intelligence block is deemed to contain that first-type entity;
building a context block centered on the filtered network threat intelligence block from which the first-type entity was extracted, combined with a preset number of filtered network threat intelligence blocks before and after it in the text;
and guiding the language model used by the agent, through a second-type entity extraction prompt, to refer to the extracted first-type entities, extract the corresponding second-type entities from the context block, and establish the association between the second-type entities and the corresponding first-type entities.
Further, for the API or process calls among the first-type entities, an API or process call extraction prompt guides the language model used by the agent to extract the explicitly mentioned API calls or process calls from a filtered network threat intelligence block; the agents vote on the extracted API or process call results, and if the number of votes for any API or process call exceeds the set vote threshold, the filtered network threat intelligence block is deemed to contain that API or process call entity. The content of the API or process call extraction prompt comprises controlling the language model used by the agent to extract API or process call entities by word matching on words related to API or process calls, and further guiding the language model used by the agent to extract API or process call entities by semantic relevance analysis of API or process calls.
Furthermore, to enable the language model used by the agent to extract API or process call entities by semantic relevance analysis, a corresponding knowledge base is constructed for the agent. The knowledge base is a vectorized database that supports recognition by the language model used by the agent, and contains semantic descriptions of network operations together with the API or process calls corresponding to each description. After a filtered network threat intelligence block is provided to the language model used by the agent, the block is vectorized as a query, the top-k results most similar to the vectorized query are matched from the knowledge base, and the candidate API or process calls to be voted on are predicted from those top-k results.
Furthermore, extracting the threat techniques among the second-type entities requires configuring a prompt dictionary in the agent. The prompt dictionary contains a mapping from the API or process calls among the first-type entities to threat techniques, and a mapping from threat techniques to API or process calls. The agent uses the language model to extract the threat technique labels corresponding to an API or process call according to the content and semantics of the context block, with reference to the prompt dictionary.
Further, the content of the sigma rule creation prompt comprises: defining the role as a network security analysis expert dedicated to generating sigma rules from open source network threat intelligence contexts; defining the task as converting a network threat intelligence context containing the following elements into sigma rules: the analysis content describing attack techniques and tactics in the open source network threat intelligence, and the events in the open source network threat intelligence grouped by log source or event source, namely the first-type entities and the second-type entities related to the events; and defining the task requirements, namely that all provided event names, IDs, event sources and the corresponding techniques and tactics of the events must be used, that omitting any key information is forbidden, that each event name appears in only one sigma rule, that details in the network threat intelligence context that can be converted into parameters are extracted, that sigma rules with the same techniques and tactics are merged, and that the generated sigma rules are strictly formatted.
Furthermore, in the process of optimizing the generated sigma rule by using the sigma rule optimizing prompt words to control the language model used by the intelligent agent, the content of the sigma rule optimizing prompt words comprises the steps of merging detection query fields with the same detection query standard and consistent condition logic in the sigma rule, and dividing the detection query fields with the same detection query standard and inconsistent condition logic.
Further, in the process of verifying the generated and optimized sigma rule by using the sigma rule verification prompt word to control the language model used by the intelligent agent, the content of the used sigma rule verification prompt word comprises verifying whether the format of the generated and optimized sigma rule meets the requirement of the sigma rule format by referring to the provided sigma rule format, verifying whether metadata in the generated and optimized sigma rule meets the original content of the open source network threat information by referring to the provided open source network threat information, and verifying the accuracy of the conditions in the sigma rule by referring to the provided open source network threat information.
In a second aspect, the invention provides a threat information-based network threat rule generation apparatus, which comprises at least one processing unit, wherein the processing unit is connected with a storage unit through a bus unit, the storage unit stores a computer program, and the threat information-based network threat rule generation method is realized when the computer program is executed by the processing unit.
In a third aspect, the present invention provides a computer readable storage medium storing a computer program, which when executed by a processor, implements the threat intelligence based network threat rule generation method.
Compared with the prior art, the technical scheme provided by the embodiment of the invention has the following advantages:
The method crawls open source network threat intelligence; guides a multimodal language model through an image analysis prompt to convert the crawled image-class open source network threat intelligence into text; converts the text content into a unified text format to obtain initial network threat intelligence; filters the initial network threat intelligence with the assistance of a language model; provides the filtered network threat intelligence to at least one language-model-based agent, which identifies first-type entities and second-type entities in the filtered network threat intelligence by voting and establishes associations between them; and uses a sigma rule creation prompt to control the agent to create sigma rules based on the filtered network threat intelligence blocks and the associated first-type and second-type entities extracted from those blocks. The application thus uses a multimodal language model and agents built on a language model to generate sigma rules automatically from multimodal network threat intelligence. It achieves 92% accuracy and 96% recall on the critical task of extracting API or process calls among the first-type entities, and 98% accuracy and 97% recall on the critical task of extracting indicators of compromise among the first-type entities. In addition, 98.28% of the generated sigma rule candidates compiled successfully.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flowchart of a method for generating a network threat rule based on threat intelligence provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of a method for generating a network threat rule based on threat information according to an embodiment of the present invention;
FIG. 3 is a flow chart for providing filtered cyber-threat intelligence to at least one language-model-based agent, the agent identifying a first type of entity and a second type of entity from the filtered cyber-threat intelligence and establishing a relationship using semantic analysis capabilities of the language model and a manner of multi-agent voting, provided by an embodiment of the invention;
FIG. 4 is a schematic diagram of a network threat rule generating apparatus based on threat information according to an embodiment of the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises that element.
Example 1
As shown in fig. 1 and fig. 2, the technology of the present invention implements a method for generating a network threat rule based on threat intelligence, including:
S100, crawling the open source network threat information web page from the open source network threat information source through the network crawling tool.
The network crawling tool crawls the webpage codes of the open source network threat information sources, eliminates the webpage elements which are irrelevant to the open source network threat information in the webpage by checking the webpage elements, and determines the webpage elements which are relevant to the open source network threat information in the webpage.
S200, guiding the multimodal language model to convert the open source network threat information of the image class in the webpage element related to the open source network threat information into a text class through the image analysis prompt word.
In addition to text-class open source network threat intelligence, the webpage elements may also contain image-class open source network threat intelligence. The application uses an image analysis prompt to control the multimodal language model to convert the image-class open source network threat intelligence into text. An exemplary image analysis prompt is as follows. Role: a network security analysis expert dedicated to text analysis of images from threat intelligence sources. Task: provide a text analysis of the supplied image. Task requirements: give the image description in text form, with the main focus on extracting information from the image that helps in understanding the attack; present the description in a format that matches the content of the image; ensure that the output description is relevant to the image content and reflects it accurately; ensure that the output description is complete; do not add any information about the potential use of the data, nor any interpretation, suggestion or opinion. Exemplary multimodal language models include Qwen-VL, CogVLM2, CogAgent and GPT-4o.
S300, converting the text content in the webpage element related to the open source network threat information into a unified text format to obtain the initial network threat information. In this process, the unified text format preserves the whitespace, paragraphs and code segments of the open source network threat intelligence content in the webpage element, so that the original layout of the text content is kept; it marks the titles of the webpage element, so that the marked titles can be used to rebuild the hierarchy of the content in the original HTML page; and it preserves the structural attributes of HTML code containing tables and nested lists.
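As an illustration of step S300, the sketch below converts a fragment of HTML into a line-based text format with the standard-library `html.parser` module. The patent does not name the unified text format; the Markdown-like markers used here (`#` for titles, `- ` for list items, a three-backtick fence for code segments) and the `UnifiedTextConverter` class are assumptions made for the example, and tables are not handled.

```python
from html.parser import HTMLParser

FENCE = "`" * 3  # marker placed around preformatted code segments (assumed)
HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class UnifiedTextConverter(HTMLParser):
    """Hypothetical converter: keeps title levels, list nesting and code layout."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self.prefix = ""
        self.list_depth = 0
        self.in_pre = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.prefix = "#" * int(tag[1]) + " "      # mark the title level
        elif tag in ("ul", "ol"):
            self.list_depth += 1
        elif tag == "li":
            self.prefix = "  " * (self.list_depth - 1) + "- "
        elif tag == "pre":
            self.in_pre = True
            self.lines.append(FENCE)

    def handle_endtag(self, tag):
        if tag in ("ul", "ol"):
            self.list_depth -= 1
        elif tag == "pre":
            self.in_pre = False
            self.lines.append(FENCE)

    def handle_data(self, data):
        if self.in_pre:
            # Code segments keep their internal whitespace and line breaks.
            self.lines.extend(data.strip("\n").split("\n"))
        elif data.strip():
            self.lines.append(self.prefix + data.strip())
            self.prefix = ""


def to_unified_text(html):
    converter = UnifiedTextConverter()
    converter.feed(html)
    converter.close()
    return converter.lines


sample = ("<h1>Report</h1><p>Intro text</p><h2>Execution</h2>"
          "<ul><li>step one<ul><li>detail</li></ul></li></ul>"
          "<pre>cmd /c whoami\n  net user</pre>")
print("\n".join(to_unified_text(sample)))
```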
S400, analyzing, through a language model, keywords that indicate redundant content in the titles of the initial network threat intelligence and, for each target title containing such a keyword, removing the redundant content from the initial network threat intelligence according to the target title's position in the text hierarchy defined by all titles, to obtain filtered network threat intelligence.
In the implementation process, a language model is used to identify and locate keywords that indicate redundant content in the titles at every level, such as abstract, introduction, overview and conclusion. The level of the title containing the keyword is determined, and that title together with the redundant content beneath it is deleted. The content under abstract, introduction, overview and conclusion titles is usually a summary of more specific network threat intelligence content; it does not contribute to the generation of sigma rules and is therefore redundant.
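The deletion by title level can be sketched as follows. This is an illustrative approximation: it assumes titles are marked with leading `#` characters, and it replaces the language model's judgement with a fixed keyword list; `REDUNDANT_KEYWORDS`, `keyword_judge` and `drop_redundant_sections` are hypothetical names.

```python
REDUNDANT_KEYWORDS = ("abstract", "introduction", "overview", "summary", "conclusion")


def keyword_judge(title):
    """Stand-in for the language model: flags titles that announce summary content."""
    return any(keyword in title.lower() for keyword in REDUNDANT_KEYWORDS)


def heading_level(line):
    """Return the title level (number of leading '#'), or 0 for ordinary text."""
    stripped = line.lstrip()
    return len(stripped) - len(stripped.lstrip("#"))


def drop_redundant_sections(lines, is_redundant=keyword_judge):
    kept, skip_level = [], None
    for line in lines:
        level = heading_level(line)
        if level:
            # A title at the same or a higher level closes the skipped section.
            if skip_level is not None and level <= skip_level:
                skip_level = None
            if skip_level is None and is_redundant(line):
                skip_level = level
        if skip_level is None:
            kept.append(line)
    return kept


document = ["# Report", "## Overview", "summary text", "### Details", "more",
            "## Execution", "uses certutil", "## Conclusion", "bye"]
print(drop_redundant_sections(document))
```

Subsections nested under a redundant title (here `### Details`) are removed with it, because only a title at the same or a higher level ends the skipped span.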
S500, providing the filtered network threat intelligence to at least one language-model-based agent, wherein the agent uses the semantic analysis capability of the language model and multi-agent voting to identify first-type entities and second-type entities in the filtered network threat intelligence and to establish associations between them. The first-type entities are those necessary to form the detection query part of a sigma rule and comprise API or process calls, request parameters of the API or process calls, indicators of compromise, log sources and event sources. The second-type entities provide the context of the network threat intelligence and comprise the title and description in the sigma rule, threat techniques, false positives and threat level.
In the specific implementation process, as shown in fig. 3, step S500 includes the following steps:
S501, dividing the filtered network threat intelligence into semantically complete blocks through a text segmentation tool. The content in each filtered network threat intelligence block is semantically uniform, which helps the language model extract the first-type and second-type entities more accurately, block by block.
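The patent does not specify the text segmentation tool. As a rough sketch under stated assumptions, the function below approximates "semantically complete" blocks by keeping paragraphs of the same titled section together and starting a new block at each `#`-marked title or when a size limit is reached; `split_into_blocks` and `max_chars` are hypothetical.

```python
def split_into_blocks(text, max_chars=800):
    """Group blank-line-separated paragraphs into blocks, one section at a time."""
    blocks, current = [], ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        starts_section = paragraph.startswith("#")
        too_long = len(current) + len(paragraph) + 2 > max_chars
        if current and (starts_section or too_long):
            blocks.append(current)
            current = ""
        current = f"{current}\n\n{paragraph}" if current else paragraph
    if current:
        blocks.append(current)
    return blocks


text = "# A\n\npara one\n\npara two\n\n# B\n\npara three"
print(split_into_blocks(text))
```

A production splitter would more likely use sentence boundaries or embedding similarity; the section-based rule here is only the simplest stand-in.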
S502, guiding the agent, through a first-type entity extraction prompt, to use the language model to extract the corresponding first-type entities from a filtered network threat intelligence block. The agents vote on the extracted first-type entity results, and if the number of votes for any first-type entity exceeds a set vote threshold, the filtered network threat intelligence block is deemed to contain that first-type entity.
Taking API or process calls as an example: for the API or process calls among the first-type entities, an API or process call extraction prompt guides the language model used by the agent to extract the explicitly mentioned API calls or process calls from a filtered network threat intelligence block. The agents vote on the extracted API or process call results, and if the number of votes for any API or process call exceeds a set vote threshold, the filtered network threat intelligence block is deemed to contain that API or process call entity. The content of the API or process call extraction prompt comprises controlling the language model to extract API or process call entities by word matching on words related to process calls, such as Get, Call, Request, Create and Register, and further guiding the language model to extract API or process call entities by semantic relevance analysis of API or process calls. To enable the language model to perform this semantic relevance analysis, the application constructs a corresponding knowledge base for the agent. The knowledge base is a vectorized database that supports recognition by the language model, and contains semantic descriptions of network operations together with the API or process calls corresponding to each description. For example, for the semantic description "the attacker uses xxx to download a malicious payload", the API calls involved may be "urllib.request.urlretrieve" or "requests.get".
After a filtered network threat intelligence block is provided to the language model used by the agent, the block is vectorized and used as a query. The language model retrieves from the knowledge base the top-k results most similar to the vectorized query, and predicts from these top-k results the candidate API or process calls to be voted on.
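The retrieval step described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the bag-of-words `embed` function stands in for whatever embedding model a real system would use, and the `KNOWLEDGE_BASE` entries, the function names, and the parameter `k` are assumptions made for the example.

```python
import math

# Hypothetical knowledge base: semantic description -> related API or process calls.
KNOWLEDGE_BASE = [
    ("download a file from a remote url",
     ["urllib.request.urlretrieve", "requests.get"]),
    ("enumerate user accounts on the host", ["GetNetUser"]),
]


def embed(text, vocab):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    words = text.lower().split()
    return [float(words.count(w)) for w in vocab]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k_candidates(block, k=1):
    """Return candidate calls from the k descriptions most similar to the block."""
    vocab = sorted({w for desc, _ in KNOWLEDGE_BASE for w in desc.split()})
    query = embed(block, vocab)
    ranked = sorted(KNOWLEDGE_BASE,
                    key=lambda entry: cosine(query, embed(entry[0], vocab)),
                    reverse=True)
    candidates = []
    for _, calls in ranked[:k]:
        for call in calls:
            if call not in candidates:
                candidates.append(call)
    return candidates


block = "the attacker uses a script to download a malicious file from a remote url"
print(top_k_candidates(block, k=1))
```

The candidates returned here are only proposals; in the method they would still have to pass the vote threshold before being accepted as entities.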
In one example, 9 agents are set and the vote threshold is set to 5. If 6 of the 9 agents extract the GetNetUser call from a filtered network threat intelligence block, the GetNetUser call obtains 6 votes, which exceeds the vote threshold of 5. In another example, a single agent is set with a vote threshold of 5, and the agent's entity extraction process is run 9 times on a filtered network threat intelligence block. If the GetNetUser call is extracted in 6 of the runs, the GetNetUser call obtains 6 votes.
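The voting rule in these examples can be written as a short sketch. The function name `vote` and the list-of-lists input format are assumptions for illustration; each inner list is the set of entities returned by one agent (or by one run of a single agent).

```python
from collections import Counter


def vote(extractions, threshold):
    """Keep entities whose vote count strictly exceeds the threshold.

    Each element of `extractions` is the result of one agent or one run,
    and contributes at most one vote per entity.
    """
    counts = Counter()
    for result in extractions:
        counts.update(set(result))
    return {entity: n for entity, n in counts.items() if n > threshold}


# 9 extraction results: 6 contain GetNetUser, 3 contain requests.get.
results = [["GetNetUser"]] * 6 + [["requests.get"]] * 3
print(vote(results, threshold=5))
```

Because the text says the vote count must exceed the threshold, an entity with exactly 5 votes would not be accepted under a threshold of 5.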
In a preferred example, because extracting API or process call entities by semantic relevance analysis has lower accuracy, a higher vote threshold is set for such extractions to improve accuracy during voting.
S503, a context block is constructed by taking the filtered network threat intelligence block from which a first-type entity was extracted as the center and combining it with a preset number of filtered network threat intelligence blocks that precede and follow it in the text.
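A minimal sketch of this context block construction is shown below. The function name, the `window` parameter (the preset number of neighboring blocks on each side), and the blank-line separator are assumptions for illustration.

```python
def build_context_block(blocks, center, window):
    """Join the center block with up to `window` blocks on each side.

    `blocks` is the ordered list of filtered threat intelligence blocks and
    `center` is the index of the block a first-type entity was extracted from.
    """
    start = max(0, center - window)
    end = min(len(blocks), center + window + 1)
    return "\n\n".join(blocks[start:end])


blocks = ["b0", "b1", "b2", "b3", "b4"]
print(build_context_block(blocks, center=2, window=1))
```

Clamping the range at both ends keeps the sketch valid when the central block is near the start or end of the document.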
S504, the second-type entity extraction prompt guides the language model used by the agent to extract, with reference to the extracted first-type entities, the corresponding second-type entities from the context block. That is, the application constructs a context block by combining a set number of filtered network threat intelligence blocks before and after the central block, extracts the second-type entities from the context block, and establishes the association between them and the corresponding first-type entities.
In a specific implementation, the language model used by the agent can extract the title, description, false positives, and threat level among the second-type entities based on its semantic analysis capability. Extracting the threat techniques and tactics requires a prompt dictionary to be configured in the agent. The prompt dictionary contains the mapping from API or process calls to threat techniques and tactics and the mapping from threat techniques and tactics to API or process calls. Based on the content of the context block and the extracted semantics, and with reference to the prompt dictionary, the language model extracts the threat technique and tactic labels corresponding to each API or process call.
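One way to picture the prompt dictionary is as a pair of lookup tables, one per direction. The sketch below is illustrative only: the specific pairings of calls with MITRE ATT&CK technique IDs (T1105 Ingress Tool Transfer, T1087 Account Discovery) are assumptions chosen for the example, not mappings given in the source, and the function names are hypothetical.

```python
# Hypothetical forward mapping: API or process call -> technique labels.
API_TO_TECHNIQUE = {
    "urllib.request.urlretrieve": ["T1105"],
    "requests.get": ["T1105"],
    "GetNetUser": ["T1087"],
}


def invert(mapping):
    """Build the reverse mapping: technique label -> API or process calls."""
    reverse = {}
    for call, techniques in mapping.items():
        for technique in techniques:
            reverse.setdefault(technique, []).append(call)
    return reverse


def dictionary_hint(calls, mapping):
    """Render the dictionary entries relevant to the extracted calls as prompt text."""
    lines = []
    for call in calls:
        labels = ", ".join(mapping.get(call, [])) or "unknown"
        lines.append(f"{call}: {labels}")
    return "\n".join(lines)


TECHNIQUE_TO_API = invert(API_TO_TECHNIQUE)
print(TECHNIQUE_TO_API["T1105"])
print(dictionary_hint(["GetNetUser"], API_TO_TECHNIQUE))
```

The rendered hint would be placed in the prompt alongside the context block so that the language model can refer to it when labeling each call.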
In one embodiment, the second-type entities are also extracted through voting. The voting principle is the same as for the first-type entities and is not repeated here.
S600, using the Sigma rule creation prompt to control the language model used by the agent to create Sigma rules based on the filtered network threat intelligence block and the associated first-type and second-type entities extracted from it.
The content of the Sigma rule creation prompt includes:
Role: a network security analysis expert dedicated to generating Sigma rules from open source network threat intelligence context.
Task: convert network threat intelligence context containing the following elements into Sigma rules: the analysis content describing attack techniques and tactics in the open source network threat intelligence; and the events in the open source network threat intelligence, grouped by log source or event source, together with the first-type and second-type entities involved in each event.
Task requirements: all provided event names, IDs, event sources, and the techniques and tactics corresponding to the events must be used; no key information may be omitted; each event name appears in only one Sigma rule; details in the network threat intelligence context that can be converted into parameters are extracted; Sigma rules with the same techniques and tactics are merged; and the generated Sigma rules are strictly formatted.
In a specific implementation, the filtered network threat intelligence block and the associated first-type and second-type entities extracted from it are provided, together with the Sigma rule creation prompt, to the language model used by the agent. The language model then creates preliminary Sigma rules from the block and its associated entities according to the content of the Sigma rule creation prompt.
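As a rough sketch of how the creation prompt and its inputs might be assembled into a single request, consider the following. The function name, the section labels, and the abbreviated wording of the role, task, and requirements are assumptions for illustration; the call to the language model itself is deliberately left out.

```python
def build_creation_prompt(block, first_entities, second_entities):
    """Assemble role, task, requirements, and inputs into one prompt string."""
    parts = [
        "Role: network security analysis expert generating Sigma rules "
        "from open source network threat intelligence context.",
        "Task: convert the context below into Sigma rules.",
        "Requirements: use every provided event name, ID, event source and "
        "technique; omit no key information; each event name appears in only "
        "one rule; merge rules with the same technique; format strictly.",
        "Threat intelligence block:",
        block,
        "First-type entities: " + ", ".join(first_entities),
        "Second-type entities: " + ", ".join(second_entities),
    ]
    return "\n".join(parts)


prompt = build_creation_prompt(
    block="The sample enumerates local accounts via GetNetUser.",
    first_entities=["GetNetUser"],
    second_entities=["threat level: medium"],
)
print(prompt)
```

Keeping the entities in their own labeled lines makes it easier to check afterwards that none of them were dropped from the generated rule.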
S700, using the Sigma rule optimization prompt to control the language model used by the agent to optimize the generated Sigma rules. The content of the Sigma rule optimization prompt includes: merging detection query fields in a Sigma rule that have the same detection query criteria and consistent condition logic; and splitting detection query fields that have the same detection query criteria but inconsistent condition logic. In a generated Sigma rule, some conditions with the same detection query criteria and the same condition logic may have been separated into different detection query fields; these fields are merged using the language analysis capability of the language model used by the agent. Conversely, some conditions with the same detection query criteria but inconsistent condition logic may have been placed in the same detection query field; these fields are split using the language analysis capability of the language model, and the condition logic is adjusted accordingly.
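The merge and split behavior can be illustrated with a deterministic sketch, under one possible reading of the text: "detection query criteria" is taken to mean a field name plus match modifier, and "condition logic" is taken to mean whether the values are combined with "or" or "and". In the method itself this step is performed by the language model, so the tuple representation and the function name below are purely assumptions for illustration.

```python
def merge_and_split(conditions):
    """Group conditions by (field, modifier, logic).

    Conditions that share field, modifier, and logic are merged into one
    value list; conditions that share field and modifier but differ in
    logic end up in separate groups.
    """
    groups = {}
    for field, modifier, value, logic in conditions:
        values = groups.setdefault((field, modifier, logic), [])
        if value not in values:
            values.append(value)
    return groups


conditions = [
    ("CommandLine", "contains", "net user", "or"),
    ("CommandLine", "contains", "net group", "or"),
    ("CommandLine", "contains", "/domain", "and"),
]
for key, values in merge_and_split(conditions).items():
    print(key, values)
```

Here the two "or" conditions collapse into a single group while the "and" condition stays separate, which mirrors the merge and split cases described above.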
S800, using the Sigma rule verification prompt to control the language model used by the agent to verify the generated and optimized Sigma rules. The content of the Sigma rule verification prompt includes: verifying, with reference to the provided Sigma rule format, whether the format of the generated and optimized Sigma rules meets the Sigma rule format requirements; verifying, with reference to the provided open source network threat intelligence, whether the metadata in the generated and optimized Sigma rules is consistent with the original open source network threat intelligence content; and verifying, with reference to the provided open source network threat intelligence, the accuracy of the conditions in the Sigma rules.
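The format part of this verification can also be approximated by a simple structural check, sketched below as a complement to the language model. It assumes the rule has already been parsed into a dictionary, that `condition` is a single string, and that `title`, `logsource`, and `detection` are the fields to require; the function name and the simplified condition tokenizer are assumptions, and the check does not cover the full Sigma condition syntax.

```python
import re

REQUIRED_FIELDS = ("title", "logsource", "detection")
CONDITION_KEYWORDS = {"and", "or", "not", "of", "them", "all"}


def check_rule_format(rule):
    """Return a list of structural problems found in a parsed rule dict."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in rule]
    detection = rule.get("detection", {})
    condition = detection.get("condition")
    if "detection" in rule and not condition:
        problems.append("detection has no condition")
    elif condition:
        # Every plain identifier in the condition must name a defined selection.
        for name in re.findall(r"[A-Za-z_][A-Za-z0-9_]*\*?", condition):
            if name in CONDITION_KEYWORDS or name.endswith("*"):
                continue
            if name not in detection:
                problems.append(f"condition references undefined name: {name}")
    return problems


rule = {
    "title": "Account enumeration via GetNetUser",
    "detection": {
        "selection": {"CommandLine|contains": "GetNetUser"},
        "condition": "selection and filter",
    },
}
print(check_rule_format(rule))
```

A check like this only covers structure; whether the metadata and conditions agree with the original threat intelligence still requires comparison against the source text.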
The language model used for the filtering task and the language model used by the agent may be the same or different. Examples include gml, deepseek, and qwen.
Example 2
Referring to fig. 4, an embodiment of the present invention provides a threat intelligence-based network threat rule generating apparatus. The apparatus includes at least one processing unit connected to a storage unit through a bus unit. The storage unit, as a computer readable storage medium, may be used to store software programs, computer executable programs, and modules, such as those corresponding to the threat intelligence-based network threat rule generation method in the embodiments of the present invention. The processing unit executes the software programs, computer executable programs, and modules stored in the storage unit to implement the threat intelligence-based network threat rule generation method, which comprises the following steps:
crawling open source network threat intelligence web pages from open source network threat intelligence sources using a web crawling tool;
guiding a multimodal language model, through an image analysis prompt, to convert the image-class open source network threat intelligence in the web page elements related to open source network threat intelligence into text;
converting the textual content in the web page elements related to open source network threat intelligence into a unified text format to obtain initial network threat intelligence;
analyzing, through a language model, keywords indicating redundant content in the titles of the initial network threat intelligence and, for each target title containing such keywords, removing the repeated redundant content from the initial network threat intelligence according to the position of the target title in the text structure hierarchy defined by all titles, to obtain filtered network threat intelligence;
providing the filtered network threat intelligence to at least one language-model-based agent, wherein the agent uses the semantic analysis capability of the language model and multi-agent voting to identify first-type entities and second-type entities from the filtered network threat intelligence and to establish associations between them; the first-type entities are the entities necessary to form the detection query part of a Sigma rule and include API or process calls, request parameters of the API or process calls, intrusion indicators, log sources, and event sources; the second-type entities provide context information for the network threat intelligence and include the title and description in the Sigma rule, threat techniques and tactics, false positives, and threat level;
and using the Sigma rule creation prompt to control the language model used by the agent to create Sigma rules based on the filtered network threat intelligence blocks and the associated first-type and second-type entities extracted from those blocks.
Of course, in the apparatus provided by the embodiment of the present invention, the programs stored in the storage unit are not limited to the method operations described above, and can also perform the related operations of the threat intelligence-based network threat rule generation method provided by any embodiment of the present invention.
Example 3
An embodiment of the present invention provides a computer readable storage medium storing a computer program which, when executed, implements the threat intelligence-based network threat rule generation method, comprising the following steps:
crawling open source network threat intelligence web pages from open source network threat intelligence sources using a web crawling tool;
guiding a multimodal language model, through an image analysis prompt, to convert the image-class open source network threat intelligence in the web page elements related to open source network threat intelligence into text;
converting the textual content in the web page elements related to open source network threat intelligence into a unified text format to obtain initial network threat intelligence;
analyzing, through a language model, keywords indicating redundant content in the titles of the initial network threat intelligence and, for each target title containing such keywords, removing the repeated redundant content from the initial network threat intelligence according to the position of the target title in the text structure hierarchy defined by all titles, to obtain filtered network threat intelligence;
providing the filtered network threat intelligence to at least one language-model-based agent, wherein the agent uses the semantic analysis capability of the language model and multi-agent voting to identify first-type entities and second-type entities from the filtered network threat intelligence and to establish associations between them; the first-type entities are the entities necessary to form the detection query part of a Sigma rule and include API or process calls, request parameters of the API or process calls, intrusion indicators, log sources, and event sources; the second-type entities provide context information for the network threat intelligence and include the title and description in the Sigma rule, threat techniques and tactics, false positives, and threat level;
and using the Sigma rule creation prompt to control the language model used by the agent to create Sigma rules based on the filtered network threat intelligence blocks and the associated first-type and second-type entities extracted from those blocks.
The computer program stored in the computer readable storage medium provided by the embodiment of the present invention is not limited to the method operations described above, and can also perform the related operations of the threat intelligence-based network threat rule generation method provided by any embodiment of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed structures and methods may be implemented in other manners. The structural embodiments described above are merely illustrative. For example, the division of the units is merely a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the coupling, direct coupling, or communication connection shown or discussed between components may be an indirect coupling or communication connection via interfaces, structures, or units, and may be electrical, mechanical, or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for generating network threat rules based on threat intelligence, characterized by comprising: crawling open source network threat intelligence web pages from open source network threat intelligence sources using a web crawling tool; guiding a multimodal language model, through an image analysis prompt, to convert the image-class open source network threat intelligence in the crawled web page elements related to open source network threat intelligence into text; converting the textual content in the web page elements related to open source network threat intelligence into a unified text format to obtain initial network threat intelligence; analyzing, through a language model, keywords indicating redundant content in the titles of the initial network threat intelligence and, for each target title containing such keywords, excluding the repeated redundant content from the initial network threat intelligence according to the position of the target title in the text structure hierarchy defined by all titles, to obtain filtered network threat intelligence; providing the filtered network threat intelligence to at least one language-model-based agent, the agent using the semantic analysis capability of the language model and multi-agent voting to identify first-type entities and second-type entities from the filtered network threat intelligence and to establish associations between them; the first-type entities being the entities necessary to form the detection query part of a Sigma rule and including: API or process calls, request parameters of the API or process calls, intrusion indicators, log sources, and event sources; the second-type entities providing context information for the network threat intelligence and including the title and description in the Sigma rule, threat techniques and tactics, false positives, and threat level; using a Sigma rule creation prompt to control the language model used by the agent to create Sigma rules based on the filtered network threat intelligence blocks and the associated first-type and second-type entities extracted from those blocks; using a Sigma rule optimization prompt to control the language model used by the agent to optimize the generated Sigma rules; and using a Sigma rule verification prompt to control the language model used by the agent to verify the generated and optimized Sigma rules.
2. The method for generating network threat rules based on threat intelligence according to claim 1, characterized in that the image analysis prompt defines the multimodal language model as a network security analysis expert dedicated to performing text analysis on images from threat intelligence sources; defines the task of the multimodal language model as providing a text analysis of the image; and defines task requirements including: giving the image description in text form, presenting the image description in a format that closely matches the appearance of the image, ensuring that the output description is relevant to and accurately reflects the image content, and ensuring that the output description is complete and does not add anything about potential uses of the data or any explanations, suggestions, or opinions.
3. The method for generating network threat rules based on threat intelligence according to claim 1, characterized in that, in converting the textual content in the web page elements related to open source network threat intelligence into a unified text format to obtain the initial network threat intelligence, the unified text format preserves the spaces, paragraphs, and code segments of the open source network threat intelligence content in the web page elements so as to retain the original layout of the textual content; the unified text format marks the titles of the web page elements so that the marked titles can be used to construct a hierarchical structure that preserves the content of the original HTML page; and for HTML code containing tables and nested lists, the unified text format retains their structural attributes.
4. The method for generating network threat rules based on threat intelligence according to claim 1, characterized in that providing the filtered network threat intelligence to at least one language-model-based agent, the agent using the semantic analysis capability of the language model and multi-agent voting to identify first-type entities and second-type entities from the filtered network threat intelligence and to establish associations between them, comprises: splitting the filtered network threat intelligence into semantically complete blocks using a text splitting tool, the content within each filtered network threat intelligence block being semantically unified; guiding, through a first-type entity extraction prompt, the language model used by the agent to extract the corresponding first-type entities from the filtered network threat intelligence block, the agent voting according to its own first-type entity extraction results, and, if the number of votes for any first-type entity exceeds a set vote threshold, the filtered network threat intelligence block being taken to contain that first-type entity; constructing a context block by taking the filtered network threat intelligence block from which a first-type entity was extracted as the center and combining it with a set number of filtered network threat intelligence blocks that precede and follow it in the text; and guiding, through a second-type entity extraction prompt, the language model used by the agent to extract, with reference to the extracted first-type entities, the corresponding second-type entities from the context block and to establish the association between them and the corresponding first-type entities.
5. The method for generating network threat rules based on threat intelligence according to claim 4, characterized in that, for the API or process calls among the first-type entities, an API or process call extraction prompt guides the language model used by the agent to extract the explicitly mentioned API calls or process calls from the filtered network threat intelligence block, the agent votes according to its own API or process call extraction results, and, if the number of votes for any API or process call exceeds the set vote threshold, the filtered network threat intelligence block is taken to contain that API or process call entity; wherein the content of the API or process call extraction prompt includes: controlling the language model used by the agent to extract API or process call entities by word matching, using words that API or process calls involve; and further guiding the language model used by the agent to extract API or process call entities by API or process call semantic relevance analysis.
6. The method for generating network threat rules based on threat intelligence according to claim 5, characterized in that, in order to guide the language model used by the agent to extract API or process call entities by API or process call semantic relevance analysis, a corresponding knowledge base is constructed for the agent, the knowledge base being a vectorized database that the language model used by the agent can query, and the knowledge base including semantic descriptions of network operations and the API or process calls corresponding to those semantic descriptions; after the filtered network threat intelligence block is provided to the language model used by the agent, the block is vectorized as a query, the language model used by the agent retrieves from the knowledge base the top-k results most similar to the vectorized query, and the candidate API or process calls to be voted on are predicted from the top-k results.
7. The method for generating network threat rules based on threat intelligence according to claim 5, characterized in that extracting the threat techniques and tactics among the second-type entities requires a prompt dictionary to be configured in the agent, the prompt dictionary containing the mapping from the API or process calls among the first-type entities to threat techniques and tactics and the mapping from threat techniques and tactics to API or process calls; and the language model used by the agent, based on the content of the context block and the extracted semantics and with reference to the prompt dictionary, extracts the threat technique and tactic labels corresponding to the API or process calls.
8. The method for generating network threat rules based on threat intelligence according to claim 1, characterized in that the content of the Sigma rule creation prompt includes: defining a role: a network security analysis expert dedicated to generating Sigma rules from open source network threat intelligence context; defining a task: converting network threat intelligence context containing the following elements into Sigma rules: the analysis content describing attack techniques and tactics in the open source network threat intelligence; and the events in the open source network threat intelligence, grouped by log source or event source, together with the first-type and second-type entities involved in the events; and defining task requirements: all provided event names, IDs, event sources, and the techniques and tactics corresponding to the events must be used; no key information may be omitted; each event name appears in only one Sigma rule; details in the network threat intelligence context that can be converted into parameters are extracted; Sigma rules with the same techniques and tactics are merged; and the generated Sigma rules are strictly formatted.
9. The method for generating network threat rules based on threat intelligence according to claim 1, characterized in that, in using the Sigma rule optimization prompt to control the language model used by the agent to optimize the generated Sigma rules, the content of the Sigma rule optimization prompt includes: merging detection query fields in a Sigma rule that have the same detection query criteria and consistent condition logic; and splitting detection query fields that have the same detection query criteria but inconsistent condition logic.
10. The method for generating network threat rules based on threat intelligence according to claim 1, characterized in that, in using the Sigma rule verification prompt to control the language model used by the agent to verify the generated and optimized Sigma rules, the content of the Sigma rule verification prompt includes: verifying, with reference to the provided Sigma rule format, whether the format of the generated and optimized Sigma rules actually meets the Sigma rule format requirements; verifying, with reference to the provided open source network threat intelligence, whether the metadata in the generated and optimized Sigma rules is consistent with the original open source network threat intelligence content; and verifying, with reference to the provided open source network threat intelligence, the accuracy of the conditions in the Sigma rules.
CN202510648405.4A 2025-05-20 2025-05-20 Threat information-based network threat rule generation method Active CN120185930B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202510648405.4A CN120185930B (en) 2025-05-20 2025-05-20 Threat information-based network threat rule generation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202510648405.4A CN120185930B (en) 2025-05-20 2025-05-20 Threat information-based network threat rule generation method

Publications (2)

Publication Number Publication Date
CN120185930A CN120185930A (en) 2025-06-20
CN120185930B true CN120185930B (en) 2025-08-05

Family

ID=96030040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202510648405.4A Active CN120185930B (en) 2025-05-20 2025-05-20 Threat information-based network threat rule generation method

Country Status (1)

Country Link
CN (1) CN120185930B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117523301A (en) * 2023-11-17 2024-02-06 中电云计算技术有限公司 Event fusion analysis processing method and system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20250014244A (en) * 2023-07-19 2025-02-03 주식회사 샌즈랩 Apparatus for processing cyber threat information, method for processing cyber threat information, and medium for storing a program processing cyber threat information
CN117725458A (en) * 2023-08-28 2024-03-19 陈镜冰 Method and device for obtaining threat information sample data generation model
US12149558B1 (en) * 2024-03-13 2024-11-19 QRShield LLC Cybersecurity architectures for multi-contextual risk quantification
CN118138362A (en) * 2024-04-10 2024-06-04 宁波和利时信息安全研究院有限公司 LLM-driven self-adaptive industrial network safety protection method and firewall device
CN119109635A (en) * 2024-08-22 2024-12-10 浪潮云信息技术股份公司 A method and system for intelligent analysis of network false threat intelligence based on a large security model
CN119728233A (en) * 2024-12-19 2025-03-28 北京天融信网络安全技术有限公司 A network threat evidence collection method, program product, electronic device and storage medium

Also Published As

Publication number Publication date
CN120185930A (en) 2025-06-20

Similar Documents

Publication Publication Date Title
US10942977B2 (en) Systems and methods for targeting, reviewing, and presenting online social network data by evidence context
US7174507B2 (en) System method and computer program product for obtaining structured data from text
CN120641878A (en) Generative AI crawling and chunking
RU2605077C2 (en) Method and system for storing and searching information extracted from text documents
KR20120009446A (en) Automated Semantic Labeling System and Method for Natural Language Text
RU61442U1 (en) SYSTEM OF AUTOMATED ORDERING OF UNSTRUCTURED INFORMATION FLOW OF INPUT DATA
JP2004362563A (en) System, method, and computer program recording medium for performing unstructured information management and automatic text analysis
US20220358379A1 (en) System, apparatus and method of managing knowledge generated from technical data
WO2023278052A1 (en) Automated troubleshooter
Han et al. Text Summarization Using FrameNet‐Based Semantic Graph Model
CN111209753B (en) Entity naming identification method and device
JP2025012467A (en) Information processing device, information processing method, and program
US20250061140A1 (en) Systems and methods for enhancing search using semantic search results
CN112507089A (en) Intelligent question-answering engine based on knowledge graph and implementation method thereof
Popova et al. Multilevel ontologies for big data analysis and processing
CN119903159A (en) A knowledge question-answering fast processing system based on artificial intelligence
CN118886415B (en) Interactive data analysis method based on large model
CN119646016A (en) Data query method, device, electronic device, medium and program product
CN116756341B (en) A complete knowledge graph construction method based on multi-source vulnerability data
Rogushina et al. Use of ontologies for metadata records analysis in big data
CN120185930B (en) Threat information-based network threat rule generation method
Haj et al. Automated generation of terminological dictionary from textual business rules
CN110705252A (en) Technical contract determination method, electronic device, computer device, and storage medium
Vickers Ontology-based free-form query processing for the semantic web
Kaladevi et al. Development of Background Ontology for Weather Systems through Ontology Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant