
CN118171110B - Business log text processing method and system applied to intelligent government affairs - Google Patents

Business log text processing method and system applied to intelligent government affairs

Info

Publication number
CN118171110B
Authority
CN
China
Prior art keywords
log
text
training
service
business
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410593094.1A
Other languages
Chinese (zh)
Other versions
CN118171110A (en)
Inventor
李壮壮
李昕玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhongke Jindezhu Intelligent Technology Co.,Ltd.
Original Assignee
Beijing Zhongkejin Finite Element Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongkejin Finite Element Technology Co ltd
Priority to CN202410593094.1A
Publication of CN118171110A
Application granted
Publication of CN118171110B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/169 Annotation, e.g. comment data or footnotes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to the technical fields of intelligent government affairs and artificial intelligence, and in particular to a business log text processing method and system applied to intelligent government affairs. The technique overcomes the single information source and update lag of traditional methods, and can track and predict business demand accurately from real-time user behavior data. Embodiments of the application can therefore respond to user demand more intelligently and improve service efficiency and user satisfaction. The embodiments are also self-optimizing: the network variables of the structured text processing network are adjusted continuously according to the training error during actual operation, so that the network adapts to changing government service requirements. The technical scheme thus provides technical support for the sustainable development of intelligent government affairs.

Description

Business log text processing method and system applied to intelligent government affairs
Technical Field
The application relates to the technical field of intelligent government affairs and artificial intelligence, in particular to a business log text processing method and system applied to intelligent government affairs.
Background
In the field of intelligent government affairs, accurately tracking and understanding business requirements has long been a key problem in improving the efficiency and quality of government services. Traditional government service systems often rely only on static business log data for analysis and cannot make full use of dynamic user behavior information, which limits the accuracy and timeliness of business demand tracking.
Disclosure of Invention
To address the above problems, the application provides a business log text processing method and system applied to intelligent government affairs.
The embodiment of the application provides a business log text processing method applied to intelligent government affairs. The method is applied to an NLP log text processing system and comprises the following steps:
Acquiring a prior intelligent government affair log set and training labels of the prior intelligent government affair log set, wherein the prior intelligent government affair log set comprises a first business log training text and a second business log training text, and the training labels comprise the session topic viewpoint corresponding to each business session segment in the first business log training text and the session topic viewpoint corresponding to each business session segment in the second business log training text;
Mining a first business log knowledge training vector of the first business log training text and a second business log knowledge training vector of the second business log training text through a structured text processing network;
Integrating a page behavior vector of page user behavior data with the first business log knowledge training vector through the structured text processing network to obtain first multi-modal government integrated knowledge; the page user behavior data comprises X session topic viewpoints, wherein X is the total number of session topic viewpoints and is an integer greater than 1;
Integrating the page behavior vector with the second business log knowledge training vector through the structured text processing network to obtain second multi-modal government integrated knowledge;
Carrying out business demand tracking according to the first multi-modal government integrated knowledge and the second multi-modal government integrated knowledge through the structured text processing network to obtain a business demand tracking label of the second business log training text compared with the first business log training text;
Determining a training error according to the business demand tracking label and the training labels of the prior intelligent government affair log set;
Optimizing network variables of the structured text processing network according to the training error.
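The data flow of the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the network of the application: the character-histogram `encode`, the re-weighting `fuse`, the two scalar parameters `w` and `b`, the index-wise pairing of segments and the example texts are all hypothetical stand-ins chosen so that the sketch runs without external libraries.

```python
import math


def encode(segment, dim=8):
    """Toy encoder (assumption): a normalised character histogram."""
    vec = [0.0] * dim
    for ch in segment:
        vec[ord(ch) % dim] += 1.0
    total = sum(vec) or 1.0
    return [v / total for v in vec]


def fuse(text_vec, page_vec):
    """Toy fusion (assumption): page behaviour re-weights each text feature."""
    return [t * (1.0 + p) for t, p in zip(text_vec, page_vec)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(first_segs, second_segs, page_vec, params):
    """Return (squared distance, change probability) per index-aligned pair."""
    out = []
    for a, b in zip(first_segs, second_segs):
        fa, fb = fuse(encode(a), page_vec), fuse(encode(b), page_vec)
        dist = sum((x - y) ** 2 for x, y in zip(fa, fb))
        out.append((dist, sigmoid(params["w"] * dist + params["b"])))
    return out


def loss(preds, labels):
    """Mean binary cross-entropy between predicted and labelled changes."""
    eps = 1e-12
    total = sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for (_, p), y in zip(preds, labels))
    return -total / len(labels)


def train_step(first_segs, second_segs, page_vec, labels, params, lr=0.5):
    """One gradient-descent update; returns the loss measured before the update."""
    preds = forward(first_segs, second_segs, page_vec, params)
    n = len(labels)
    grad_w = sum((p - y) * d for (d, p), y in zip(preds, labels)) / n
    grad_b = sum((p - y) for (_, p), y in zip(preds, labels)) / n
    params["w"] -= lr * grad_w
    params["b"] -= lr * grad_b
    return loss(preds, labels)


first_segments = ["how do I register for tax",
                  "what documents are required",
                  "where is the service hall"]
second_segments = ["how do I register for tax",
                   "can I file the form online",
                   "is there a late fee"]
page_vector = [0.9, 0.1, 0.0, 0.3, 0.0, 0.5, 0.2, 0.0]
labels = [0, 1, 1]  # 1 = the segment's topic viewpoint changed (from the training labels)

params = {"w": 0.0, "b": 0.0}
losses = [train_step(first_segments, second_segments, page_vector, labels, params)
          for _ in range(20)]
```

Here the training error is a binary cross-entropy and the optimisation is plain gradient descent; the application itself does not fix either choice.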
Preferably, the first business log knowledge training vector includes a business session attribute vector of each business session segment in the first business log training text; the integrating the page behavior vector of the page user behavior data with the first business log knowledge training vector to obtain the first multi-modal government integrated knowledge comprises the following steps:
Integrating each business session attribute vector in the first business log knowledge training vector with the page behavior vector based on a linkage feature focusing strategy to obtain a first linkage integration vector;
Projecting the first linkage integration vector to a target knowledge vector coordinate system to obtain a second linkage integration vector;
Projecting the page behavior vector to the target knowledge vector coordinate system to obtain a first page behavior vector;
Integrating the second linkage integration vector and the first page behavior vector based on a feature integration factor to obtain the first multi-modal government integrated knowledge; the network variables of the structured text processing network include the feature integration factor.
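One possible reading of this fusion is sketched below. It assumes that the linkage feature focusing strategy behaves like scaled dot-product attention with the page behavior vector as the query, that each projection is a linear map, and that the feature integration factor is a scalar mixing weight; none of these choices is confirmed by the source, and all function and parameter names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Linear projection into the target knowledge vector coordinate system."""
    return [dot(row, vec) for row in matrix]


def linkage_integrate(session_vecs, page_vec):
    """Attention-style focusing (assumption): the page vector queries the
    session attribute vectors and the result is their weighted average."""
    scale = math.sqrt(len(page_vec))
    weights = softmax([dot(page_vec, s) / scale for s in session_vecs])
    dim = len(session_vecs[0])
    return [sum(w * s[i] for w, s in zip(weights, session_vecs))
            for i in range(dim)]


def fuse(session_vecs, page_vec, proj_text, proj_page, factor):
    first_linkage = linkage_integrate(session_vecs, page_vec)
    second_linkage = matvec(proj_text, first_linkage)   # projected text side
    first_page = matvec(proj_page, page_vec)            # projected page side
    # The feature integration factor is a trainable scalar in this sketch.
    return [factor * t + (1.0 - factor) * p
            for t, p in zip(second_linkage, first_page)]


identity = [[1.0, 0.0], [0.0, 1.0]]
session_vectors = [[1.0, 0.0], [0.0, 1.0]]
page_vector = [2.0, 0.0]
fused = fuse(session_vectors, page_vector, identity, identity, factor=0.7)
```

With a factor of 1.0 the result depends only on the attended session vectors, and with 0.0 only on the projected page behavior vector, which is why the factor is a natural quantity to learn.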
Preferably, the business demand tracking label characterizes a first dynamic business session segment, namely a segment in the second business log training text that is updated compared with the first business log training text;
The determining the training error according to the business demand tracking label and the training labels of the prior intelligent government affair log set comprises the following steps:
Determining a second dynamic business session segment, namely a segment of the second business log training text whose session topic viewpoint differs from that in the first business log training text, according to the session topic viewpoint corresponding to each business session segment in the first business log training text in the training labels and the session topic viewpoint corresponding to each business session segment in the second business log training text;
Determining a business demand tracking training error according to the first dynamic business session segment and the second dynamic business session segment;
Determining the training error based on the business demand tracking training error.
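As a minimal sketch of this comparison, assume that segments of the two texts are aligned by index and that the tracking error is the fraction of segments on which the predicted and labelled dynamic sets disagree. Both are simplifying assumptions, and the function names are hypothetical.

```python
def labelled_dynamic_segments(first_topics, second_topics):
    """Second dynamic segments: indices whose labelled topic viewpoint differs."""
    return {i for i, (a, b) in enumerate(zip(first_topics, second_topics))
            if a != b}


def tracking_error(predicted_dynamic, labelled_dynamic, n_segments):
    """Fraction of segments where prediction and label disagree."""
    if n_segments == 0:
        return 0.0
    return len(set(predicted_dynamic) ^ set(labelled_dynamic)) / n_segments


first_topics = ["tax registration", "social security", "housing fund"]
second_topics = ["tax registration", "pension transfer", "housing fund"]

labelled = labelled_dynamic_segments(first_topics, second_topics)
predicted = {1, 2}  # first dynamic segments as output by the network (example)
error = tracking_error(predicted, labelled, len(second_topics))
```

A differentiable loss such as cross-entropy over per-segment change probabilities would be needed for gradient-based training; the set difference above only illustrates what is being compared.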
Preferably, the method further comprises:
Performing session topic viewpoint recognition based on the first business log knowledge training vector to obtain the session topic viewpoint recognized for each business session segment in the first business log training text;
Performing session topic viewpoint recognition based on the second business log knowledge training vector to obtain the session topic viewpoint recognized for each business session segment in the second business log training text;
Determining a first topic viewpoint recognition training error based on the session topic viewpoint corresponding to each business session segment in the first business log training text and the session topic viewpoint recognized for each business session segment in the first business log training text;
Determining a second topic viewpoint recognition training error based on the session topic viewpoint corresponding to each business session segment in the second business log training text and the session topic viewpoint recognized for each business session segment in the second business log training text;
The determining the training error based on the business demand tracking training error comprises the following steps:
Integrating the first topic viewpoint recognition training error, the second topic viewpoint recognition training error and the business demand tracking training error to obtain the training error.
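The source does not say how the three errors are integrated. The sketch below assumes a weighted sum, with each recognition error computed as a mean cross-entropy over segments; the weights and function names are hypothetical.

```python
import math


def topic_recognition_error(predicted_probs, labelled_topics):
    """Mean cross-entropy; predicted_probs[i] maps topic -> probability."""
    eps = 1e-12
    total = sum(math.log(probs.get(topic, 0.0) + eps)
                for probs, topic in zip(predicted_probs, labelled_topics))
    return -total / len(labelled_topics)


def total_training_error(first_topic_err, second_topic_err, tracking_err,
                         weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three errors (the weighting is an assumption)."""
    w1, w2, w3 = weights
    return w1 * first_topic_err + w2 * second_topic_err + w3 * tracking_err


first_err = topic_recognition_error(
    [{"tax registration": 0.9, "social security": 0.1}],
    ["tax registration"],
)
second_err = topic_recognition_error(
    [{"tax registration": 0.4, "social security": 0.6}],
    ["social security"],
)
combined = total_training_error(first_err, second_err, tracking_err=0.25)
```

Training the topic recognition and the demand tracking against one combined error lets both tasks shape the shared text representation.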
Preferably, the structured text processing network comprises a log text feature embedding subnet; the first business log knowledge training vector comprises a first log text embedding vector generated by the log text feature embedding subnet for the first business log training text; the first log text embedding vector comprises text description vectors of all business log training text units in the first business log training text; the second business log knowledge training vector comprises a second log text embedding vector generated by the log text feature embedding subnet for the second business log training text; the second log text embedding vector comprises text description vectors of all business log training text units in the second business log training text;
The method further comprises: text paragraph matching is carried out on the basis of text description vectors of all business log training text units in the first business log training text and text description vectors of all business log training text units in the second business log training text, so that text paragraph matching results are obtained, and the text paragraph matching results represent second business log training text units associated with first business log training text units in the first business log training text in the second business log training text;
The determining, according to the session topic viewpoints corresponding to the business session segments in the first business log training text and the session topic viewpoints corresponding to the business session segments in the second business log training text in the training labels, the second dynamic business session segment of the second business log training text whose session topic viewpoint differs from that in the first business log training text comprises:
Determining the content jump state of each business log training text unit in the first business log training text compared with the associated business log training text unit in the second business log training text, according to the distribution characteristics of the first business log training text unit and the distribution characteristics of the second business log training text unit indicated by the text paragraph matching result;
Determining, for each business session segment in the second business log training text, the associated business session segment in the first business log training text, based on the segment distribution characteristics of each business session segment in the second business log training text and the content jump states of the business log training text units in the first business log training text compared with the associated business log training text units in the second business log training text;
For each business session segment in the second business log training text, if the session topic viewpoint corresponding to that segment differs from the session topic viewpoint corresponding to its associated business session segment in the first business log training text, determining that the segment is a second dynamic business session segment.
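The alignment described above can be sketched as follows. The sketch assumes that text units are matched by cosine similarity of their text description vectors, that the content jump state is captured by the resulting index mapping, that a segment is a half-open range of unit indices, and that a second-text segment with no matched unit counts as dynamic. All of these are illustrative assumptions, as are the names and the threshold.

```python
import math
from collections import Counter


def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def match_units(first_units, second_units, threshold=0.8):
    """Map each second-text unit index to its best first-text unit, or None."""
    mapping = {}
    for j, v in enumerate(second_units):
        best_i, best_sim = None, threshold
        for i, u in enumerate(first_units):
            sim = cosine(u, v)
            if sim >= best_sim:
                best_i, best_sim = i, sim
        mapping[j] = best_i
    return mapping


def segment_of(unit_index, segments):
    """Index of the (start, end) range that contains unit_index, if any."""
    for k, (start, end) in enumerate(segments):
        if start <= unit_index < end:
            return k
    return None


def dynamic_segments(mapping, first_segments, second_segments,
                     first_topics, second_topics):
    """Second-text segments whose topic differs from their matched partner."""
    dynamic = []
    for k, (start, end) in enumerate(second_segments):
        votes = Counter()
        for j in range(start, end):
            i = mapping.get(j)
            partner = segment_of(i, first_segments) if i is not None else None
            if partner is not None:
                votes[partner] += 1
        if not votes:
            dynamic.append(k)  # newly inserted content has no partner
            continue
        partner = votes.most_common(1)[0][0]
        if first_topics[partner] != second_topics[k]:
            dynamic.append(k)
    return dynamic


first_units = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
second_units = [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
mapping = match_units(first_units, second_units)
result = dynamic_segments(
    mapping,
    first_segments=[(0, 1), (1, 2)],
    second_segments=[(0, 1), (1, 2), (2, 3)],
    first_topics=["tax registration", "social security"],
    second_topics=["housing fund", "tax registration", "pension transfer"],
)
```

In this example the second text inserts one new unit at the front, so every matched unit is shifted by one position; matching on content rather than position keeps the topic comparison between the right pairs of segments.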
Preferably, the structured text processing network further includes a text knowledge derivative subnet, and the first business log knowledge training vector further includes a first log text derivative knowledge vector generated by the text knowledge derivative subnet based on the first log text embedding vector; the second business log knowledge training vector further comprises a second log text derived knowledge vector generated by the text knowledge derived subnet based on the second log text embedded vector;
The performing session topic viewpoint recognition based on the first business log knowledge training vector to obtain the session topic viewpoint recognized for each business session segment in the first business log training text comprises the following steps: inputting the first log text derived knowledge vector into a topic identification branch to obtain the session topic viewpoint recognized by the topic identification branch for each business session segment in the first business log training text;
The performing session topic viewpoint recognition based on the second business log knowledge training vector to obtain the session topic viewpoint recognized for each business session segment in the second business log training text comprises the following steps: inputting the second log text derived knowledge vector into the topic identification branch to obtain the session topic viewpoint recognized by the topic identification branch for each business session segment in the second business log training text.
Preferably, the carrying out business demand tracking according to the first multi-modal government integrated knowledge and the second multi-modal government integrated knowledge to obtain the business demand tracking label of the second business log training text compared with the first business log training text includes:
Determining a first session preference linear vector according to the first log text derived knowledge vector and the first multi-modal government integrated knowledge, wherein the first session preference linear vector comprises a first viewpoint interest vector of each business session segment in the first business log training text under each of the X session topic viewpoints;
Integrating the first session preference linear vector with the first log text derived knowledge vector to obtain first target multi-modal government integrated knowledge;
Determining a second session preference linear vector according to the second log text derived knowledge vector and the second multi-modal government integrated knowledge, wherein the second session preference linear vector comprises a second viewpoint interest vector of each business session segment in the second business log training text under each of the X session topic viewpoints;
Integrating the second viewpoint interest vector with the second log text derived knowledge vector to obtain second target multi-modal government integrated knowledge;
Combining the first target multi-modal government integrated knowledge with the second target multi-modal government integrated knowledge, and carrying out business demand tracking on the combined result to obtain the business demand tracking label of the second business log training text compared with the first business log training text.
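The source does not specify how the viewpoint interest vectors are computed or how the combined result is scored. Purely as an illustration, the sketch below assumes one prototype vector per session topic viewpoint, a softmax over dot products as the interest vector, concatenation as the integration, index-aligned segments, and a total-variation threshold as the tracking decision. Every name and constant here is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def viewpoint_interest(segment_vec, fused_vec, topic_prototypes):
    """Interest of one segment in each of the X topic viewpoints (sums to 1)."""
    query = [s + f for s, f in zip(segment_vec, fused_vec)]
    return softmax([dot(query, proto) for proto in topic_prototypes])


def target_knowledge(segment_vecs, fused_vec, topic_prototypes):
    """Per segment: derived knowledge vector followed by its interest vector."""
    return [vec + viewpoint_interest(vec, fused_vec, topic_prototypes)
            for vec in segment_vecs]


def track(first_target, second_target, n_topics, threshold=0.5):
    """Flag pairs whose interest distributions differ by more than threshold."""
    flags = []
    for a, b in zip(first_target, second_target):
        ia, ib = a[-n_topics:], b[-n_topics:]
        total_variation = 0.5 * sum(abs(x - y) for x, y in zip(ia, ib))
        flags.append(total_variation > threshold)
    return flags


prototypes = [[5.0, 0.0], [0.0, 5.0]]          # X = 2 topic viewpoints
fused_knowledge = [0.0, 0.0]
first_target = target_knowledge([[1.0, 0.0], [0.0, 1.0]],
                                fused_knowledge, prototypes)
second_target = target_knowledge([[0.0, 1.0], [0.0, 1.0]],
                                 fused_knowledge, prototypes)
changed = track(first_target, second_target, n_topics=len(prototypes))
```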
Preferably, before the acquiring the prior intelligent government affair log set and the training label of the prior intelligent government affair log set, the method further includes:
Acquiring training reference information and a business log highlighting training text for a first paragraph set in the first business log training text; the training reference information is used for prompting the session event on which the first paragraph set focuses during restatement of the business log training text;
Performing, through a log text restatement network, business log training text restatement on the first paragraph set in the first business log training text according to the training reference information, based on the first business log training text, the business log highlighting training text and the training reference information, to obtain the second business log training text;
And generating the prior intelligent government affair log set based on the first business log training text and the second business log training text.
Preferably, before the performing, through the log text restatement network, business log training text restatement on the first paragraph set in the first business log training text to obtain the second business log training text, the method further includes:
Acquiring a historical business log training text, a historical business log highlighting training text for a second paragraph set in the historical business log training text, and historical guidance features, wherein the historical guidance features are used for prompting the historical session event on which the second paragraph set focuses during restatement of the business log training text;
Performing, through the log text restatement network, business log training text restatement on the second paragraph set in the historical business log training text according to the historical guidance features, the historical business log training text and the historical business log highlighting training text, to obtain a historical restated business log training text;
Determining a text restatement error according to the historical business log training text and the historical restated business log training text;
Optimizing network variables of the log text restatement network according to the text restatement error.
Preferably, the determining the text restatement error according to the historical business log training text and the historical restated business log training text includes:
Obtaining a session segment consistency training error based on the training text elements of the second paragraph set in the historical restated business log training text and the training text elements of the second paragraph set in the historical business log training text;
and determining a second training error based on the session segment consistency training error, wherein the session segment consistency training error and the second training error have a set quantization relationship.
Preferably, before the acquiring the prior intelligent government affair log set and the training label of the prior intelligent government affair log set, the method further includes:
optimizing network variables of service session fragments in a third section set in the first service log training text to obtain a second service log training text;
And generating the prior intelligent government affair log set based on the first business log training text and the second business log training text.
Preferably, the method comprises:
Acquiring a preceding business log text and a business log text to be analyzed;
Carrying out business demand tracking based on the preceding business log text and the business log text to be analyzed through the structured text processing network, to obtain a business demand tracking label of the business log text to be analyzed compared with the preceding business log text.
The embodiment of the application provides an NLP log text processing system, which comprises at least one processor and a memory; the memory stores computer-executable instructions; the at least one processor executes the computer-executable instructions stored by the memory, causing the at least one processor to perform the method described above.
Embodiments of the present application provide a computer readable storage medium having stored thereon a computer program which, when run, implements the method described above.
In the embodiment of the application, the intelligent government affair log set and the user behavior data are deeply fused to form a multidimensional, dynamically updated government affair knowledge base. The technique overcomes the single information source and update lag of traditional methods, and can track and predict business demand accurately from real-time user behavior data. Embodiments of the application can therefore respond to user demand more intelligently and improve service efficiency and user satisfaction. The embodiments are also self-optimizing: the network variables of the structured text processing network are adjusted continuously according to the training error during actual operation, so that the network adapts to changing government service requirements. The technical scheme thus provides technical support for the sustainable development of intelligent government affairs.
Drawings
Fig. 1 is a flowchart of a business log text processing method applied to intelligent government affairs according to an embodiment of the present application.
Fig. 2 is a schematic structural diagram of an NLP log text processing system 200 according to an embodiment of the present application.
Detailed Description
To aid understanding of the above technical solutions, they are described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the embodiments of the application and their specific features are detailed descriptions of the technical solutions, not limitations on them, and that the embodiments and their technical features may be combined with each other where no conflict arises.
Fig. 1 shows a business log text processing method applied to intelligent government affairs. The method is applied to an NLP log text processing system and comprises the following steps 101-105.
Step 101, an NLP log text processing system acquires a prior intelligent government affair log set and training labels of the prior intelligent government affair log set, wherein the prior intelligent government affair log set comprises a first business log training text and a second business log training text, and the training labels comprise the session topic viewpoint corresponding to each business session segment in the first business log training text and the session topic viewpoint corresponding to each business session segment in the second business log training text.
In the embodiment of the application, the prior intelligent government affair log set contains a large amount of government system log data. This log data is generated naturally while the government service platform operates, and records users' interactions with the platform, system responses, business processing and similar information. The data is raw and unprocessed, but it contains rich information about user behavior and business requirements, and is valuable for optimizing government services and raising their level of intelligence.
Training labels are manual or automatic annotations on the data in the prior intelligent government affair log set. Important pieces of information in the log data, such as business session segments, may be annotated with specific labels to facilitate training of the machine learning model. In this scenario, the training labels refer specifically to the session topic viewpoint annotated for each business session segment, that is, the topic or core viewpoint of each session is explicitly identified.
The first business log training text is a subset of the prior intelligent government affair log set and mainly contains log data for one type of business or scenario. The data is used as training text so that the machine learning model learns to identify the specific patterns and regularities of that business or scenario. The second business log training text is likewise a subset of the prior intelligent government affair log set, but represents log data for another type of business or scenario. It is also used as training text, helping the model learn and identify features and behavior patterns across different businesses or scenarios.
In a government service platform, each interaction or dialogue of a user with the system can be considered as a business session segment. A business session segment may include a series of sequential actions and information exchanges such as a user's query, a system's response, a user's feedback, etc. These session fragments are important bases for analyzing user needs and optimizing system services.
In each business session segment, users may communicate and interact around one or more core topics or views. These core topics or views can be understood as session topic views, which are the essence and main content of session segments, and have important significance for understanding user needs and optimizing government services.
In detail, step 101 is the start of the entire workflow of the NLP log text processing system and serves as the data preparation phase. In step 101, the NLP log text processing system first obtains a large prior intelligent government affair log set from the intelligent government system. The log set consists of data records generated naturally by the government service platform in daily operation, and contains rich information about user behavior and business requirements. After obtaining the log set, the system also obtains the corresponding training labels. The training labels are manual or automatic annotations of the session topic viewpoint corresponding to each business session segment, produced by a data scientist or business expert according to the log content. This label information is critical to subsequent model training because it provides the learning target for the model. Within the prior intelligent government affair log set, the system focuses on two business log training texts: the first business log training text and the second business log training text. The two training texts represent different business scenarios or user behavior patterns and are important inputs for subsequent model training. By comparing and analyzing the two training texts, the system can understand user demand more comprehensively and raise the level of intelligence of government services.
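One way to picture the data acquired in step 101 is a small container type. The sketch below is illustrative only: the class names, field names and the rule that every label must come from a fixed vocabulary of more than one session topic viewpoint are assumptions, not structures defined by the application.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SessionSegment:
    text: str             # raw log text of one user/system exchange
    topic_viewpoint: str  # training label for this segment


@dataclass
class PriorLogSet:
    first_text: List[SessionSegment]
    second_text: List[SessionSegment]
    topic_viewpoints: Tuple[str, ...]  # label vocabulary (hypothetical)

    def validate(self) -> None:
        """Check that every segment carries a label from the vocabulary."""
        if len(self.topic_viewpoints) < 2:
            raise ValueError("need more than one topic viewpoint")
        for segment in self.first_text + self.second_text:
            if segment.topic_viewpoint not in self.topic_viewpoints:
                raise ValueError(f"unknown label: {segment.topic_viewpoint!r}")


log_set = PriorLogSet(
    first_text=[SessionSegment("How do I register for tax?",
                               "tax registration")],
    second_text=[SessionSegment("How do I transfer my social security?",
                                "social security")],
    topic_viewpoints=("tax registration", "social security"),
)
log_set.validate()
```

Validating labels at load time keeps annotation mistakes from surfacing later as silent training noise.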
Step 102, the NLP log text processing system discovers a first business log knowledge training vector of the first business log training text and discovers a second business log knowledge training vector of the second business log training text through a structured text processing network.
In the embodiment of the application, the first business log knowledge training vector is a feature vector extracted from the first business log training text through a machine learning technology. This vector captures the key information and context in the text, converting it into a mathematical form that the computer can understand and operate on. The first service log knowledge training vector is a deep and structured representation of the first service log data, contains important characteristics such as user behavior, demand, preference and the like in the service scene, and has key effects on subsequent tasks such as service demand tracking, user behavior prediction and the like. Similar to the first business log knowledge training vector, the second business log knowledge training vector is a feature vector extracted from the second business log training text. The vector also captures core information and characteristics in the second class of business log data, and provides a data basis for subsequent multi-modal government integrated knowledge generation and business demand tracking. The second service log knowledge training vector reflects the characteristics of user behavior, requirements and the like in the second class of service scenes, and the dynamic change of the user requirements and the difference of the service scenes can be more comprehensively understood through comparison and analysis with the first service log knowledge training vector.
For example, there is a first business log training text whose content mainly consults business processes about tax registration around the user. A first business log knowledge training vector can be obtained through the processing of the structured text processing network. The first business log knowledge training vector may comprise a plurality of dimensions, each dimension representing a different feature in the text. The first traffic log knowledge training vector may be exemplary: [0.8,0.4,0.2,0.6,...]. Each value in the first business log knowledge training vector represents the importance or strength of a particular feature in the text. For example, a first value of 0.8 may characterize the frequency or importance of occurrence of the keyword "tax registration" in the text; a second value of 0.4 may characterize the urgency of a user query; a third value of 0.2 may reflect the complexity of the specific tax issue mentioned in the text; and a fourth value of 0.6 may indicate the familiarity of the user with the business process. The specific numerical values of the feature vectors are obtained by analyzing and learning a large number of first business log training texts through a structured text processing network. The network converts the information into a numerical form by identifying keywords, phrases and context in the text, thereby forming a first business log knowledge training vector.
Similarly, a second business log knowledge training vector may be obtained for the second business log training text. Suppose the second business log training text mainly concerns user inquiries about social security services. The second business log knowledge training vector may be, for example: [0.3, 0.7, 0.5, 0.2, ...]. Here the first value, 0.3, may represent the frequency of the keyword "social security" in the text; the second value, 0.7, may reflect the user's level of attention to the service; the third value, 0.5, may represent the diversity of the specific questions raised in the text; and the fourth value, 0.2, may characterize the user's satisfaction with the service flow. The specific values are likewise obtained by the structured text processing network through analyzing and learning from the second business log training texts.
It can be seen that the first business log knowledge training vector and the second business log knowledge training vector are deep, structured representations of the original log text. Through their different values they reflect the various characteristics and key information in the text, and they support subsequent data processing, user behavior prediction and business demand tracking. Generating and using business log knowledge training vectors is one of the important applications of NLP technology in the field of intelligent government affairs. It should be noted that the feature vector values given above are only examples; vectors in practical applications may be more complex and diverse. Furthermore, the specific meaning and interpretation of these vectors must be determined according to the actual application scenario and business requirements.
In detail, step 102 mines knowledge training vectors from the first and second business log training texts through the structured text processing network. In this step, the NLP log text processing system first performs an in-depth analysis of the first business log training text. Through the structured text processing network, the system identifies key information in the text, such as user behavior, business demands and timestamps, and converts it into a numerical vector, namely the first business log knowledge training vector. This vector not only captures the surface information of the text but also mines the semantic and contextual relations behind it, providing a rich feature basis for subsequent data processing and business demand tracking. The system processes the second business log training text in the same way, extracting a second business log knowledge training vector through the structured text processing network. This vector reflects key characteristics, such as user behavior and demands, of the second class of business scenarios and is an important basis for subsequent business demand tracking and comparative analysis. The structured text processing network plays the central role in this mining process: using deep learning, it learns from a large amount of training data to extract key features from text. This allows the system to mine valuable information from log text automatically and accurately, supporting subsequent optimization of government services. In this way, the corresponding knowledge training vectors are mined from the two types of business log training texts, laying the foundation for subsequent multi-modal government integrated knowledge generation and business demand tracking.
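As a rough illustration of how log text can be turned into a fixed-length numerical vector, the following sketch uses a simple keyword-frequency encoder. It is only a stand-in: the structured text processing network described here is a learned deep model, and the keyword list `KEYWORDS` and the function `encode_log_text` are hypothetical names introduced for this example.

```python
import re
from collections import Counter

# Hypothetical keyword list; a learned network would derive its own features.
KEYWORDS = ["tax", "registration", "urgent", "process"]

def encode_log_text(text: str) -> list[float]:
    """Map a log text to a vector of relative keyword frequencies."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[keyword] / total for keyword in KEYWORDS]

# Each dimension corresponds to one keyword in KEYWORDS.
vector = encode_log_text(
    "Tax registration process: how do I start tax registration?"
)
```

A learned encoder would replace the hand-picked keywords with features discovered from the training texts, but the output has the same shape: one number per feature dimension.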
Step 103, integrating a page behavior vector of page user behavior data with the first business log knowledge training vector through the structured text processing network by the NLP log text processing system to obtain first multi-mode government affair integration knowledge; the page user behavior data includes X conversation topic views, X is a total number of conversation topic views, and X is an integer greater than 1.
In the embodiment of the application, the page user behavior data refers to data generated while users interact with the pages of the government service platform; it reflects users' behavior habits, interests, preferences and demands. Such data includes, but is not limited to, records of the user clicking on page elements (e.g., buttons, links), the user's dwell time on a page, the speed and depth of page scrolling, and search keywords. With this data, the real needs and intentions of users can be understood, so that the design and provision of government services can be optimized.
A page behavior vector is a numerical representation that converts page user behavior data into a measurable, comparable form. Such a vector may be made up of a plurality of feature values, each representing some aspect of the user's behavior. For example, a page behavior vector may contain feature values such as the duration of the user's page visit, the number of clicks and the scrolling depth, that is: [0.7, 0.3, 0.5, ...], where 0.7 may represent the user's average dwell time on the page (normalized), 0.3 the user's average number of clicks, 0.5 the user's page scroll depth ratio, and so on.
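A minimal sketch of such a conversion is given below, assuming simple cap-and-scale normalization. The function name `page_behavior_vector` and the caps `max_dwell` and `max_clicks` are hypothetical; the description above does not specify how the feature values are normalized.

```python
def clip01(value: float) -> float:
    """Clamp a value into the interval [0, 1]."""
    return max(0.0, min(1.0, value))

def page_behavior_vector(dwell_seconds, clicks, scroll_depth,
                         max_dwell=600.0, max_clicks=50.0):
    """Scale raw page metrics into [0, 1].

    max_dwell and max_clicks are assumed caps chosen for illustration;
    scroll_depth is assumed to already be a ratio in [0, 1].
    """
    return [
        clip01(dwell_seconds / max_dwell),
        clip01(clicks / max_clicks),
        clip01(scroll_depth),
    ]

# 420 s of dwell time, 15 clicks and a scroll depth of 50 percent.
vec = page_behavior_vector(dwell_seconds=420, clicks=15, scroll_depth=0.5)
```

With these assumed caps, the inputs map to approximately [0.7, 0.3, 0.5], the same shape as the example vector above.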
The first multi-modal government integrated knowledge is a composite knowledge representation obtained by integrating the page behavior vector of the page user behavior data with the first business log knowledge training vector. The integrated knowledge contains not only the information extracted from the business log but also the user's actual behavior data on the page, thereby providing a more comprehensive and richer understanding of the user's needs and intentions, for example: [0.6, 0.8, 0.4, 0.9, ...], whose values collectively reflect the user's overall behavior and needs on the government service platform.
Based on this, step 103 is a key fusion step in the NLP log text processing system, which involves the efficient integration of page user behavior data with business log knowledge. In this step, the NLP log text processing system first gathers and collates page user behavior data. The data comprises various operation behaviors of the user on the government service platform, such as clicking, browsing, searching and the like, and each operation is accurately recorded and converted into a page behavior vector. This vector is a numerical representation of the user's behavior that captures every minute action of the user on the page and is presented in the form of a feature value. Next, the system integrates this page behavior vector with the first business log knowledge training vector previously extracted from the first business log. This integrated process is implemented by a specific algorithm that can fuse two different types of data together to form a new, more comprehensive knowledge representation. The integrated knowledge not only contains text information in the service log, but also integrates actual operation data of the user on the page, thereby providing more accurate and comprehensive basis for user behavior analysis and service demand prediction. In addition, the page user behavior data also comprises a plurality of session theme views, and the views are different views and requirements expressed by the user in the session process. The NLP log text processing system analyzes the session topic views one by one, extracts key information in the session topic views, combines the key information with a page behavior vector and a business log knowledge vector, and further enriches and perfects the content of the first multi-mode government integrated knowledge. 
Therefore, by integrating the page behavior vector of the page user behavior data with the first business log knowledge training vector, a more comprehensive and deeper user demand and behavior understanding is obtained: first multimodal government integrated knowledge. The integrated knowledge provides powerful data support and decision basis for subsequent user behavior prediction and business demand tracking.
Step 104, the NLP log text processing system integrates the page behavior vector with the second business log knowledge training vector through the structured text processing network to obtain second multi-modal government integrated knowledge.
In the embodiment of the application, the second multi-modal government integrated knowledge refers to composite knowledge formed by combining the page user behavior data with the second business log knowledge through a specific integration method. This knowledge contains not only the user's behavior characteristics on the government service page but also the key information and contextual relations in the second business log. In this way, the second multi-modal government integrated knowledge can more fully reflect the needs and preferences of the user in a specific business scenario, for example [0.85, 0.6, 0.7, 0.3, ...], where the values represent features of different dimensions, such as the activity level of the user's behavior, the degree of attention to certain types of information, the depth of interaction, and so on.
In detail, the goal of step 104 is to integrate the page user behavior data with the second business log knowledge training vector, thereby generating the second multi-modal government integrated knowledge. In this step, the system first retrieves the page behavior vector previously extracted from the page user behavior data. This vector records in detail the user's behavior patterns on the government service page, such as click habits, browsing order and dwell time, and is an important basis for understanding how the user interacts with the page. Next, the system retrieves the second business log knowledge training vector previously mined from the second business log through the structured text processing network. This vector contains deep features of the user's demands and key business information in the second business scenario, and is important for understanding the user's behavior and demands in specific government services. Then, the NLP log text processing system integrates the page behavior vector with the second business log knowledge training vector by means of an algorithm and model. This process is not a simple concatenation of data; rather, deep learning or other data analysis techniques are used to mine the potential links and complementary information between the two, forming a new knowledge representation that fuses both data sources: the second multi-modal government integrated knowledge. This integrated knowledge enriches what could be learned from user behavior alone and provides a wider view of the user's real intent and potential needs in government services. For example, by analyzing the second multi-modal government integrated knowledge, a government service organization can more precisely identify where services should be improved, raising user experience and satisfaction.
Therefore, the page behavior vector and the second business log knowledge training vector are integrated through the structured text processing network, and second multi-modal government affair integration knowledge is generated. This step is significant in improving the level of intellectualization and personalization of government services.
Step 105, the NLP log text processing system performs service demand tracking according to the first multi-mode government integrated knowledge and the second multi-mode government integrated knowledge through the structured text processing network to obtain a service demand tracking label of the second service log training text compared with the first service log training text; determining a training error according to the service demand tracking label and the training label of the prior intelligent government log set; and optimizing network variables of the structured text processing network according to the training errors.
In the embodiment of the application, the business requirement tracking tag refers to a tag which is set for identifying and tracking the specific business requirement of a user in a government service flow. The labels can be generated based on the behavior data, log text and other multi-mode information of the user, and are helpful for government institutions to quickly locate and respond to the service demands of the user. For example, when a user consults a question about tax registration via a government service platform, the system may generate a business need tracking tag of "tax registration consultation" for the user.
A structured text processing network is a deep learning model dedicated to processing and analyzing structured text data. It can understand the semantic structure of text, extract key information, and convert this information into a numerical representation that a computer can process. In government services, the structured text processing network is used to parse users' queries, requests or feedback, thereby helping government authorities respond more effectively to user needs.
Training errors refer to the difference between the model predicted result and the actual result during the training of the machine learning model. In the field of NLP (natural language processing), training errors can be computed by comparing the output of a model with the true annotation data. By reducing the training error, the performance of the model can be optimized, and the prediction accuracy of the model can be improved.
Network variables of a structured text processing network refer to parameters that need to be learned in a network model. These parameters are adjusted during training by an optimization algorithm to minimize the difference between model predictions and actual results. The optimization of network variables is a core step of machine learning model training, and directly affects the performance and accuracy of the model.
In detail, step 105 performs business demand tracking with the first multi-modal government integrated knowledge and the second multi-modal government integrated knowledge through the structured text processing network, and optimizes the network model. First, the system uses the structured text processing network to analyze the first and second multi-modal government integrated knowledge in depth. This integrated knowledge combines user behavior data with business log knowledge and provides a rich information basis for business demand tracking. The system then tracks business demands based on the integrated knowledge. By comparing the differences and commonalities between the first business log training text and the second business log training text, the system identifies specific changes in user demand in government services and generates corresponding business demand tracking tags. These tags reflect not only the user's current demands but also the trend of those demands. Next, the system determines the training error according to the business demand tracking tags and the training labels of the prior intelligent government log set. The training error is an important indicator of the model's prediction accuracy: by comparing the model's predictions with the actual annotated data, the system calculates the magnitude of the training error and thereby evaluates the model's performance. Finally, the system optimizes the network variables of the structured text processing network based on the training error. This is an iterative process in which the system reduces the training error by continuously adjusting the network variables, improving the prediction accuracy of the model. The optimized model can better understand and respond to users' business demands, improving the efficiency and quality of government services.
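The iterative loop of predicting, measuring the training error and adjusting the network variables can be sketched as follows. This is a toy illustration only: it assumes a single-layer logistic model, a binary label per segment (1 for "demand changed", 0 otherwise), binary cross-entropy as the training error, and plain gradient descent. None of these choices is specified in the description, and the sample data is invented for the example.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    """Predicted probability that a segment reflects a demand change."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def training_error(weights, samples):
    """Mean binary cross-entropy between predictions and training labels."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for features, label in samples:
        p = predict(weights, features)
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(samples)

def optimize(weights, samples, lr=0.5, steps=200):
    """Adjust the weights (the 'network variables') by gradient descent."""
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for features, label in samples:
            err = predict(weights, features) - label
            for i, x in enumerate(features):
                grads[i] += err * x / len(samples)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights

# Invented integrated-knowledge features (last entry is a bias term) and labels.
samples = [([0.9, 0.1, 1.0], 1), ([0.2, 0.8, 1.0], 0),
           ([0.8, 0.3, 1.0], 1), ([0.1, 0.7, 1.0], 0)]
initial = [0.0, 0.0, 0.0]
before = training_error(initial, samples)
trained = optimize(initial, samples)
after = training_error(trained, samples)
```

On this toy data the training error after optimization is lower than before, which is the behavior step 105 relies on; a real system would use a deeper model and an optimizer suited to it.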
In this way, step 105 performs business demand tracking and model optimization through the structured text processing network, and provides a more accurate and efficient user demand response mechanism for government service.
Building on the above, the overall process is explained below in the context of an intelligent government affairs scenario, in which the NLP log text processing system raises the intelligence level of government services by processing and analyzing government log sets.
First, the NLP log text processing system obtains a large prior intelligent government log set from the intelligent government log text processing system. The prior intelligent government log set mainly comprises two types of business log training texts: a first business log training text and a second business log training text. The NLP log text processing system also acquires the training labels corresponding to the first and second business log training texts; these labels record in detail the conversation topic view corresponding to each business session segment.
Next, the NLP log text processing system uses a built-in structured text processing network to deeply mine the first business log training text and extract a first business log knowledge training vector. Similarly, the NLP log text processing system also performs the same processing on the second business log training text to obtain a second business log knowledge training vector. The first business log knowledge training vector and the second business log knowledge training vector are the basis for generating the subsequent multi-modal government integrated knowledge.
After these training vectors are obtained, the NLP log text processing system begins processing page user behavior data. The page user behavior data includes various behavior information of the user on the government service platform, such as clicking, browsing, searching and the like, and is converted into page behavior vectors. The NLP log text processing system integrates the page behavior vectors and the first business log knowledge training vector which is mined previously through a structured text processing network, and first multi-mode government integrated knowledge is generated. The first multi-mode government affair integration knowledge integrates page behavior data and government affair business log information of the user, and provides a rich data basis for follow-up business demand tracking.
Similarly, the NLP log text processing system also integrates the page behavior vector with the second business log knowledge training vector to generate second multi-modal government integrated knowledge. The second multi-modal government affair integration knowledge further enriches the data storage of the NLP log text processing system and provides powerful support for accurately tracking the user demands.
And finally, the NLP log text processing system performs service demand tracking according to the generated first multi-mode government integrated knowledge and second multi-mode government integrated knowledge by utilizing a structured text processing network. The NLP log text processing system obtains a service demand tracking label by comparing and analyzing the difference between the second service log training text and the first service log training text. The business need tracking tag accurately reflects the dynamic changes in user needs. By combining training labels of the priori intelligent government log sets, the NLP log text processing system can determine training errors and optimize network variables of the structured text processing network accordingly, so that processing performance and accuracy of the NLP log text processing system are improved.
In the whole process, the NLP log text processing system provides powerful technical support for intelligent government affairs through efficient data processing and accurate demand tracking. The intelligent level of the government service is improved, and the office efficiency and the user satisfaction of the government departments are also greatly improved.
According to the embodiment of the application, the business knowledge in the intelligent government affair log set is deeply mined, page user behavior data is innovatively combined, and multi-mode government affair integrated knowledge is formed. The fusion not only enriches the dimension of the data, but also improves the accuracy and flexibility of service demand tracking. Specifically, the embodiment of the application utilizes a structured text processing network to extract corresponding knowledge training vectors from a first business log training text and a second business log training text, and then integrates the knowledge training vectors with page behavior vectors of page user behavior data to construct first multi-mode government integrated knowledge and second multi-mode government integrated knowledge. In this way, the embodiment of the application not only captures the text information in the service log, but also integrates the real-time data of the user behavior, so that the service demand tracking is more comprehensive and deeper.
In addition, the embodiment of the application generates the business demand tracking label by comparing the business demand difference between different business log training texts, thereby providing powerful data support for optimizing government service. By combining training labels of the priori intelligent government log sets, the embodiment of the application can accurately determine training errors, thereby optimizing network variables of the structured text processing network and improving self-learning capacity and adaptability of the whole system. The self-optimizing mechanism enables the embodiment of the application to continuously provide an efficient and accurate solution when facing to the complex and changeable government service demands.
In some optional embodiments, the first service log knowledge training vector includes a service session attribute vector for each service session segment in the first service log training text; the integrating the page behavior vector of the page user behavior data with the first business log knowledge training vector to obtain a first multi-modal government integrated knowledge comprises the following steps: integrating each business session attribute vector in the first business log knowledge training vector with the page behavior vector based on a linkage characteristic focusing strategy to obtain a first linkage integrated vector; projecting the first linkage integration vector to a target knowledge vector coordinate system to obtain a second linkage integration vector; projecting the page behavior vector to the target knowledge vector coordinate system to obtain a first page behavior vector; integrating the second linkage integration vector and the first page behavior vector based on the feature integration factor to obtain first multi-modal government affair integration knowledge; the network variables of the structured text processing network include the feature integration factor.
Based on the embodiment, the NLP log text processing system will perform deep processing on the intelligent government affair log data to form multi-modal government affair integration knowledge. The core of this process is how to efficiently integrate page user behavior data with business log knowledge training vectors.
First, the NLP log text processing system obtains a first business log knowledge training vector, which is composed of the business session attribute vectors of the respective business session segments in the first business log training text. For example, if the first business log training text contains three business session segments, each with a corresponding business session attribute vector, the first business log knowledge training vector can be represented as the set of these three vectors.
Next, the NLP log text processing system needs to integrate these business session attribute vectors with the page behavior vectors of the page user behavior data. A method called "linkage feature focus strategy" is employed herein. Specifically, the system performs linkage processing on each service session attribute vector and the page behavior vector to form a new vector, namely a first linkage integration vector. The vector contains attribute information of the service session and integrates the characteristics of user behaviors.
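The description does not define the linkage feature focusing strategy in detail. One plausible reading, used here purely as an assumption, is an attention-style pooling in which the page behavior vector acts as a query over the business session attribute vectors. The function name `linkage_focus` and the sample vectors are hypothetical.

```python
import math

def linkage_focus(session_vectors, page_vector):
    """Attention-style pooling (an assumed interpretation).

    Each session vector is weighted by the softmax of its dot product
    with the page behavior vector; the weighted sum is returned together
    with the weights.
    """
    scores = [sum(s * p for s, p in zip(vec, page_vector))
              for vec in session_vectors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(page_vector)
    fused = [sum(w * vec[i] for w, vec in zip(weights, session_vectors))
             for i in range(dim)]
    return fused, weights

# Three invented business session attribute vectors and one page behavior vector.
session_vectors = [[0.5, 0.3, 0.2], [0.1, 0.2, 0.7], [0.3, 0.6, 0.1]]
page_vector = [0.4, 0.4, 0.2]
fused, weights = linkage_focus(session_vectors, page_vector)
```

Under this reading, session segments whose attributes align most closely with the observed page behavior contribute most to the first linkage integration vector.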
To further improve the accuracy and usability of the integrated knowledge, the NLP log text processing system also performs a series of vector projections and integration operations. Firstly, the system projects the first linkage integration vector into a preset target knowledge vector coordinate system to obtain a second linkage integration vector. At the same time, the page behavior vector is projected into this coordinate system, forming a first page behavior vector. The purpose of this is to allow the two vectors to be compared and integrated in the same dimensional space.
Finally, the NLP log text processing system integrates the second linkage integration vector and the first page behavior vector by utilizing a feature integration factor, so that first multi-mode government affair integration knowledge is obtained. This feature integration factor is an important network variable of the structured text processing network that can be adaptively adjusted according to errors in the training process to ensure optimality of the integrated knowledge.
To illustrate this process more intuitively, consider a specific numerical example. Suppose a business session attribute vector is [0.5, 0.3, 0.2], the page behavior vector is [0.4, 0.4, 0.2], and the feature integration factor is 0.7. After processing with the linkage feature focusing strategy, the first linkage integration vector may be [0.6, 0.4, 0.3]. This vector and the page behavior vector are then projected into the target knowledge vector coordinate system, yielding a second linkage integration vector [0.7, 0.5, 0.4] and a first page behavior vector [0.5, 0.5, 0.3]. Finally, integration using the feature integration factor yields the first multi-modal government integrated knowledge [0.64, 0.52, 0.37].
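The final integration step can be sketched as below, assuming that the feature integration factor weights the two projected vectors as a component-wise convex combination. The description does not state the exact integration rule, so the function `integrate` and this formula are assumptions.

```python
def integrate(linkage_vec, page_vec, factor):
    """Assumed rule: factor * linkage + (1 - factor) * page, per component."""
    return [factor * a + (1.0 - factor) * b
            for a, b in zip(linkage_vec, page_vec)]

second_linkage = [0.7, 0.5, 0.4]   # second linkage integration vector
first_page = [0.5, 0.5, 0.3]       # first page behavior vector
knowledge = [round(v, 2) for v in integrate(second_linkage, first_page, 0.7)]
```

Under this assumed rule the result is approximately [0.64, 0.50, 0.37]. The first and third components agree with the illustrative values in the example above, while the second differs (0.50 versus 0.52), which suggests that the example is either only illustrative or uses a rule other than a strict convex combination.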
Therefore, the NLP log text processing system can achieve more accurate and comprehensive business demand tracking, and further improves the efficiency and quality of government service. Specifically, the system forms multi-mode government affair integrated knowledge by deeply mining business knowledge in the intelligent government affair log set and combining page user behavior data. The integrated knowledge not only contains text information in the service log, but also integrates real-time data of user behaviors, so that service demand tracking is more accurate and timely. In addition, the system also has self-optimization capability, and can continuously adjust and optimize network variables of the structured text processing network according to training errors in actual operation so as to adapt to continuous changes of government service requirements. Therefore, the technical scheme provides powerful technical support for the sustainable development of intelligent government affairs, and has remarkable beneficial effects.
In some embodiments that follow, the business demand tracking tag characterizes a first dynamic business session segment of the second business log training text that has been updated compared with the first business log training text; determining a training error according to the business demand tracking tag and the training labels of the prior intelligent government log set comprises: determining, according to the conversation topic view corresponding to each business session segment in the first business log training text and the conversation topic view corresponding to each business session segment in the second business log training text, both recorded in the training labels, a second dynamic business session segment of the second business log training text whose conversation topic view differs from that of the first business log training text; determining a business demand tracking training error according to the first dynamic business session segment and the second dynamic business session segment; and determining the training error based on the business demand tracking training error.
In this embodiment, the NLP log text processing system tracks business demand and determines the training error by comparing the conversation topic views in different business log training texts. This process is critical to optimizing the tracking performance of the system and improving the efficiency of government services.
Firstly, an NLP log text processing system acquires a priori intelligent government log set, wherein the priori intelligent government log set comprises a first business log training text and a second business log training text. The two texts respectively record government service session data in different time periods. The system has annotated these data, including the conversational topic views to which each business conversation fragment corresponds.
Next, the system uses the business need tracking tag to identify a first dynamic business session segment in the second business log training text for which an update exists with respect to the first business log training text. These dynamic business session segments reflect changes and updates in government service requirements.
In order to determine a training error, the NLP log text processing system first finds out second dynamic business session fragments with different session topic views between a first business log training text and a second business log training text according to the session topic views of the first business log training text and the second business log training text in a training label. These different session topic perspectives may represent changes in user demand, policy updates, or adjustments to service flows.
For example, in the first business log training text, the conversation topic view of a certain session segment is "query social security payment records", while in the second business log training text the conversation topic view of the corresponding session segment becomes "apply for social security payment". The session segment whose conversation topic view differs in this way constitutes a second dynamic business session segment.
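The comparison of labeled conversation topic views can be sketched as follows, assuming that the session segments of the two training texts are aligned by a shared segment identifier. The identifiers, the dictionary layout and the function `changed_segments` are hypothetical.

```python
def changed_segments(first_views, second_views):
    """Return IDs of segments whose topic view differs between the two texts.

    Segments that appear only in the second text are also reported,
    because their topic view has no counterpart in the first text.
    """
    return [segment for segment, view in second_views.items()
            if first_views.get(segment) != view]

# Training labels: segment ID -> conversation topic view (invented data).
first_views = {
    "seg-1": "query social security payment records",
    "seg-2": "tax registration",
}
second_views = {
    "seg-1": "apply for social security payment",
    "seg-2": "tax registration",
}
second_dynamic = changed_segments(first_views, second_views)
```

Here only `seg-1` is reported, since its topic view changed while that of `seg-2` stayed the same.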
Then, the system combines the first dynamic service session segment and the second dynamic service session segment to determine a training error for service demand tracking. This training error reflects the accuracy and response speed of the system in tracking changes in traffic demand. Specifically, the system compares the degree of difference between the two dynamic service session segments and calculates the magnitude of the training error accordingly.
Finally, based on this business requirement tracking training error, the NLP log text processing system can determine the overall training error. This training error will be used to optimize the network variables of the structured text processing network, thereby improving the performance of the system in future business need tracking.
To more intuitively illustrate this process, a specific example may be given. When the system tracks business demands, the conversation topic view of one conversation segment in the first business log training text is found to be 'transacting residence', and the conversation topic view of the corresponding conversation segment in the second business log training text becomes 'updating residence information'. The system determines a second dynamic traffic session segment by comparing the difference in the views of the session topics. Then, the system combines the second dynamic service session segment and the corresponding first dynamic service session segment to calculate the training error of service demand tracking. Finally, the system uses this training error to optimize the network variables to better track future traffic demand changes.
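One simple way to turn the two sets of dynamic business session segments into an error value is sketched below, assuming the error is one minus the Jaccard overlap between the segments flagged by the model (first dynamic segments) and those derived from the training labels (second dynamic segments). This measure and the function `tracking_error` are assumptions; a trainable network would typically use a differentiable loss instead.

```python
def tracking_error(predicted, reference):
    """1 - Jaccard overlap between predicted and label-derived segments."""
    predicted, reference = set(predicted), set(reference)
    union = predicted | reference
    if not union:
        return 0.0  # nothing flagged and nothing labeled: full agreement
    return 1.0 - len(predicted & reference) / len(union)

# The model flags two segments; the labels confirm only one of them.
error = tracking_error(["seg-1", "seg-3"], ["seg-1"])
```

In this example the error is 0.5: one of the two flagged segments is confirmed by the labels, and the error falls to 0 when the two sets coincide.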
Therefore, the NLP log text processing system can more accurately track the business demand change of the government service, so that the demands of users can be responded and met in time. The system determines the training error of the business demand tracking by comparing the conversational topic views in the training texts of different business logs, and optimizes the network variable by using the error. The tracking performance of the system is improved, and powerful data support is provided for the continuous improvement of government service. Therefore, the technical scheme has remarkable beneficial effects in improving the efficiency and the quality of government service.
In some independent embodiments, determining the service demand tracking training error from the first dynamic service session segment and the second dynamic service session segment comprises:
(1) Determining the difference between the dynamic service session segments: text comparison is performed on the first dynamic service session segment and the second dynamic service session segment, and the content differences between them are identified, including but not limited to changes in vocabulary, semantics and context.
(2) Calculating the similarity of the session topic views: the session topic views in the first dynamic service session segment and the second dynamic service session segment are extracted, and the degree of similarity between the two session topic views is quantified with a text similarity algorithm such as cosine similarity, Jaccard similarity, or a similarity measure based on word embeddings.
(3) Determining a service demand change index: a threshold is set on the similarity calculation result, and the service demand is considered to have changed significantly when the similarity is lower than the threshold. Session segments below the threshold are marked as service demand change points, and the type and degree of the change are recorded.
(4) Calculating the service demand tracking training error: each session segment marked as a service demand change point is assigned an error weight according to the type and degree of the change. A comprehensive service demand tracking training error value is calculated by combining the error weights and the similarity scores. This value reflects the accuracy and sensitivity of the system in tracking changes in service demand.
(5) Optimizing and adjusting: the parameters of the NLP log text processing system are adjusted according to the service demand tracking training error so as to improve its ability to track changes in service demand. A machine learning optimization algorithm, such as gradient descent, may be employed to minimize the training error and update the model parameters of the system.
In this way, by calculating the difference and the similarity between the first dynamic service session segment and the second dynamic service session segment, the service demand tracking training error is determined, which provides a clear direction and a quantitative index for optimizing the NLP log text processing system. The system can thereby adapt more intelligently to rapid changes in government service demands, improving service efficiency and user satisfaction.
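Steps (2) to (4) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the embodiment: the bag-of-words tokenization, the threshold value, the change-type names and the choice of a weighted mean of (1 - similarity) as the error are all hypothetical, since the text only says that error weights and similarity scores are combined.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the bag-of-words counts of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0


def tracking_error(pairs, threshold=0.5, weights=None):
    """pairs: list of (first_topic_view, second_topic_view, change_type).

    Pairs whose similarity is below `threshold` are marked as change points.
    The error is the weighted mean of (1 - similarity) over the change points
    (an assumed formula). Returns (error, change_points).
    """
    weights = weights or {}
    change_points = []
    weighted_sum = 0.0
    weight_total = 0.0
    for idx, (first, second, change_type) in enumerate(pairs):
        sim = cosine_similarity(first, second)
        if sim < threshold:
            w = weights.get(change_type, 1.0)  # hypothetical per-type weight
            change_points.append((idx, change_type, sim))
            weighted_sum += w * (1.0 - sim)
            weight_total += w
    error = weighted_sum / weight_total if weight_total else 0.0
    return error, change_points
```

With a threshold of 0.7, the pair "query social security payment records" and "apply for social security payment" shares three of five words on each side, so its cosine similarity is 0.6 and it is marked as a change point. Swapping in `jaccard_similarity` or an embedding-based measure only changes the `sim` line.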
In other independent embodiments, determining the training error based on the service demand tracking training error comprises:
(1) Collecting service demand tracking training errors: continuously monitoring and recording service demand tracking training errors generated by the NLP log text processing system when tracking service demands in a plurality of training periods;
(2) Analyzing the distribution and trend of training errors: carrying out statistical analysis on the collected business demand tracking training errors, wherein the statistical analysis comprises calculation of an average value statistic, a standard deviation statistic, a maximum value statistic and a minimum value statistic of the errors; analyzing the distribution condition of the training errors, and identifying the service field or session type with larger errors and the change trend of the errors along with time.
(3) Weighting the service demand tracking training errors: different weights are assigned to the different types of service demand tracking training errors according to the importance of the service field, the frequency of the session type and the historical error data; a comprehensive weighted training error is calculated using a weighted average.
(4) Determining the training error: the comprehensive weighted training error and the auxiliary training errors (such as text classification errors and entity recognition errors) are analyzed together; according to the degree to which each error affects system performance, the training error is calculated as a weighted average or a weighted mean squared error.
Therefore, by weighting the service demand tracking training errors and also taking other related errors into account, the overall training error is determined, which provides a comprehensive error evaluation method for optimizing the NLP log text processing system. The method considers not only the accuracy of service demand tracking but also the performance of the system on other tasks, which helps to optimize and improve the system more comprehensively.
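The statistics of step (2) and the two weighting stages of steps (3) and (4) can be sketched as below. This is only an illustration: the function names, the weight values and the convention of passing the auxiliary errors as (error, weight) pairs are assumptions that do not appear in the text.

```python
import statistics


def summarize(errors):
    """Mean, population standard deviation, maximum and minimum of the errors."""
    return {
        "mean": statistics.fmean(errors),
        "stdev": statistics.pstdev(errors),
        "max": max(errors),
        "min": min(errors),
    }


def weighted_average(values, weights):
    """Weighted average; the weights need not sum to one."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return sum(v * w for v, w in zip(values, weights)) / total


def overall_training_error(tracking_errors, tracking_weights, auxiliary,
                           tracking_weight=1.0, squared=False):
    """Combine tracking errors with auxiliary errors.

    auxiliary: dict mapping a task name to an (error, weight) pair.
    With squared=True each error is squared before averaging, giving a
    weighted mean squared error.
    """
    combined_tracking = weighted_average(tracking_errors, tracking_weights)
    values = [combined_tracking] + [e for e, _ in auxiliary.values()]
    weights = [tracking_weight] + [w for _, w in auxiliary.values()]
    if squared:
        values = [v * v for v in values]
    return weighted_average(values, weights)
```

For instance, tracking errors of 0.2 and 0.4 with weights 3 and 1 give a comprehensive weighted training error of 0.25, which is then averaged with the auxiliary errors.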
Under still other possible design considerations, the method further comprises: performing conversation topic view identification based on the first business log knowledge training vector to obtain conversation topic views identified for each business conversation segment in the first business log training text; performing conversation topic view identification based on the second business log knowledge training vector to obtain conversation topic views identified for each business conversation segment in the second business log training text; determining a first topic point recognition training error based on a conversation topic point corresponding to each business conversation segment in the first business log training text and a conversation topic point recognized for each business conversation segment in the first business log training text; determining a second topic point recognition training error based on the conversation topic point corresponding to each service conversation segment in the second service log training text and the conversation topic point recognized for each service conversation segment in the second service log training text;
determining the training error based on the service demand tracking training error comprises: integrating the first topic view recognition training error, the second topic view recognition training error and the service demand tracking training error to obtain the training error.
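One possible way to integrate the three errors is shown below as a hedged sketch. The text does not specify either formula, so the use of cross-entropy for the topic view recognition errors and of a weighted sum for the integration are assumptions, as are the function names and coefficients.

```python
import math


def cross_entropy(predicted_probs, true_topics):
    """Mean negative log-likelihood of the labelled topic view.

    predicted_probs[i] maps each candidate topic view to its predicted
    probability for session segment i; true_topics[i] is the labelled view.
    """
    eps = 1e-12  # avoids log(0) when the labelled view got zero probability
    losses = [
        -math.log(max(probs.get(topic, 0.0), eps))
        for probs, topic in zip(predicted_probs, true_topics)
    ]
    return sum(losses) / len(losses)


def total_training_error(first_topic_error, second_topic_error,
                         tracking_error, coefficients=(1.0, 1.0, 1.0)):
    """Weighted sum of the three training errors (assumed integration rule)."""
    a, b, c = coefficients
    return a * first_topic_error + b * second_topic_error + c * tracking_error
```

The coefficients let the topic view recognition terms be down-weighted if they dominate the service demand tracking term during training.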
In this embodiment, the NLP log text processing system will determine the training error through a series of steps, primarily to more accurately evaluate the performance of the system and find possible improvement points.
First, the system collects the service demand tracking training errors. This is not a one-time activity; it continues over a number of training periods. The system monitors and records the training errors generated while processing government affair logs and tracking business demands. These errors may have different causes, such as the model misunderstanding a particular type of session or responding late to a new service demand.
After sufficient data is collected, the system will perform a distribution and trend analysis of the training errors. The step is mainly to calculate key statistics such as an average value, a standard deviation, a maximum value, a minimum value and the like of errors by using a statistical method. These statistics help to understand the overall condition and fluctuation range of the error. Meanwhile, the system can further analyze the error distribution condition, such as which service areas have larger errors, which session types have poor processing effects, and whether the errors change with time.
Next, the service demand tracking training errors are weighted. This step reflects more accurately how different types of service demand affect system performance. The system can assign different weights to the tracking training errors of different types of service demand according to the importance of the service field, the frequency of the session type and the historical error data. For example, the system may assign a higher weight to session types that occur frequently and have a large impact on government services. Then, using a weighted average, the system calculates a comprehensive weighted training error that reflects more fully how the system performs across the various business needs.
Finally, the system analyzes the comprehensive weighted training error together with other auxiliary training errors (such as text classification errors and entity recognition errors). These auxiliary training errors are also important factors affecting system performance. By taking all of these errors into account and calculating a weighted average or a weighted mean squared error according to their degree of impact on system performance, the system derives a more comprehensive and accurate training error value.
This process not only helps to more accurately evaluate the performance of the NLP log text processing system, but also provides powerful data support for further optimization of the system. In this way, the system can more effectively identify and improve the defects in the process of processing government affair logs and tracking business demands, thereby improving the efficiency and quality of government affair services.
Thus, by continuously collecting and analyzing the business demand tracking training errors, weighting different types of errors and comprehensively considering other auxiliary training errors, a more comprehensive and accurate training error value is obtained. This not only helps to more accurately evaluate the performance of the NLP log text processing system, but also provides powerful data support and direction for continued improvement and optimization of the system. Therefore, the technical scheme has remarkable beneficial effects on improving the efficiency and quality of government service.
Further, the structured text processing network includes a log text feature embedding subnet; the first business log knowledge training vector comprises a first log text embedding vector generated by the log text feature embedding subnet for the first business log training text; the first log text embedding vector comprises the text description vectors of all business log training text units in the first business log training text; the second business log knowledge training vector comprises a second log text embedding vector generated by the log text feature embedding subnet for the second business log training text; the second log text embedding vector comprises the text description vectors of all business log training text units in the second business log training text.
The method further comprises: performing text paragraph matching based on the text description vector of each business log training text unit in the first business log training text and the text description vector of each business log training text unit in the second business log training text to obtain a text paragraph matching result, wherein the text paragraph matching result indicates, for a first business log training text unit in the first business log training text, the second business log training text unit in the second business log training text that is associated with it.
Further, the determining, according to the session topic views corresponding to the service session segments in the first service log training text and the session topic views corresponding to the service session segments in the second service log training text in the training label, the second dynamic service session segment of the second service log training text with a different session topic view than the first service log training text includes: determining the content jump state of the business log training text unit in the first business log training text compared with the associated business log training text unit in the second business log training text according to the distribution characteristics of the first business log training text unit and the distribution characteristics of the second business log training text unit indicated by the text paragraph matching result; determining service session fragments associated with each service session fragment in the second service log training text in the first service log training text based on service session fragment distribution characteristics of each service session fragment in the second service log training text and content jump states of service log training text units associated with the first service log training text in comparison with service log training text units associated with the second service log training text; and aiming at each service session fragment in the second service log training text, if the session theme viewpoint corresponding to the service session fragment is different from the session theme viewpoint corresponding to the service session fragment associated with the service session fragment in the first service log training text, determining that the service session fragment is a second dynamic service session fragment.
In this embodiment, the NLP log text processing system will determine the second dynamic business session segment through a complex series of processing steps. These steps involve text embedding, text paragraph matching, and comparison of conversational topic views.
First, the NLP log text processing system uses the log text feature embedding subnet in the structured text processing network to process the first business log training text and the second business log training text. This subnet converts text into vector form, which facilitates subsequent numerical calculation and comparison. Specifically, the system generates a first log text embedding vector for the first business log training text; this vector comprises the text description vectors of the respective business log training text units in the text. Likewise, the system generates a second log text embedding vector for the second business log training text.
Next, the system uses these text description vectors for text paragraph matching. This process finds the text units in the second business log training text that are associated with the respective units in the first business log training text. The matching result characterizes these associations and provides a basis for the subsequent content jump state analysis and session topic view comparison.
After determining the associations, the system further analyzes the distribution characteristics of the associated text units. By comparing the distribution of each business log training text unit in the first business log training text with that of its associated business log training text unit in the second business log training text, the system is able to determine a content jump state. This jump state indicates that some content or view has changed between the two texts.
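Text paragraph matching over the text description vectors can be sketched as a nearest-neighbour search. This is a minimal sketch under assumptions: the greedy best-match strategy, the cosine measure and the `min_similarity` cut-off are illustrative choices, and the text does not state how the matching is actually performed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def match_units(first_vectors, second_vectors, min_similarity=0.8):
    """Map each text unit of the first text to its associated unit in the second.

    Returns a dict: index in the first text -> index of the most similar unit
    in the second text, or None when no unit reaches `min_similarity`.
    """
    matches = {}
    for i, u in enumerate(first_vectors):
        best_j, best_sim = None, min_similarity
        for j, v in enumerate(second_vectors):
            sim = cosine(u, v)
            if sim >= best_sim:
                best_j, best_sim = j, sim
        matches[i] = best_j
    return matches
```

Units mapped to None have no counterpart in the second text, which is itself a signal that content was removed or rewritten.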
Then, the system combines the distribution characteristics of each service session segment in the second service log training text and the content jump state determined before to find out the service session segment associated with each service session segment in the first service log training text. This step is to establish correspondence between the conversation fragments in the two texts.
Finally, for each business session segment in the second business log training text, the system compares its session topic view with the session topic view of the associated business session segment in the first business log training text. If the two session topic views differ, the business session segment is determined to be a second dynamic service session segment. This means that the service demand or view has changed dynamically in this session segment.
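The segment association and topic view comparison described above might look like the following sketch. It simplifies the embodiment considerably: the content jump state is represented only implicitly by a unit-to-unit mapping (hypothetical name `second_to_first`), segments are associated by majority vote over their matched units, and the `Segment` type is invented for the illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    start: int   # index of the first text unit in the segment (inclusive)
    end: int     # index of the last text unit in the segment (inclusive)
    topic: str   # labelled session topic view


def segment_of(unit_index, segments):
    """Index of the segment containing the given text unit, or None."""
    for k, seg in enumerate(segments):
        if seg.start <= unit_index <= seg.end:
            return k
    return None


def find_second_dynamic_segments(first_segments, second_segments, second_to_first):
    """Return indices of segments in the second text whose topic view changed.

    second_to_first maps a unit index in the second text to its matched unit
    index in the first text (None when unmatched).
    """
    dynamic = []
    for k, seg in enumerate(second_segments):
        votes = Counter()
        for unit in range(seg.start, seg.end + 1):
            matched = second_to_first.get(unit)
            if matched is not None:
                owner = segment_of(matched, first_segments)
                if owner is not None:
                    votes[owner] += 1
        if not votes:
            continue  # no associated segment found; skipped in this sketch
        associated = votes.most_common(1)[0][0]
        if first_segments[associated].topic != seg.topic:
            dynamic.append(k)
    return dynamic
```

Because the association goes through matched units rather than raw positions, an inserted or deleted unit in the second text shifts positions without breaking the segment correspondence.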
It can be seen that the NLP log text processing system can accurately identify dynamically changing portions in the business log through fine text embedding, paragraph matching and topic view comparison. This is critical to government service because it can help government departments to capture changes in demand of people in time, thereby adjusting service policies and improving service efficiency.
It should be noted that the content jump state is a key concept of the embodiments of the present application; during text processing, it involves comparing two pieces of associated text content to identify and determine the content changes between them.
In detail, the content jump state refers to the case where some content or view has changed between two pieces of associated text. Such changes may be due to a variety of factors, such as time and context. In the NLP log text processing system, identifying the content jump state is critical for accurately understanding text intent, capturing information changes, and performing subsequent data analysis.
In some independent embodiments, the content jump state is determined in one of the following ways:
(1) Comparison based on text embedding vectors: the text is converted into vector form by the log text feature embedding subnet. By comparing the embedding vectors of the two pieces of text, the similarity and difference between them can be analyzed quantitatively. When the difference exceeds a certain threshold, a content jump is considered to have occurred.
(2) Text paragraph matching and distribution feature analysis: the associated text units in the two pieces of text are found through text paragraph matching, and the distribution characteristics of these associated text units, such as position and frequency, are analyzed. If the distribution characteristics change noticeably, a content jump is indicated.
(3) Comparison of session topic views: the session topic views in the two pieces of text are extracted and compared for consistency and difference to determine whether the content has jumped. This approach allows a deeper understanding of substantive changes in the text content.
(4) Dynamic threshold setting and adaptive adjustment: the threshold for judging a content jump is set dynamically according to the specific content and context of the text. The system can adaptively adjust the threshold to improve the accuracy of content jump state identification.
(5) Multimodal fusion analysis: information from multiple modalities such as text, images and audio is analyzed together. Through multimodal fusion, content jump signals are captured more comprehensively.
With this design, quantitative analysis is performed through vector comparison and qualitative analysis through text paragraph matching and session topic view comparison, so that the content jump state is identified more comprehensively. By dynamically setting and adjusting the threshold, the system can identify content jumps more accurately in different situations. Analyzing multimodal information together further improves the accuracy and comprehensiveness of content jump state identification.
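The combination of vector comparison with a dynamically adjusted threshold can be sketched as follows. The rule used here (mean minus k standard deviations of a sliding window of recent similarities) is an assumption chosen for illustration; the text does not say how the threshold is adapted, and the parameter names are hypothetical.

```python
import statistics


def detect_jumps(similarities, base_threshold=0.5, k=2.0, window=5):
    """Flag the positions whose similarity falls below an adaptive threshold.

    similarities[i] is the similarity between the i-th pair of associated
    text units. The threshold for position i is the mean minus k population
    standard deviations of the previous `window` similarities; when fewer
    than two earlier values exist, `base_threshold` is used instead.
    """
    flags = []
    for i, sim in enumerate(similarities):
        history = similarities[max(0, i - window):i]
        if len(history) >= 2:
            threshold = statistics.fmean(history) - k * statistics.pstdev(history)
        else:
            threshold = base_threshold
        flags.append(sim < threshold)
    return flags
```

For the sequence 0.9, 0.8, 0.85, 0.3, 0.9 only the fourth pair is flagged: it drops far below the level of the preceding pairs, while the final value is judged against a threshold that the outlier has already lowered.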
In a preferred embodiment, the structured text processing network further comprises a text knowledge derivative sub-network, the first business log knowledge training vector further comprising a first log text derivative knowledge vector generated by the text knowledge derivative sub-network based on the first log text embedding vector; the second business log knowledge training vector further comprises a second log text derived knowledge vector generated by the text knowledge derived subnet based on the second log text embedded vector; the step of identifying the session topic point of view based on the first service log knowledge training vector to obtain the session topic point of view identified for each service session segment in the first service log training text, comprising: inputting the first log text derived knowledge vector into a topic identification branch to obtain a conversation topic view identified by the topic identification branch for each business conversation segment in the first business log training text.
Further, performing session topic view identification based on the second business log knowledge training vector to obtain the session topic view identified for each business session segment in the second business log training text includes: inputting the second log text derived knowledge vector into the topic identification branch to obtain the session topic view identified by the topic identification branch for each business session segment in the second business log training text.
In this preferred embodiment, the structured text processing network of the NLP log text processing system is further enhanced by introducing a component named text knowledge derivative subnet. This newly added subnet can enrich the information content of the log text, thereby helping the system to more accurately identify the topic views of the session.
Specifically, the text knowledge derivative subnet receives the first log text embedding vector as input and generates the first log text derived knowledge vector through its network structure. The process is essentially deep feature extraction and knowledge reasoning over the original log text; it can mine information and associations implicit in the text and provides a richer knowledge background for subsequent session topic view identification.
Similarly, for the second log text embedded vector, the text knowledge derivative subnet will also generate a corresponding second log text derived knowledge vector. The derived knowledge vectors not only contain the information of the original text, but also incorporate new knowledge and associations derived through the subnetwork, so that the system can understand the content and context of the log text more deeply.
After these derived knowledge vectors are obtained, the NLP log text processing system inputs them into the topic identification branch. This branch is specifically designed to identify the session topic views; by analyzing the derived knowledge vectors, it identifies the topic view of each business session segment.
For example, for a piece of network service log text about customer complaints, processing by the text knowledge derivative subnet may allow the system to mine derived knowledge such as the customer's dissatisfaction, the specific service problems, and potential improvement suggestions. These derived knowledge vectors are then input into the topic identification branch, helping the system to identify the session topic views more accurately, for example "poor network service quality" or "a certain function needs improvement".
In this way, the NLP log text processing system is able to understand log text more deeply and thereby provide enterprises with more accurate and valuable business insight. This not only helps to improve customer satisfaction but also helps enterprises to find problems in time and improve their services. Thus, by introducing the text knowledge derivative subnet and the corresponding derived knowledge vectors, the ability of the NLP log text processing system to identify session topic views is significantly improved. The improvement raises the intelligence level of the system and gives enterprises more efficient log text processing and analysis, making business decisions more scientific and precise.
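The data flow from embedding vector to derived knowledge vector to identified topic view can be sketched with two tiny layers. Everything concrete here is an assumption made for illustration: the text does not disclose the structure of the text knowledge derivative subnet or of the topic identification branch, so a single tanh layer and a linear layer with softmax stand in for them, and the weights are arbitrary.

```python
import math


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def derive_knowledge(embedding, weights, bias):
    """Stand-in for the text knowledge derivative subnet: one tanh layer."""
    return [math.tanh(z) for z in linear(embedding, weights, bias)]


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def identify_topic(derived, weights, bias, topics):
    """Stand-in for the topic identification branch.

    Returns the most probable topic view and the full probability list.
    """
    probs = softmax(linear(derived, weights, bias))
    best = max(range(len(topics)), key=probs.__getitem__)
    return topics[best], probs
```

In training, the probabilities returned by `identify_topic` would be compared with the labelled session topic views to produce the topic view recognition training errors.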
In some optional embodiments, the performing service requirement tracking according to the first multi-modal government integrated knowledge and the second multi-modal government integrated knowledge to obtain a service requirement tracking tag of the second service log training text compared with the first service log training text includes: determining a first session preference linear vector according to a first log text derived knowledge vector and first multi-modal government affair integrated knowledge, wherein the first session preference linear vector comprises a first viewpoint interest vector of each service session fragment in the first service log training text under each session topic viewpoint in X session topic viewpoints; integrating the session preference linear vector with the first log text derived knowledge vector to obtain first target multi-mode government affair integrated knowledge; determining a second session preference linear vector according to a second log text derived knowledge vector and second multi-modal government integrated knowledge, wherein the second session preference linear vector comprises a second viewpoint interest vector of each service session fragment in the second service log training text corresponding to each session topic viewpoint in X session topic viewpoints; integrating the second viewpoint interest vector with the second log text derived knowledge vector to obtain second target multi-mode government affair integrated knowledge; and combining the first target multi-mode government affair integrated knowledge with the second target multi-mode government affair integrated knowledge, and carrying out business demand tracking on a combined result to obtain a business demand tracking label of the second business log training text compared with the first business log training text.
In this embodiment, first, the system determines a first session preference linear vector based on the first log-text derived knowledge vector and the first multimodal government integrated knowledge. This linear vector is effectively a quantitative representation of the point of interest of each business session segment in the first business log training text under the X session topic points of view, that is, it reflects the user's interest and preference under each topic point of view. The term "multimodal government integrated knowledge" refers to government domain knowledge fused with various modal information such as text, image, sound and the like.
Next, the NLP log text processing system integrates the first session preference linear vector with the first log text derived knowledge vector to obtain a first target multimodal government affair integration knowledge. This process can be seen as an organic fusion of knowledge of the session preferences with text-derived knowledge, resulting in a more comprehensive, richer knowledge representation.
The system then processes the second business log training text in the same way. A second session preference linear vector is determined according to the second log text derived knowledge vector and the second multi-modal government integrated knowledge; this vector likewise reflects the interest and preference of each business session segment in the second business log training text under each session topic view. The system then integrates the second viewpoint interest vector with the second log text derived knowledge vector to obtain the second target multi-modal government affair integrated knowledge.
And finally, the NLP log text processing system combines the first target multi-modal government integrated knowledge with the second target multi-modal government integrated knowledge, and performs business demand tracking on the combined result. The process is to identify the change and trend of the second business log training text in business requirements compared with the first business log training text by comparing and analyzing the multi-modal government integrated knowledge of two different time points. Such changes and trends are ultimately quantified as business demand tracking tags that can provide valuable insight to businesses, helping them better understand the changes in user demand and thus make more accurate decisions.
In this way, the NLP log text processing system not only can process and understand a large amount of log text data, but also can extract valuable business requirement information from the log text data, thereby providing support for decision making of enterprises. The implementation of the method can certainly improve the operation efficiency and user satisfaction of enterprises, and simultaneously provides possibility for the intellectualization and individuation of government service.
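The flow of this embodiment for a single business session segment can be sketched as below. The sketch rests on several assumptions that are not in the text: interest under each of the X session topic views is scored by a dot product with a hypothetical per-topic prototype vector followed by softmax, "integrating" and "combining" are both modelled as concatenation, and the tracking label simply reports whether the most preferred topic view changed.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def session_preference(derived, integrated, topic_prototypes):
    """Interest of one segment under each of the X session topic views."""
    features = derived + integrated  # concatenate both knowledge sources
    scores = [sum(f * p for f, p in zip(features, proto))
              for proto in topic_prototypes]
    return softmax(scores)


def target_knowledge(derived, preference):
    """Integrate the preference vector with the derived knowledge vector."""
    return derived + preference


def track_demand(first_target, second_target, num_topics):
    """Combine both target vectors and emit a simple tracking label."""
    combined = first_target + second_target
    half = len(first_target)
    first_pref = combined[half - num_topics:half]   # preference part, text 1
    second_pref = combined[-num_topics:]            # preference part, text 2
    top_first = max(range(num_topics), key=first_pref.__getitem__)
    top_second = max(range(num_topics), key=second_pref.__getitem__)
    return {"changed": top_first != top_second,
            "from": top_first, "to": top_second}
```

A trained model would replace `track_demand` with a learned head over the combined vector; the slicing above only makes the comparison of the two preference distributions explicit.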
In still other preferred embodiments, before acquiring the prior intelligent government affair log set and the training label of the prior intelligent government affair log set, the method further comprises: acquiring training reference information and a service log highlighting training text for a first paragraph set in the first service log training text, wherein the training reference information is used for prompting the session event that the first paragraph set should focus on in the process of restating the service log training text; performing, through a log text restatement network and based on the first service log training text, the service log highlighting training text and the training reference information, service log training text restatement on the first paragraph set in the first service log training text according to the training reference information to obtain a second service log training text; and generating the prior intelligent government affair log set based on the first service log training text and the second service log training text.
Based on this embodiment, before obtaining the prior intelligent government log set and its training labels, the NLP log text processing system also performs a series of steps to prepare and optimize the training data.
First, the system obtains training reference information and a business log highlight training text for a first paragraph set in a first business log training text. The "training reference information" indicates which session events in the first paragraph set should be focused on when restating the business log training text. In other words, it tells the system which critical parts of the log text to attend to, such as important business information or user feedback. The business log highlight training text emphasizes or marks the key information in the first paragraph set, which helps the system identify and retain that information during restatement.
Next, the NLP log text processing system uses a component called the "log text restatement network" to restate the first paragraph set based on the first business log training text, the business log highlight training text, and the training reference information. The restatement follows the training reference information and produces a second business log training text that retains the original information while being more concise and explicit. Restatement improves the readability and information density of the log text, making it better suited to subsequent training and analysis.
Finally, based on the first business log training text and the second business log training text obtained through restatement, the NLP log text processing system generates the prior intelligent government log set. This log set combines the original log with the restated log and so provides richer, higher-quality training data. From this data the system can learn more accurate business knowledge and user behavior patterns, which improves its performance and accuracy when processing actual government logs.
In this way, the NLP log text processing system makes more effective use of government log data. This raises the level of intelligence of the system and allows government services to respond to user needs more efficiently and accurately. At the same time, optimizing the training data through restatement further strengthens the learning capacity and adaptability of the system, supporting the construction of intelligent government affairs.
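The preparation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: `TrainingText`, `restate_paragraph`, and `build_prior_log_set` are hypothetical names, and the rule-based `restate_paragraph` merely stands in for the trained log text restatement network.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class TrainingText:
    """A business log training text split into paragraphs (hypothetical type)."""
    paragraphs: List[str]


def restate_paragraph(paragraph: str, highlights: Set[str], focus_events: Set[str]) -> str:
    """Toy stand-in for the log text restatement network (an assumption).

    Keeps only the sentences that mention a highlighted term or a focus event,
    and falls back to the original paragraph if nothing matches.
    """
    sentences = [s.strip() for s in paragraph.split(".") if s.strip()]
    wanted = {w.lower() for w in highlights | focus_events}
    kept = [s for s in sentences if any(w in s.lower() for w in wanted)]
    return ". ".join(kept) + "." if kept else paragraph


def build_prior_log_set(
    first_text: TrainingText,
    paragraph_ids: List[int],
    highlights: Set[str],
    focus_events: Set[str],
    restate: Callable[[str, Set[str], Set[str]], str] = restate_paragraph,
) -> List[TrainingText]:
    """Restate the chosen paragraph set and pair the result with the original."""
    second_paragraphs = list(first_text.paragraphs)
    for idx in paragraph_ids:
        second_paragraphs[idx] = restate(first_text.paragraphs[idx], highlights, focus_events)
    second_text = TrainingText(second_paragraphs)
    # The prior log set holds both the original and the restated training text.
    return [first_text, second_text]
```

Here `highlights` plays the role of the business log highlight training text and `focus_events` the role of the training reference information; a real system would pass a learned model as `restate`.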
In an optional design, before the log text restatement network performs business log training text restatement on the first paragraph set in the first business log training text, according to the training reference information and based on the first business log training text, the business log highlight training text and the training reference information, to obtain the second business log training text, the method further includes: acquiring a historical business log training text, a historical business log highlight training text for a second paragraph set in the historical business log training text, and historical guidance features, wherein the historical guidance features indicate the historical session events in the second paragraph set to be focused on when restating the business log training text; performing, through the log text restatement network, business log training text restatement on the second paragraph set in the historical business log training text according to the historical guidance features, the historical business log training text and the historical business log highlight training text, to obtain a historical restated business log training text; determining a text restatement error according to the historical business log training text and the historical restated business log training text; and optimizing network variables of the log text restatement network according to the text restatement error.
Based on this design, the NLP log text processing system also performs preparation and optimization work before executing log text restatement. This work consists of obtaining historical data and using it to train and optimize the log text restatement network, so that restatement becomes more accurate and efficient.
First, the system obtains a historical business log training text, a historical business log highlight training text for a second paragraph set in that text, and historical guidance features. The "historical guidance features" indicate which historical session events in the second paragraph set should be focused on when restating the business log training text. This historical data gives the system reference information that helps it understand and restate current business logs.
Then, the system uses the log text restatement network to restate the second paragraph set in the historical business log training text according to the historical guidance features, the historical business log training text and the historical business log highlight training text, obtaining a historical restated business log training text. This step is in effect a pre-training of the network on log restatement; its aim is for the network to learn how to extract key information from the original log and generate an accurate, fluent restatement.
Next, the system determines the text restatement error according to the historical business log training text and the historical restated business log training text. This error reflects the difference between the restated text produced by the network and the original text, and is the basis for optimizing the network.
Finally, the system optimizes the network variables of the log text restatement network based on the text restatement error. The parameters and structure of the network are adjusted so that, in subsequent restatement tasks, the network captures and retains the key information of the original text more accurately and therefore generates higher-quality restated text.
With this design, the NLP log text processing system can make full use of historical data to train and optimize the log text restatement network, improving the accuracy and efficiency of restatement. This helps the system understand and process current business logs and provides more reliable and accurate data for subsequent intelligent government applications. The optimization also improves the adaptability and generalization of the system, so that it copes better with complex practical application scenarios.
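The four steps above (obtain historical data, restate, measure the error, update the network variables) can be illustrated with a deliberately small training loop. The sketch below is an assumption-laden toy, not the network of the application: the "network" is just one learnable keep-weight per token, the highlighted tokens serve as the target, the error is a binary cross-entropy stand-in for the text restatement error, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Set


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def text_restatement_error(weights: Dict[str, float], tokens: List[str], highlighted: Set[str]) -> float:
    """Mean binary cross-entropy between keep probabilities and highlight targets."""
    total = 0.0
    for tok in tokens:
        p = sigmoid(weights.get(tok, 0.0))
        y = 1.0 if tok in highlighted else 0.0
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(tokens)


def train_step(weights: Dict[str, float], tokens: List[str], highlighted: Set[str], lr: float = 1.0) -> None:
    """One gradient-descent update of the network variables (the keep-weights)."""
    grads: Dict[str, float] = {}
    for tok in tokens:
        p = sigmoid(weights.get(tok, 0.0))
        y = 1.0 if tok in highlighted else 0.0
        # Derivative of the cross-entropy with respect to the token weight is (p - y).
        grads[tok] = grads.get(tok, 0.0) + (p - y) / len(tokens)
    for tok, grad in grads.items():
        weights[tok] = weights.get(tok, 0.0) - lr * grad


def restate(weights: Dict[str, float], tokens: List[str]) -> List[str]:
    """Restated text: the tokens whose keep probability is at least 0.5."""
    return [t for t in tokens if sigmoid(weights.get(t, 0.0)) >= 0.5]
```

Starting from empty weights, every token has keep probability 0.5 and the restatement equals the original; repeated calls to `train_step` lower the error and push the restatement toward the highlighted content. A production system would replace the per-token weights with a sequence model and the stand-in error with the error measure the application describes.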
It is worth noting that the text restatement error plays a critical role in the above technical solution: it is the key indicator of the difference between the restated text and the original text.
The text restatement error is the degree of difference between the restated text and the original text in content, structure, semantics, and similar aspects. An ideal restated text conveys the meaning of the original accurately while keeping the language fluent and natural. In practice, for reasons such as insufficient model training or data bias, the restated text may deviate from the original to some extent, and this deviation is the text restatement error.
In detail, conventional evaluation of the text restatement error may focus mainly on the literal similarity between texts, whereas the technical solution of the present application adopts a multi-dimensional evaluation. Besides literal similarity, it also considers semantic similarity, structural similarity, and the degree of information retention. Such multi-dimensional evaluation reflects the quality of the restated text more fully.
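A multi-dimensional error of this kind could be computed along the following lines. This is only a sketch: the application does not specify the individual measures or how they are combined, so the four proxy scores, the equal default weights, and all function names here are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional, Set


def _tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def literal_similarity(original: str, restated: str) -> float:
    """Character-level similarity ratio from the standard library."""
    return SequenceMatcher(None, original, restated).ratio()


def semantic_similarity(original: str, restated: str) -> float:
    """Crude proxy (assumption): Jaccard overlap of the word sets."""
    a, b = _tokens(original), _tokens(restated)
    return len(a & b) / len(a | b) if a | b else 1.0


def structural_similarity(original: str, restated: str) -> float:
    """Crude proxy (assumption): ratio of sentence counts."""
    n_a = max(1, original.count("."))
    n_b = max(1, restated.count("."))
    return min(n_a, n_b) / max(n_a, n_b)


def information_retention(restated: str, key_terms: Set[str]) -> float:
    """Fraction of key terms that still appear in the restated text."""
    if not key_terms:
        return 1.0
    kept = sum(1 for term in key_terms if term.lower() in restated.lower())
    return kept / len(key_terms)


def restatement_error(
    original: str,
    restated: str,
    key_terms: Set[str],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted combination of the four dimensions, returned as an error in [0, 1]."""
    weights = weights or {"literal": 0.25, "semantic": 0.25, "structural": 0.25, "retention": 0.25}
    scores = {
        "literal": literal_similarity(original, restated),
        "semantic": semantic_similarity(original, restated),
        "structural": structural_similarity(original, restated),
        "retention": information_retention(restated, key_terms),
    }
    similarity = sum(weights[k] * scores[k] for k in scores) / sum(weights.values())
    return 1.0 - similarity
```

An identical restatement yields an error of zero, while a restatement that drops a key term is penalized through the retention score even if most of the wording is preserved.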
To evaluate the text restatement error more accurately, the present application also introduces human assessment. Professionals or ordinary users are invited to score and evaluate the restated text, which yields error feedback closer to actual use. This form of evaluation reflects the readability and accuracy of the restated text more faithfully.
The present application further uses deep learning to analyze text restatement errors. A deep learning model is constructed to learn and identify error patterns in the restated text automatically, which supports the optimization of the restatement strategy and makes error analysis more efficient and accurate.
In practical applications, different fields and scenarios may place different requirements on the text restatement error. The present application therefore also provides a method for dynamically adjusting the error threshold: the threshold is set and adjusted according to the requirements and data characteristics of the specific application scenario, so that the quality of the restated text meets that scenario's requirements.
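One simple way to realize a per-scenario, adjustable threshold is sketched below. The adjustment rule (mean of recent errors plus a margin) and the names `ThresholdPolicy`, `adjust`, and `accepts` are assumptions made for illustration; the application does not state how the threshold is computed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ThresholdPolicy:
    """Per-scenario error thresholds with a shared default (hypothetical design)."""
    default: float = 0.30
    per_scenario: Dict[str, float] = field(default_factory=dict)

    def threshold_for(self, scenario: str) -> float:
        return self.per_scenario.get(scenario, self.default)

    def adjust(self, scenario: str, recent_errors: List[float], margin: float = 0.05) -> float:
        """Set the threshold to the mean recent error plus a margin, clamped to [0, 1]."""
        if not recent_errors:
            return self.threshold_for(scenario)
        mean_error = sum(recent_errors) / len(recent_errors)
        value = min(1.0, max(0.0, mean_error + margin))
        self.per_scenario[scenario] = value
        return value

    def accepts(self, scenario: str, error: float) -> bool:
        """A restated text passes if its error does not exceed the scenario threshold."""
        return error <= self.threshold_for(scenario)
```

A scenario whose restatements are usually very accurate ends up with a stricter threshold after `adjust`, while scenarios that were never adjusted keep the default.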
The determination of the text restatement error therefore combines multi-dimensional evaluation, human assessment, deep-learning-based error analysis, and dynamic adjustment of the error threshold. Together these make the determination more accurate, comprehensive and flexible, and support improvements in the performance of the NLP log text processing system.
Further, the determining a text restatement error according to the historical business log training text and the historical restated business log training text comprises: obtaining a session segment consistency training error based on the training text elements of the second paragraph set in the historical restated business log training text and the training text elements of the second paragraph set in the historical business log training text; and determining a second training error based on the session segment consistency training error, wherein the session segment consistency training error and the second training error have a set quantitative relationship.
For example, the NLP log text processing system determines the text restatement error from the historical business log training text and the historical restated business log training text in two steps. First, it obtains the session segment consistency training error based on the training text elements of the second paragraph set in the historical restated business log training text and in the historical business log training text. The "training text elements" may include keywords, semantic roles, sentiment tendencies, and other elements that reflect the content and intent of the text.
Specifically, the system compares the second paragraph set in the two texts and analyzes how consistent their training text elements are. If the elements in the restated text differ substantially from those in the original text, the session segment consistency training error increases accordingly. This error reflects how far the restated text falls short in preserving the key information and intent of the original.
Next, the system determines the second training error based on the session segment consistency training error. The "second training error" is a more comprehensive error index: besides the consistency of text elements, it may also cover aspects such as language fluency and structural soundness. The session segment consistency training error and the second training error have a set quantitative relationship (such as a positive correlation), meaning that the system converts the session segment consistency training error into the second training error according to a given algorithm or rule.
In this way, the NLP log text processing system can evaluate the quality of the restated text more accurately and provide a sound basis for subsequent optimization. Because the evaluation considers language and structure as well as content elements, its result is more comprehensive and objective.
This way of determining the text restatement error helps the NLP log text processing system measure the difference between the restated text and the original text more accurately and gives a clear direction for subsequent optimization. It improves the quality of the restated text and provides more reliable and accurate data for application fields such as intelligent government affairs. It also improves the adaptability and generalization of the system, so that it copes better with complex practical application scenarios.
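The two-step computation described above can be sketched as follows. The mismatch measure (one minus the Jaccard overlap per element kind) and the linear mapping used for the "set quantitative relationship" are assumptions; the application only requires that the two errors be related in a set way, for example positively correlated. All names are hypothetical.

```python
from typing import Dict, Set


def element_mismatch(original: Set[str], restated: Set[str]) -> float:
    """One minus the Jaccard overlap of two element sets."""
    union = original | restated
    if not union:
        return 0.0
    return 1.0 - len(original & restated) / len(union)


def session_segment_consistency_error(
    original_elements: Dict[str, Set[str]],
    restated_elements: Dict[str, Set[str]],
) -> float:
    """Average mismatch over element kinds such as keywords or sentiment."""
    kinds = sorted(set(original_elements) | set(restated_elements))
    if not kinds:
        return 0.0
    total = sum(
        element_mismatch(original_elements.get(kind, set()), restated_elements.get(kind, set()))
        for kind in kinds
    )
    return total / len(kinds)


def second_training_error(consistency_error: float, scale: float = 2.0, offset: float = 0.0) -> float:
    """Positively correlated mapping (assumed linear) from the consistency error."""
    if scale <= 0:
        raise ValueError("scale must be positive to keep the correlation positive")
    return scale * consistency_error + offset
```

For instance, if the restated paragraph set keeps one of two original keywords and the same sentiment, the consistency error is 0.25 and, with the default scale, the second training error is 0.5.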
In other optional embodiments, before the acquiring the prior intelligent government log set and the training labels of the prior intelligent government log set, the method further includes: optimizing network variables of the business session segments in a third paragraph set in the first business log training text to obtain a second business log training text; and generating the prior intelligent government log set based on the first business log training text and the second business log training text.
Based on this embodiment, before acquiring the prior intelligent government log set and its training labels, the NLP log text processing system optimizes the network variables of the business session segments in the third paragraph set in the first business log training text.
In deep learning, network variables may refer to the parameters of a neural network, which are optimized during training so that the network learns and fits the data better. In this embodiment, the system focuses specifically on the business session segments within the third paragraph set in the first business log training text and optimizes the network variables for these session segments.
The optimization adjusts the values of the network variables so that the network achieves better results when processing these business session segments. In particular, the system may use back propagation to adjust the values of the network variables step by step, based on the performance of the network (for example, accuracy or loss function values) in processing the session segments.
After optimization, the system obtains an improved second business log training text, in which the business session segments within the third paragraph set are better understood and handled by the network.
Next, the NLP log text processing system generates the prior intelligent government log set based on the original first business log training text and the optimized second business log training text. The log set contains the original business log information and also incorporates the improvement brought by the optimized network variables.
In this way, the NLP log text processing system can generate a more accurate and effective prior intelligent government log set, which provides more reliable data for subsequent government services and decisions. By optimizing the network variables, the system also improves its ability to process complex business session segments, and therefore its overall performance and accuracy. Optimizing the network variables thus improves the quality of the business log training text and yields a better prior intelligent government log set, supporting more intelligent and efficient government services.
In combination with the above, the method comprises: acquiring a preceding business log text and a business log text to be analyzed; and performing business requirement tracking based on the preceding business log text and the business log text to be analyzed through the structured text processing network, to obtain a business requirement tracking label of the business log text to be analyzed compared with the preceding business log text.
In this embodiment, when executing a new task, the NLP log text processing system first obtains two kinds of text: the preceding business log text and the business log text to be analyzed. Both are needed for business analysis.
The preceding business log text is the business log record generated before the business log text to be analyzed. It may contain information such as earlier business requirements, user feedback, and system responses, which is key to understanding the current business state and its history.
The business log text to be analyzed is the business log that currently needs to be analyzed and processed. It records the latest business activity, user requirements, or system state changes, and is the main object of business analysis.
After both texts are obtained, the NLP log text processing system uses the structured text processing network for business requirement tracking. The structured text processing network is a deep learning model that extracts useful structured information from large amounts of text data, helping the system understand and analyze the text content.
In this process, the system compares the content of the business log text to be analyzed with that of the preceding business log text and identifies the differences and associations between the two. Through this comparison, the system can observe changes and trends in business requirements and so support subsequent business decisions.
The results of business requirement tracking are packaged as business requirement tracking labels. These labels state concisely how the business requirements in the business log text to be analyzed have changed compared with the preceding business log text. For example, one label may indicate that a business requirement has been newly added, and another that a business requirement has been cancelled or modified.
In this way, the NLP log text processing system can track and analyze changes in business requirements efficiently and support the business decisions of enterprises. The method improves the accuracy and efficiency of business analysis and helps enterprises follow market dynamics and user needs, so that better-informed decisions can be made.
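The kind of output described above (labels such as "newly added", "cancelled" or "modified") can be illustrated with a small rule-based comparison. This is only a stand-in for the structured text processing network, which in the application is a trained model; the dictionary representation of extracted requirements and the label strings are assumptions for illustration.

```python
from typing import Dict, List


def track_requirements(preceding: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Compare requirements extracted from two log texts and emit tracking labels.

    Keys are requirement names and values are their recorded details; both are
    assumed to have been extracted from the log texts beforehand.
    """
    labels: List[str] = []
    for name in sorted(set(preceding) | set(current)):
        if name not in preceding:
            labels.append(f"added:{name}")
        elif name not in current:
            labels.append(f"cancelled:{name}")
        elif preceding[name] != current[name]:
            labels.append(f"modified:{name}")
    return labels
```

Requirements that appear unchanged in both texts produce no label, so an empty list means no change in business requirements was detected.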
Further, FIG. 2 is a schematic structural diagram of an NLP log text processing system 200 according to an embodiment of the present application. The NLP log text processing system 200 shown in FIG. 2 includes a processor 210, and the processor 210 may call and run a computer program from a memory to implement the method in the embodiments of the present application.
Optionally, as shown in FIG. 2, the NLP log text processing system 200 may also include a memory 230. The processor 210 may call and run a computer program from the memory 230 to implement the method in the embodiments of the present application.
The memory 230 may be a device separate from the processor 210 or may be integrated into the processor 210.
Optionally, as shown in FIG. 2, the NLP log text processing system 200 may further include a transceiver 220, and the processor 210 may control the transceiver 220 to communicate with other devices, specifically to send information or data to other devices or to receive information or data sent by other devices.
Optionally, the NLP log text processing system 200 may implement the corresponding flows of the storage engine, of a component (such as a processing module) in the storage engine, or of a device in which the storage engine is deployed, in each method of the embodiments of the present application; for brevity, these are not described again here.
It should be appreciated that the processor in the embodiments of the present application may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method embodiments may be completed by integrated logic circuits of hardware in the processor or by instructions in software form. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in a memory; the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
It will be appreciated that the memory in the embodiments of the present application may be volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), which acts as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.
It should be appreciated that the above memory is exemplary and not limiting. For example, the memory in the embodiments of the present application may also be static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), direct Rambus RAM (DR RAM), and the like. That is, the memory in the embodiments of the present application is intended to include, without being limited to, these and any other suitable types of memory.
On the basis of the above, a computer readable storage medium is provided, on which a computer program is stored, which computer program, when run, implements the method described above.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art.

Claims (6)

1. A business log text processing method applied to intelligent government affairs is characterized by comprising the following steps:
Acquiring a prior intelligent government log set and training labels of the prior intelligent government log set, wherein the prior intelligent government log set comprises a first business log training text and a second business log training text, and the training labels comprise the session topic viewpoints corresponding to the business session segments in the first business log training text and the session topic viewpoints corresponding to the business session segments in the second business log training text;
Mining a first business log knowledge training vector of the first business log training text and a second business log knowledge training vector of the second business log training text through a structured text processing network;
Integrating a page behavior vector of page user behavior data with the first business log knowledge training vector through the structured text processing network to obtain first multi-modal government integrated knowledge; the page user behavior data comprises X session topic viewpoints, wherein X is the total number of session topic viewpoints and X is an integer greater than 1;
Integrating the page behavior vector with the second business log knowledge training vector through the structured text processing network to obtain second multi-modal government integrated knowledge;
Performing business requirement tracking according to the first multi-modal government integrated knowledge and the second multi-modal government integrated knowledge through the structured text processing network to obtain a business requirement tracking label of the second business log training text compared with the first business log training text;
Determining a training error according to the business requirement tracking label and the training labels of the prior intelligent government log set;
Optimizing network variables of the structured text processing network according to the training errors;
The business requirement tracking label characterizes a first dynamic business session segment in the second business log training text that is updated compared with the first business log training text; the determining a training error according to the business requirement tracking label and the training labels of the prior intelligent government log set comprises: determining, according to the session topic viewpoint corresponding to each business session segment in the first business log training text and the session topic viewpoint corresponding to each business session segment in the second business log training text in the training labels, a second dynamic business session segment of the second business log training text whose session topic viewpoint differs from that of the first business log training text; determining a business requirement tracking training error according to the first dynamic business session segment and the second dynamic business session segment; and determining the training error based on the business requirement tracking training error;
The method further comprises: performing session topic viewpoint identification based on the first business log knowledge training vector to obtain the session topic viewpoint identified for each business session segment in the first business log training text; performing session topic viewpoint identification based on the second business log knowledge training vector to obtain the session topic viewpoint identified for each business session segment in the second business log training text; determining a first topic viewpoint identification training error based on the session topic viewpoint corresponding to each business session segment in the first business log training text and the session topic viewpoint identified for each business session segment in the first business log training text; determining a second topic viewpoint identification training error based on the session topic viewpoint corresponding to each business session segment in the second business log training text and the session topic viewpoint identified for each business session segment in the second business log training text; the determining the training error based on the business requirement tracking training error comprises: integrating the first topic viewpoint identification training error, the second topic viewpoint identification training error and the business requirement tracking training error to obtain the training error;
The structured text processing network comprises a log text feature embedding subnet; the first business log knowledge training vector comprises a first log text embedding vector generated by the log text feature embedding subnet for the first business log training text; the first log text embedding vector comprises text description vectors of the business log training text units in the first business log training text; the second business log knowledge training vector comprises a second log text embedding vector generated by the log text feature embedding subnet for the second business log training text; the second log text embedding vector comprises text description vectors of the business log training text units in the second business log training text; the method further comprises: performing text paragraph matching based on the text description vectors of the business log training text units in the first business log training text and the text description vectors of the business log training text units in the second business log training text to obtain a text paragraph matching result, wherein the text paragraph matching result characterizes, in the second business log training text, the second business log training text units associated with the first business log training text units in the first business log training text; the determining, according to the session topic viewpoint corresponding to each business session segment in the first business log training text and the session topic viewpoint corresponding to each business session segment in the second business log training text in the training labels, a second dynamic business session segment of the second business log training text whose session topic viewpoint differs from that of the first business log training text comprises: determining the content jump state of the business log training text units in the first business log training text compared with the associated business log training text units in the second business log training text according to the distribution characteristics of the first business log training text units and the distribution characteristics of the second business log training text units indicated by the text paragraph matching result; determining, in the first business log training text, the business session segment associated with each business session segment in the second business log training text based on the business session segment distribution characteristics of each business session segment in the second business log training text and the content jump state of the business log training text units in the first business log training text compared with the associated business log training text units in the second business log training text; and, for each business session segment in the second business log training text, if the session topic viewpoint corresponding to the business session segment differs from the session topic viewpoint corresponding to its associated business session segment in the first business log training text, determining that the business session segment is a second dynamic business session segment;
The structured text processing network further comprises a text knowledge derivative sub-network; the first business log knowledge training vector further comprises a first log text derived knowledge vector generated by the text knowledge derivative sub-network based on the first log text embedding vector; the second business log knowledge training vector further comprises a second log text derived knowledge vector generated by the text knowledge derivative sub-network based on the second log text embedding vector;
the step of identifying the session topic viewpoint based on the first service log knowledge training vector, to obtain the session topic viewpoint identified for each service session segment in the first service log training text, comprises: inputting the first log text derived knowledge vector into a topic identification branch to obtain the session topic viewpoint identified by the topic identification branch for each service session segment in the first service log training text;
the step of identifying the session topic viewpoint based on the second service log knowledge training vector, to obtain the session topic viewpoint identified for each service session segment in the second service log training text, comprises: inputting the second log text derived knowledge vector into the topic identification branch to obtain the session topic viewpoint identified by the topic identification branch for each service session segment in the second service log training text;
the step of performing service demand tracking according to the first multi-modal government affair integrated knowledge and the second multi-modal government affair integrated knowledge, to obtain a service demand tracking label of the second service log training text compared with the first service log training text, comprises:
determining a first session preference linear vector according to the first log text derived knowledge vector and the first multi-modal government affair integrated knowledge, wherein the first session preference linear vector comprises a first viewpoint interest vector of each service session segment in the first service log training text under each of X session topic viewpoints;
integrating the first session preference linear vector with the first log text derived knowledge vector to obtain first target multi-modal government affair integrated knowledge;
determining a second session preference linear vector according to the second log text derived knowledge vector and the second multi-modal government affair integrated knowledge, wherein the second session preference linear vector comprises a second viewpoint interest vector of each service session segment in the second service log training text under each of the X session topic viewpoints;
integrating the second viewpoint interest vector with the second log text derived knowledge vector to obtain second target multi-modal government affair integrated knowledge;
and combining the first target multi-modal government affair integrated knowledge with the second target multi-modal government affair integrated knowledge, and performing service demand tracking on the combined result to obtain the service demand tracking label of the second service log training text compared with the first service log training text.
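For illustration only, the closing steps of claim 1 might be sketched in Python as follows. The claim does not state how the session preference linear vector is computed or what "integrating" and "combining" mean numerically. The sketch assumes a linear map followed by a softmax over the X session topic viewpoints, concatenation for integration, mean pooling for combination, and a linear scorer for the tracking label. All function names, weight matrices and label names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def session_preference(derived, integrated, topic_weights):
    """Assumed form of the session preference linear vector.

    For each session segment, the derived knowledge vector and the integrated
    knowledge vector are concatenated, mapped linearly to X logits, and
    normalised, giving one interest value per session topic viewpoint.
    """
    return [softmax(matvec(topic_weights, d + k))
            for d, k in zip(derived, integrated)]


def target_knowledge(derived, integrated, topic_weights):
    """Integrate (here: concatenate) the interest vectors with the derived vectors."""
    prefs = session_preference(derived, integrated, topic_weights)
    return [p + d for p, d in zip(prefs, derived)]


def mean_pool(rows):
    """Average a list of equal-length vectors into one vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def track_demand(first_target, second_target, label_weights, labels):
    """Combine both texts' pooled knowledge and pick the best-scoring label."""
    combined = mean_pool(first_target) + mean_pool(second_target)
    scores = matvec(label_weights, combined)
    return labels[scores.index(max(scores))]
```

In a trained network, `topic_weights` and `label_weights` would be learned parameters; here they are plain lists supplied by the caller.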
2. The method of claim 1, wherein the first business log knowledge training vector comprises a business session attribute vector for each business session segment in the first business log training text; and wherein integrating the page behavior vector of the page user behavior data with the first business log knowledge training vector to obtain the first multi-modal government affair integrated knowledge comprises:
integrating each business session attribute vector in the first business log knowledge training vector with the page behavior vector based on a linkage feature focusing strategy to obtain a first linkage integration vector;
projecting the first linkage integration vector to a target knowledge vector coordinate system to obtain a second linkage integration vector;
projecting the page behavior vector to the target knowledge vector coordinate system to obtain a first page behavior vector;
integrating the second linkage integration vector and the first page behavior vector based on a feature integration factor to obtain the first multi-modal government affair integrated knowledge, wherein the network variables of the structured text processing network include the feature integration factor.
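The four steps of claim 2 can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the "linkage feature focusing strategy" is assumed to be scaled dot-product attention with the page behavior vector as the query, the projections are assumed to be plain linear maps, and the feature integration factor is assumed to be a scalar mixing weight in [0, 1]. The names `linkage_focus`, `integrate`, `proj_linkage` and `proj_page` are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(weights, vec):
    return [dot(row, vec) for row in weights]


def linkage_focus(session_vectors, page_vector):
    """Attention-weighted sum of session attribute vectors (assumed strategy).

    The page behavior vector acts as the query; each session attribute vector
    is both key and value. Returns the first linkage integration vector.
    """
    scale = math.sqrt(len(page_vector))
    weights = softmax([dot(page_vector, s) / scale for s in session_vectors])
    dim = len(session_vectors[0])
    return [sum(w * s[i] for w, s in zip(weights, session_vectors))
            for i in range(dim)]


def integrate(session_vectors, page_vector, proj_linkage, proj_page, factor):
    """Project both vectors into the target space and mix them by `factor`."""
    first_linkage = linkage_focus(session_vectors, page_vector)
    second_linkage = matvec(proj_linkage, first_linkage)  # into target coordinates
    first_page = matvec(proj_page, page_vector)           # into target coordinates
    return [factor * l + (1.0 - factor) * p
            for l, p in zip(second_linkage, first_page)]
```

Because the claim lists the feature integration factor among the network variables, `factor` would be updated during training rather than fixed by hand.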
3. The method of any one of claims 1 to 2, wherein before obtaining the prior intelligent government affair log set and the training labels of the prior intelligent government affair log set, the method further comprises:
acquiring training reference information and a service log highlighting training text for a first paragraph set in the first service log training text, wherein the training reference information is used for prompting a session event focused on by the first paragraph set in the process of reiterating the service log training text;
performing, through a log text reiteration network, service log training text reiteration on the first paragraph set in the first service log training text according to the training reference information, based on the first service log training text, the service log highlighting training text and the training reference information, to obtain a second service log training text;
generating the prior intelligent government affair log set based on the first service log training text and the second service log training text;
the method further comprises:
acquiring a history service log training text, a history service log highlighting training text for a second paragraph set in the history service log training text, and history guidance features, wherein the history guidance features are used for prompting a history session event focused on by the second paragraph set in the process of reiterating the service log training text;
performing, through the log text reiteration network, service log training text reiteration on the second paragraph set in the history service log training text according to the history guidance features, the history service log training text and the history service log highlighting training text, to obtain a history reiteration service log training text;
determining a text reiteration error according to the history service log training text and the history reiteration service log training text;
optimizing network variables of the log text reiteration network according to the text reiteration error;
wherein determining the text reiteration error according to the history service log training text and the history reiteration service log training text comprises:
obtaining a session segment consistency training error based on the training text elements of the second paragraph set in the history reiteration service log training text and the training text elements of the second paragraph set in the history service log training text;
and determining a second training error based on the session segment consistency training error, wherein the session segment consistency training error and the second training error have a set quantization relationship.
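The error computation at the end of claim 3 might look like the following sketch. The claim only says that the session segment consistency training error compares the training text elements of the second paragraph set in the regenerated text with those in the original text, and that the second training error has "a set quantization relationship" with it. The sketch assumes the elements are tokens, measures consistency with the sequence similarity ratio from Python's `difflib`, takes the relationship to be a fixed proportionality, and sums the two terms; each of these choices and every function name is an assumption.

```python
from difflib import SequenceMatcher


def consistency_error(original_elements, regenerated_elements):
    """Session segment consistency training error (assumed form).

    Returns 1 minus the similarity ratio of the two element sequences,
    so identical paragraph sets give 0.0 and disjoint ones give 1.0.
    """
    ratio = SequenceMatcher(None, original_elements, regenerated_elements).ratio()
    return 1.0 - ratio


def second_training_error(consistency, scale=0.5):
    """Second training error, assumed proportional to the consistency error."""
    return scale * consistency


def text_error(original_elements, regenerated_elements, scale=0.5):
    """Total error used to update the network (assumed to be the sum of both terms)."""
    c = consistency_error(original_elements, regenerated_elements)
    return c + second_training_error(c, scale)
```

A differentiable loss such as token-level cross-entropy would be needed for gradient-based training; the similarity ratio is used here only to make the relationship between the two error terms concrete.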
4. The method of any one of claims 1 to 2, wherein before obtaining the prior intelligent government affair log set and the training labels of the prior intelligent government affair log set, the method further comprises:
optimizing network variables of service session segments in a third paragraph set in the first service log training text to obtain a second service log training text;
and generating the prior intelligent government affair log set based on the first service log training text and the second service log training text.
5. The method according to claim 1, wherein the method further comprises:
acquiring a preceding service log text and a service log text to be analyzed;
and performing service demand tracking based on the preceding service log text and the service log text to be analyzed by using the structured text processing network, to obtain a service demand tracking label of the service log text to be analyzed compared with the preceding service log text.
6. An NLP log text processing system, comprising at least one processor and a memory; the memory stores computer-executable instructions; execution of the computer-executable instructions stored in the memory by the at least one processor causes the at least one processor to perform the method of any one of claims 1-5.
CN202410593094.1A 2024-05-14 2024-05-14 Business log text processing method and system applied to intelligent government affairs Active CN118171110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410593094.1A CN118171110B (en) 2024-05-14 2024-05-14 Business log text processing method and system applied to intelligent government affairs

Publications (2)

Publication Number Publication Date
CN118171110A (en) 2024-06-11
CN118171110B (en) 2024-07-19

Family

ID=91350857

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410593094.1A Active CN118171110B (en) 2024-05-14 2024-05-14 Business log text processing method and system applied to intelligent government affairs

Country Status (1)

Country Link
CN (1) CN118171110B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105677828A (en) * 2016-01-04 2016-06-15 成都陌云科技有限公司 User information processing method based on big data
CN113360763A (en) * 2021-06-19 2021-09-07 阚忠建 Service attention tendency prediction method based on artificial intelligence and artificial intelligence cloud system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9361604B2 (en) * 2010-09-23 2016-06-07 Avaya Inc. System and method for a context-based rich communication log

Similar Documents

Publication Publication Date Title
US20210081376A1 (en) Construction method, device, computing device, and storage medium for constructing patent knowledge database
US20220164683A1 (en) Generating a domain-specific knowledge graph from unstructured computer text
CN109947902B (en) Data query method and device and readable medium
CN119089398A (en) Content recommendation method and system based on semantic recognition
CN118797077B (en) A security knowledge generation method and system based on large language model
CN117972160B (en) Multi-mode information processing method and device
CN118520035B (en) Meteorological service platform data management method and system based on artificial intelligence
CN118503392B (en) Retrieval enhancement type intelligent question-answering method and system based on large model
CN118484592A (en) A tourism information consultation method and system based on big data analysis
CN117785964A (en) Data processing method and system applied to network service
CN112165639B (en) Content distribution method, device, electronic equipment and storage medium
CN119316389B (en) Dialogue method, device, equipment and medium
CN119295091B (en) A micro-marketing auxiliary method and system based on large language model
CN119226234B (en) AI-based electronic file intelligent management method and system
CN119599619B (en) Human resources matching system based on big data
Han et al. Fostering college students’ mental well-being: the impact of social networking site utilization on emotion management and regulation
CN118446856B (en) Personalized matching method and system of digital cultural education resources based on AIGC
CN115203282A (en) Intelligent enterprise user data processing method and system combined with deep learning
Li et al. Enterprise precision marketing effectiveness model based on data mining technology
CN118171110B (en) Business log text processing method and system applied to intelligent government affairs
Bahrainian et al. Predicting topics in scholarly papers
CN120371980B (en) An intelligent data-driven rapid construction optimization system for power production scenarios
CN118446416B (en) Digital enterprise management method and system based on machine learning
Fugini et al. A text analytics architecture for smart companies
CN118446417B (en) Data analysis method and system based on digital enterprise management

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Room 226, 2nd Floor, No. 5 Guanghua Road, Zhangjiawan Town, Tongzhou District, Beijing, 101199

Patentee after: Beijing Zhongke Jindezhu Intelligent Technology Co.,Ltd.

Country or region after: China

Address before: Room 226, 2nd Floor, No. 5 Guanghua Road, Zhangjiawan Town, Tongzhou District, Beijing, 101199

Patentee before: Beijing Zhongkejin Finite Element Technology Co.,Ltd.

Country or region before: China