CN118805166A

CN118805166A - A compliance detection method, device, equipment and medium for large language model data interaction

Info

Publication number: CN118805166A
Application number: CN202380020133.2A
Authority: CN
Inventors: 杨子言
Original assignee: Individual
Current assignee: Individual
Priority date: 2023-07-11
Filing date: 2023-11-02
Publication date: 2024-10-18
Anticipated expiration: 2043-11-02
Also published as: CN118805166B

Abstract

The present invention provides a compliance detection method, device, equipment and medium for data interaction of a large language model. The method at least includes a user interaction detection process, that is, a compliance detection is performed on the user request content itself to determine whether there is any violation information. If not, the relevant data stream of the user request is input to the large language model for processing to obtain the processing result returned by the large language model; the processing result is subjected to compliance detection to determine whether there is any violation information. If so, the violation information is filtered and the processing result is output; the compliance detection includes sensitive word checking and sensitive information checking. The present invention at least performs compliance detection on the data in the user request interaction stage, thereby realizing the standardized use of the large language model.

Description

Compliance detection method, device, equipment and medium for large language model data interaction

Technical Field

The invention relates to the technical field of large language model data interaction, in particular to a method, a device, equipment and a medium for detecting compliance of large language model data interaction.

Background

As large language models are commercialized, products built on them must meet the supervision requirements of relevant laws and regulations, including those on content review and filtering, data security and personal information protection. Products that apply a large language model therefore need effective, standardized detection. However, no relevant method or product has yet emerged.

Summary of The Invention

Technical problem

The technical problem to be solved by the invention is to provide a compliance detection method, device, equipment and medium for large language model data interaction that perform compliance detection on data at least in the user request interaction stage, thereby achieving standardized use of the large language model.

Technical solution

In a first aspect, the present invention provides a method for detecting compliance of large language model data interaction, including a user interaction detection process, where the user interaction detection process includes:

S1, performing compliance detection on the user request content and determining whether it contains violation information; if so, returning the detection result directly and ending the interaction; if not, proceeding to the next step;

S3, inputting the data stream related to the user request into a large language model for processing, and obtaining the processing result returned by the large language model;

S4, performing compliance detection on the processing result and determining whether it contains violation information; if so, filtering out the violation information and then proceeding to the next step; if not, proceeding to the next step directly;

S5, outputting the processing result;

wherein the compliance detection comprises a check for sensitive words that do not comply with laws and regulations and a check for sensitive information.
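The steps S1, S3, S4 and S5 above can be sketched as a small pipeline. This is only an illustrative sketch, not the patented implementation: the word list, the ID-like pattern, the `***` mask and the function names `find_violations`, `filter_violations` and `handle_request` are all assumptions introduced here, and the model is represented by any callable that maps a prompt to text.

```python
import re
from typing import Callable, List, Tuple

# Illustrative placeholders only; real lists would come from a maintained policy source.
SENSITIVE_WORDS = ["bannedterm"]
SENSITIVE_INFO = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # an ID-number-like pattern (assumed)


def find_violations(text: str) -> List[str]:
    """Compliance detection: sensitive word check plus sensitive information check."""
    lowered = text.lower()
    hits = [word for word in SENSITIVE_WORDS if word in lowered]
    hits += SENSITIVE_INFO.findall(text)
    return hits


def filter_violations(text: str) -> str:
    """Mask every detected violation so the rest of the result can still be output."""
    for word in SENSITIVE_WORDS:
        text = re.sub(re.escape(word), "***", text, flags=re.IGNORECASE)
    return SENSITIVE_INFO.sub("***", text)


def handle_request(request: str, llm: Callable[[str], str]) -> Tuple[bool, str]:
    # S1: check the request itself; on a violation, return the detection result and stop.
    violations = find_violations(request)
    if violations:
        return False, f"request rejected: {len(violations)} violation(s) detected"
    # S3: pass the request to the large language model.
    result = llm(request)
    # S4: check the model output; violations are filtered rather than rejected.
    if find_violations(result):
        result = filter_violations(result)
    # S5: output the (possibly filtered) processing result.
    return True, result
```

Note the asymmetry the text describes: a violation in the request ends the interaction, while a violation in the model output is filtered and the remaining result is still returned.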

In a second aspect, the present invention provides a compliance detection apparatus for large language model data interaction, including a user interaction detection module, where the user interaction detection module is configured to perform the following steps:

S5, outputting the processing result;

In a third aspect, the invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of the first aspect when executing the program.

In a fourth aspect, the present invention provides a computer-readable storage medium having a computer program stored thereon; the program, when executed by a processor, implements the method of the first aspect.

Advantageous effects

One or more technical solutions provided in the embodiments of the present invention at least have the following technical effects or advantages:

The embodiments of the invention perform compliance detection on the data interaction process of an existing large language model (including both internal and external interaction processes), so that use of the large language model complies with relevant laws and regulations; this fills the gap left by the current absence of compliance detection for large language models. In addition to compliance detection under relevant laws and regulations, the embodiments can also perform related security detection, such as user access authority detection, usage limit detection or content authenticity detection, thereby ensuring safe use of the large language model. The embodiments also pair each user request with its processing result and/or supplementary result and store the pair, so that an interaction history is retained for future security compliance audits.

The above description is merely an overview of the technical solutions of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features and advantages of the invention are more readily understood, specific embodiments of the invention are described below.

Drawings

The invention is further described below by way of embodiments with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of the system framework of the present invention;

FIG. 2 is a functional schematic diagram of a safety compliance detection module according to the present invention;

FIG. 3 is a flow chart of a method according to a first embodiment of the invention;

FIG. 4 is a schematic diagram of a device according to a second embodiment of the present invention;

FIG. 5 is a schematic structural diagram of an electronic device according to a third embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a medium according to a fourth embodiment of the present invention.

Embodiments of the invention

The embodiments of the application provide a compliance detection method, device, equipment and medium for large language model data interaction, which perform compliance detection on data at least in the user request interaction stage, thereby achieving standardized use of the large language model.

The overall idea of the technical scheme in the embodiments of the application is as follows: perform compliance detection on the data interaction process of an existing large language model (including both internal and external interaction processes), so that use of the large language model complies with relevant laws and regulations, filling the gap left by the current absence of compliance detection for large language models.

In addition to compliance detection under relevant laws and regulations, the embodiments can also perform related security detection, such as user access authority detection, usage limit detection or content authenticity detection, thereby ensuring safe use of the large language model. The embodiments also pair each user request with its processing result and/or supplementary result and store the pair, so that an interaction history is retained for future security compliance audits.

Before the specific embodiments are described, the system framework corresponding to the method of the embodiments of the present application is introduced. As shown in FIG. 1 and FIG. 2, the system is roughly divided into three parts:

The large language model: a machine learning technique that learns the probability distribution of natural language data and uses that distribution to accomplish language-related tasks such as text classification, natural language understanding and machine translation.

Its interaction flow is to receive a user request, process the request content, and return the processing result.

The data storage module: a large language model product that has a private knowledge base also needs to construct that knowledge base; the private knowledge base can supply supplementary information for a user request.

The safety compliance detection module: performs compliance detection (mainly checks for sensitive words that do not comply with laws and regulations and checks for sensitive information) and security detection (mainly user access authority detection, usage limit detection, false content detection and the like) on the large language model data interaction process, so that use of the large language model both complies with relevant laws and regulations and is secure. Specifically, it provides the following functions:

sensitive word check: checks for sensitive words that do not comply with laws and regulations;

sensitive information check: checks for enterprise confidential information, personal privacy information and the like;

user access authority check: manages file permissions;

usage limit check: controls interaction cost, for the purpose of financial audit;

interface authentication: prevents attack or malicious use when an external interface is called;

trusted source check: when external data is accessed, confirms that the data comes from a trusted source (similar to a whitelist), so that the returned external data is not contaminated;

authenticity check: checks whether the returned content is true rather than fabricated; in creative-generation interactions, however, the weight of authenticity detection is reduced so that creative content can be generated smoothly;

audit service: archives history logs for future security audits and problem tracing.
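Two of the functions listed above, the user access authority check and the usage limit check, can be sketched as a pre-check on a request. The sketch below is only one possible reading under stated assumptions: the per-user budget, the access-control mapping `acl`, and the names `UsageLimiter` and `precheck` are hypothetical and do not come from the source.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class UsageLimiter:
    """Tracks interaction cost per user against a fixed budget (budget unit is assumed)."""
    limit: int
    used: Dict[str, int] = field(default_factory=dict)

    def try_consume(self, user: str, cost: int) -> bool:
        spent = self.used.get(user, 0)
        if spent + cost > self.limit:
            return False  # would exceed the limit; nothing is charged
        self.used[user] = spent + cost
        return True


def precheck(user: str, resource: str, cost: int,
             acl: Dict[str, Set[str]], limiter: UsageLimiter) -> Optional[str]:
    """Return None if the request may proceed, otherwise the detection result."""
    # User access authority check: the user must be allowed to read the resource.
    if resource not in acl.get(user, set()):
        return "unauthorized access"
    # Usage limit check: the request must fit in the user's remaining budget.
    if not limiter.try_consume(user, cost):
        return "usage limit exceeded"
    return None
```

The access check runs first so that an unauthorized request is never charged against the user's budget.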

Example 1

As shown in fig. 3, the present embodiment provides a method for detecting compliance of large language model data interaction, including a user interaction detection process, where the user interaction detection process includes:

S5, outputting the processing result;

The compliance detection comprises a check for sensitive words that do not comply with laws and regulations and a check for sensitive information, and is used to prevent the spread of sensitive words and the leakage of sensitive information.

Further, as a preferred or more specific implementation of this embodiment, the user interaction detection process further includes:

S2, determining whether any part of the user request content needs to access the knowledge base and relies on it for supplementary information; if so, when the knowledge base returns a supplementary result, performing compliance checking and user access authority checking on the supplementary result and determining whether it contains violation information or unauthorized access information; if it does, filtering out that information, returning the filtered supplementary result and proceeding to S3; otherwise proceeding directly to S3. This avoids the security risk of sensitive information leakage or unauthorized access arising from the supplementary result returned by the knowledge base.
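The filtering in S2 can be sketched as follows. This is a minimal sketch under assumptions: the source does not say how access authority is represented, so the `required_role` label on each passage, the `BLOCKED_TERMS` placeholder list and the name `filter_supplement` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class Passage:
    text: str
    required_role: str  # hypothetical access label stored with each knowledge base passage


BLOCKED_TERMS = ("bannedterm",)  # placeholder sensitive-word list


def filter_supplement(passages: Iterable[Passage], user_roles: Set[str]) -> List[Passage]:
    """Keep only passages the user may see and that pass the compliance check."""
    kept = []
    for passage in passages:
        if passage.required_role not in user_roles:
            continue  # unauthorized access information: filtered out
        if any(term in passage.text.lower() for term in BLOCKED_TERMS):
            continue  # violation information: filtered out
        kept.append(passage)
    return kept
```

Whatever survives the filter is what would be passed on to S3 as supplementary information.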

In S1, user access authority checking and usage limit checking are also performed on the user request content itself, to determine whether the request involves unauthorized access or exceeds the usage limit; if so, the detection result is returned directly and the interaction ends; if not, the next step is performed;

In S2, user access authority checking is also performed on the supplementary result, to determine whether it involves unauthorized access; if so, the unauthorized content is filtered out and the result is returned before S3 is entered; if not, S3 is entered directly;

In S3, interface authentication is performed for the call initiated by the large language model; if authentication passes, the next step is performed; if it fails, the authentication result is returned;

In S4, user access authority checking and content authenticity checking are also performed on the processing result, to determine whether it involves unauthorized access or false content; if so, the processing result is filtered before the next step is performed; if not, the next step is performed directly;

S5 further includes: pairing the user request content with the processing result and/or the supplementary result and storing the pair.
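The pairing and storage described for S5 can be sketched as an append-only audit log. The record layout (JSON with `timestamp`, `request`, `result` and an optional `supplement` field) and the names `make_audit_record` and `AuditLog` are assumptions made for illustration; the source only states that the request is paired with the result and stored. An in-memory list stands in for whatever storage a real system would use.

```python
import json
import time
from typing import List, Optional


def make_audit_record(request: str, result: str, supplement: Optional[str] = None) -> str:
    """Pair a user request with its processing result and, if present, the supplementary result."""
    record = {"timestamp": time.time(), "request": request, "result": result}
    if supplement is not None:
        record["supplement"] = supplement
    return json.dumps(record, ensure_ascii=False)


class AuditLog:
    """Append-only store of serialized records, kept in memory for this sketch."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def store(self, request: str, result: str, supplement: Optional[str] = None) -> None:
        self._lines.append(make_audit_record(request, result, supplement))

    def entries(self) -> List[dict]:
        return [json.loads(line) for line in self._lines]
```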

Further, as a preferred or more specific implementation of this embodiment, the method of this embodiment further includes at least one of the following processes:

Private knowledge base construction or update detection process: when the private knowledge base is constructed or updated, compliance detection and user access authority checking are performed on each incoming document to determine whether it contains violation information or involves unauthorized access. If not, the document, its content and related parameters are stored in the knowledge base. If so, the user is warned that the document carries related risks and the user's authority is confirmed; after the user authorizes, the related processing log is recorded, the document is marked with its content risk level and the user level information, and the document, its content and related parameters are then stored in the knowledge base.
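The knowledge base ingestion flow above can be sketched as follows. This is a hedged sketch: the `KnowledgeBase` container, the `has_risk` and `confirm` callbacks, and the `"flagged"` risk marker are hypothetical names introduced here. The source does not say what happens when the user declines to authorize; the sketch assumes the document is simply not stored.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class KnowledgeBase:
    documents: Dict[str, dict] = field(default_factory=dict)
    processing_log: List[str] = field(default_factory=list)


def ingest(kb: KnowledgeBase, name: str, content: str, user_level: int,
           has_risk: Callable[[str], bool], confirm: Callable[[str], bool]) -> bool:
    """Store a document, asking for user authorization first if it carries a risk."""
    if not has_risk(content):
        kb.documents[name] = {"content": content}
        return True
    # Risky document: prompt the user and require explicit authorization.
    if not confirm(f"document '{name}' carries a compliance or access risk; store anyway?"):
        return False  # assumption: an unauthorized risky document is not stored
    # Record the processing log, then mark risk level and user level before storing.
    kb.processing_log.append(f"risk accepted for '{name}' by level-{user_level} user")
    kb.documents[name] = {"content": content, "risk_level": "flagged", "user_level": user_level}
    return True
```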

External data access interaction detection process: when the large language model accesses an external data source, compliance detection, trusted source detection and content authenticity detection are performed on the external data returned by the interaction, to determine whether the external data contains violation information, comes from an untrusted source or contains false content; if so, the external data is filtered. This ensures that the returned external data contains no sensitive information, comes from a reliable source and is true and reliable.
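The trusted source detection mentioned above is described elsewhere in the text as similar to a whitelist. A minimal sketch of that idea, assuming external data is identified by URL, could look like this; the host name in `TRUSTED_HOSTS`, the HTTPS requirement and the function names are assumptions added for illustration.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlparse

TRUSTED_HOSTS = {"data.example.com"}  # hypothetical whitelist of trusted sources


def is_trusted_source(url: str) -> bool:
    """Accept only HTTPS URLs whose exact host name is on the whitelist."""
    parts = urlparse(url)
    return parts.scheme == "https" and parts.hostname in TRUSTED_HOSTS


def filter_external(records: Iterable[Tuple[str, str]]) -> List[str]:
    """Given (source_url, text) pairs, keep only text that came from a trusted source."""
    return [text for url, text in records if is_trusted_source(url)]
```

Comparing the parsed host name exactly, rather than testing whether the URL string contains a trusted name, avoids accepting look-alike hosts such as `data.example.com.evil.net`.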

Function module call interaction detection process: when the large language model calls a function module, compliance detection, trusted source detection and content authenticity detection are performed on the function data returned by the function module, to determine whether the function data contains violation information, comes from an untrusted source or contains false content; if so, the function data is filtered. This ensures that the returned function data contains no sensitive information, comes from a reliable source and is true and reliable.
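The checks that this embodiment assigns to each interaction stage can be summarized as a lookup table. The sketch below is only a summary aid: the stage keys, the `Check` enum and the helper `missing_checks` are names invented here, while the assignment of checks to stages follows the steps and processes described in this embodiment. The audit service is omitted because it archives records rather than checking content.

```python
from enum import Enum, auto
from typing import Dict, FrozenSet, Set


class Check(Enum):
    SENSITIVE_WORD = auto()
    SENSITIVE_INFO = auto()
    ACCESS_AUTHORITY = auto()
    USAGE_LIMIT = auto()
    INTERFACE_AUTH = auto()
    TRUSTED_SOURCE = auto()
    AUTHENTICITY = auto()


# Compliance detection = sensitive word check + sensitive information check.
COMPLIANCE: FrozenSet[Check] = frozenset({Check.SENSITIVE_WORD, Check.SENSITIVE_INFO})

STAGE_CHECKS: Dict[str, FrozenSet[Check]] = {
    "S1_user_request": COMPLIANCE | {Check.ACCESS_AUTHORITY, Check.USAGE_LIMIT},
    "S2_supplementary_result": COMPLIANCE | {Check.ACCESS_AUTHORITY},
    "S3_model_call": frozenset({Check.INTERFACE_AUTH}),
    "S4_model_result": COMPLIANCE | {Check.ACCESS_AUTHORITY, Check.AUTHENTICITY},
    "kb_document": COMPLIANCE | {Check.ACCESS_AUTHORITY},
    "external_data": COMPLIANCE | {Check.TRUSTED_SOURCE, Check.AUTHENTICITY},
    "function_data": COMPLIANCE | {Check.TRUSTED_SOURCE, Check.AUTHENTICITY},
}


def missing_checks(stage: str, performed: Set[Check]) -> FrozenSet[Check]:
    """Return the checks required at a stage that have not yet been performed."""
    return STAGE_CHECKS[stage] - performed
```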

Based on the same inventive concept, the application also provides a device corresponding to the method of the first embodiment; see Example 2 for details.

Example 2

As shown in FIG. 4, this embodiment provides a compliance detection device for large language model data interaction. The device comprises a user interaction detection module, wherein the user interaction detection module is configured to perform the following steps:

S5, outputting the processing result;

Further, as a preferred or more specific implementation of this embodiment, the user interaction detection module is further configured to perform the following step after performing step S1:

S2, determining whether any part of the user request content needs to access the knowledge base and relies on it for supplementary information; if so, when the knowledge base returns a supplementary result, performing compliance checking and user access authority checking on the supplementary result and determining whether it contains violation information or unauthorized access information; if it does, filtering out that information, returning the filtered supplementary result and proceeding to S3; otherwise proceeding directly to S3.

Further, as a preferred or more specific implementation of this embodiment, the device of this embodiment further includes at least one of the following modules:

Private knowledge base construction or update detection module: configured to perform compliance detection and user access authority checking on each incoming document when the private knowledge base is constructed or updated, to determine whether the document contains violation information or involves unauthorized access. If not, the document, its content and related parameters are stored in the knowledge base. If so, the user is warned that the document carries related risks and the user's authority is confirmed; after the user authorizes, the related processing log is recorded, the document is marked with its content risk level and the user level information, and the document, its content and related parameters are then stored in the knowledge base.

External data access interaction detection module: configured to perform compliance detection, trusted source detection and content authenticity detection on the external data returned by the interaction when the large language model accesses an external data source, to determine whether the external data contains violation information, comes from an untrusted source or contains false content; if so, the external data is filtered. This ensures that the returned external data contains no sensitive information, comes from a reliable source and is true and reliable.

Function module call interaction detection module: configured to perform compliance detection, trusted source detection and content authenticity detection on the function data returned by a function module when the large language model calls that function module, to determine whether the function data contains violation information, comes from an untrusted source or contains false content; if so, the function data is filtered. This ensures that the returned function data contains no sensitive information, comes from a reliable source and is true and reliable.

Since the device described in the second embodiment of the present invention is a device for implementing the method of the first embodiment, a person skilled in the art can understand the specific structure of the device and its variations from the method described in the first embodiment, so a detailed description is omitted here. All devices used to implement the method of the first embodiment of the present invention fall within the scope of protection of the present invention.

Based on the same inventive concept, the application provides an electronic device embodiment corresponding to the first embodiment; see Example 3 for details.

Example 3

The present embodiment provides an electronic device, as shown in fig. 5, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where any implementation of the first embodiment may be implemented when the processor executes the computer program.

Since the electronic device described in this embodiment is the device used to implement the method of the first embodiment of the present application, those skilled in the art can understand the specific implementation of the electronic device and its various modifications from the method described in the first embodiment, so how the electronic device implements that method is not described in detail here. Any device used to implement the method of the embodiments of the present application falls within the scope of protection sought by the present application.

Based on the same inventive concept, the application provides a storage medium corresponding to the first embodiment; see Example 4 for details.

Example 4

The present embodiment provides a computer readable storage medium, as shown in fig. 6, on which a computer program is stored, which when executed by a processor, can implement any implementation of the first embodiment.

The technical scheme provided by the embodiment of the application has at least the following technical effects or advantages: the embodiment of the application provides a method, a device, a system, equipment and a medium,

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that the specific embodiments described are illustrative only and not intended to limit the scope of the invention, and that equivalent modifications and variations of the invention in light of the spirit of the invention will be covered by the claims of the present invention.

Claims

1. A compliance detection method for large language model data interaction, characterized in that the method comprises a user interaction detection process, wherein the user interaction detection process comprises the following steps:

S4, performing compliance detection on the processing result and determining whether it contains violation information; if so, filtering out the violation information and then proceeding to the next step; if not, proceeding to the next step directly;

S5, outputting the processing result;

2. The compliance detection method for large language model data interaction according to claim 1, characterized in that the user interaction detection process further comprises:

3. The compliance detection method for large language model data interaction according to claim 2, characterized in that:

in S1, access authority checking and usage limit checking are also performed on the user request content itself, to determine whether the request involves unauthorized access or exceeds the usage limit; if so, the detection result is returned directly and the interaction ends; if not, the next step is performed;

4. The compliance detection method for large language model data interaction according to claim 1, characterized in that the method further comprises at least one of the following processes:

a private knowledge base construction or update detection process: when a private knowledge base is constructed or updated, performing compliance detection and user access authority checking on each incoming document and determining whether the document contains violation information or involves unauthorized access; if not, storing the document, its content and related parameters in the knowledge base; if so, warning that the document carries related risks and confirming the user's authority, and, after the user authorizes, recording the related processing log, marking the document with its content risk level and the user level information, and storing the document, its content and related parameters in the knowledge base;

a function module call interaction detection process: when the large language model calls a function module, performing compliance detection, trusted source detection and content authenticity detection on the function data returned by the function module, determining whether the function data contains violation information, comes from an untrusted source or contains false content, and if so, filtering the function data.

5. A compliance detection device for large language model data interaction, characterized in that the device comprises a user interaction detection module, wherein the user interaction detection module is configured to perform the following steps:

S5, outputting the processing result;

6. The compliance detection device for large language model data interaction according to claim 5, characterized in that the user interaction detection module is further configured to perform the following step after performing step S1:

7. The compliance detection device for large language model data interaction according to claim 6, characterized in that the user interaction detection module is further configured to:

8. The compliance detection device for large language model data interaction according to claim 5, characterized in that the device further comprises at least one of the following modules:

a private knowledge base construction or update detection module: configured to, when a private knowledge base is constructed or updated, perform compliance detection and user access authority checking on each incoming document and determine whether the document contains violation information or involves unauthorized access; if not, store the document, its content and related parameters in the knowledge base; if so, warn that the document carries related risks and confirm the user's authority, and, after the user authorizes, record the related processing log, mark the document with its content risk level and the user level information, and store the document, its content and related parameters in the knowledge base;

an external data access interaction detection module: configured to, when the large language model accesses an external data source, perform compliance detection, trusted source detection and content authenticity detection on the external data returned by the interaction, determine whether the external data contains violation information, comes from an untrusted source or contains false content, and if so, filter the external data;

a function module call interaction detection module: configured to, when the large language model calls a function module, perform compliance detection, trusted source detection and content authenticity detection on the function data returned by the function module, determine whether the function data contains violation information, comes from an untrusted source or contains false content, and if so, filter the function data.

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any one of claims 1 to 4 when executing the program.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any one of claims 1 to 4.