Summary of the Invention
Technical problem
The technical problem to be solved by the invention is to provide a compliance detection method, apparatus, device, and medium for large language model data interaction that perform compliance detection on data at least in the user request interaction stage, thereby enabling standardized use of the large language model.
Technical solution
In a first aspect, the present invention provides a compliance detection method for large language model data interaction, including a user interaction detection process, where the user interaction detection process includes:
S1, performing compliance detection on the user request content and judging whether illegal information exists; if yes, directly returning the detection result and ending the interaction; if not, proceeding to the next step;
S3, inputting the data stream related to the user request into a large language model for processing, and obtaining the processing result returned by the large language model;
S4, performing compliance detection on the processing result and judging whether illegal information exists; if yes, filtering out the illegal information and then proceeding to the next step; if not, proceeding directly to the next step;
S5, outputting the processing result;
wherein the compliance detection comprises a check for sensitive words that do not comply with laws and regulations and a check for sensitive information.
In a second aspect, the present invention provides a compliance detection apparatus for large language model data interaction, including a user interaction detection module, where the user interaction detection module is configured to perform the following steps:
S1, performing compliance detection on the user request content and judging whether illegal information exists; if yes, directly returning the detection result and ending the interaction; if not, proceeding to the next step;
S3, inputting the data stream related to the user request into a large language model for processing, and obtaining the processing result returned by the large language model;
S4, performing compliance detection on the processing result and judging whether illegal information exists; if yes, filtering out the illegal information and then proceeding to the next step; if not, proceeding directly to the next step;
S5, outputting the processing result;
wherein the compliance detection comprises a check for sensitive words that do not comply with laws and regulations and a check for sensitive information.
In a third aspect, the invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of the first aspect when executing the program.
In a fourth aspect, the present invention provides a computer-readable storage medium having a computer program stored thereon, where the program, when executed by a processor, implements the method of the first aspect.
Advantageous effects
One or more technical solutions provided in the embodiments of the present invention at least have the following technical effects or advantages:
The embodiments of the invention perform compliance detection on the data interaction process (comprising an internal interaction process and an external interaction process) of an existing large language model, so that the use of the large language model complies with relevant laws and regulations; this fills the gap left by current large language models, which have no such compliance detection. In addition to compliance detection against relevant laws and regulations, the embodiments can also perform related security detection, such as user access authority detection, usage limit detection, or content authenticity detection, thereby ensuring that the large language model is used securely. The embodiments also pair the user request with the request processing result and/or the supplementary result and store the pair, so that the interaction history is retained for future security compliance audits.
The above description is merely an overview of the technical solutions of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the contents of the specification, and so that the above and other objects, features, and advantages of the present invention are more readily understood, specific embodiments of the invention are described below.
Embodiments of the invention
The embodiments of the present application provide a compliance detection method, apparatus, device, and medium for large language model data interaction, which perform compliance detection on data at least in the user request interaction stage, thereby enabling standardized use of the large language model.
The overall idea of the technical solution in the embodiments of the present application is as follows: compliance detection is performed on the data interaction process (comprising an internal interaction process and an external interaction process) of an existing large language model, so that the use of the large language model complies with relevant laws and regulations; this fills the gap left by current large language models, which have no such compliance detection.
In addition to compliance detection against relevant laws and regulations, the embodiments of the invention can also perform related security detection, such as user access authority detection, usage limit detection, or content authenticity detection, thereby ensuring that the large language model is used securely. The embodiments also pair the user request with the request processing result and/or the supplementary result and store the pair, so that the interaction history is retained for future security compliance audits.
Before describing specific embodiments, the system framework corresponding to the method of the embodiments of the present application is described. As shown in fig. 1 and fig. 2, the system is roughly divided into three parts:
The large language model: a machine learning technique that learns the probability distribution of natural language data and uses that distribution to accomplish language-related tasks such as text classification, natural language understanding, and machine translation. Its interaction flow is to receive a user request, process the request content, and return the processing result.
The data storage module: a large language model product that has a private knowledge base also requires the private knowledge base to be constructed; the private knowledge base can supplement the user's request with information.
The security compliance detection module: performs compliance detection (mainly the check for sensitive words that do not comply with laws and regulations and the check for sensitive information) and security detection (mainly user access authority detection, usage limit detection, false content detection, and the like) on the large language model data interaction process, so that the use of the large language model complies with relevant laws and regulations and remains secure. It specifically includes the following:
sensitive word check: checking for sensitive words that do not comply with laws and regulations;
sensitive information check: checking for content involving enterprise confidential information, personal privacy information, and the like;
user access authority check: managing file permissions;
usage limit check: controlling interaction cost, which serves the purpose of financial audit;
interface authentication: preventing attacks or malicious use when an external interface is called;
trusted source check: when external data is accessed, confirming that the external data comes from a trusted source (similar to a whitelist), so that the returned external data is not polluted;
authenticity check: checking whether the returned content is real rather than fabricated; in interactions that generate creative content, the weight of the authenticity check is reduced so that the creative content can be generated smoothly;
audit service: archiving history logs for future security audits and problem tracing.
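The first two checks in the list above can be sketched as small functions that share one interface, so that further checks could be appended to the same list. This is only an illustrative sketch: the names `Finding`, `SENSITIVE_WORDS`, `EMAIL_PATTERN`, and `run_checks` are hypothetical, and the word list and the e-mail pattern are placeholders, since the specification does not prescribe how sensitive words or sensitive information are recognized.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical placeholders: a real deployment would load a maintained
# word list and richer patterns for confidential or personal data.
SENSITIVE_WORDS = {"forbiddenword"}
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass
class Finding:
    check: str   # name of the check that fired
    detail: str  # the fragment that triggered it


def sensitive_word_check(text: str) -> List[Finding]:
    """Report every listed word that occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return [Finding("sensitive_word", w) for w in sorted(SENSITIVE_WORDS) if w in lowered]


def sensitive_info_check(text: str) -> List[Finding]:
    """Report e-mail addresses as a stand-in for personal privacy information."""
    return [Finding("sensitive_info", m) for m in EMAIL_PATTERN.findall(text)]


# Checks share one signature, so others (usage limit, authenticity, ...)
# could be added to this list without changing run_checks.
CHECKS: List[Callable[[str], List[Finding]]] = [sensitive_word_check, sensitive_info_check]


def run_checks(text: str) -> List[Finding]:
    """Run every registered check and collect all findings in order."""
    findings: List[Finding] = []
    for check in CHECKS:
        findings.extend(check(text))
    return findings
```

An empty result from `run_checks` means the text passed both checks; any finding can be used by the caller either to reject the request or to filter the result.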
Example 1
As shown in fig. 3, the present embodiment provides a compliance detection method for large language model data interaction, including a user interaction detection process, where the user interaction detection process includes:
S1, performing compliance detection on the user request content and judging whether illegal information exists; if yes, directly returning the detection result and ending the interaction; if not, proceeding to the next step;
S3, inputting the data stream related to the user request into a large language model for processing, and obtaining the processing result returned by the large language model;
S4, performing compliance detection on the processing result and judging whether illegal information exists; if yes, filtering out the illegal information and then proceeding to the next step; if not, proceeding directly to the next step;
S5, outputting the processing result;
The compliance detection comprises a check for sensitive words that do not comply with laws and regulations and a check for sensitive information, and is used to prevent the spread of sensitive words and the leakage of sensitive information.
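The control flow of steps S1, S3, S4, and S5 can be sketched as a single function. The sketch below makes several assumptions that are not in the specification: the large language model is passed in as a plain callable, compliance detection is reduced to a hypothetical `BLOCKED_TERMS` list, and filtering is done by replacing matched terms with a `[filtered]` marker.

```python
import re
from typing import Callable, Dict, List

BLOCKED_TERMS = ("forbiddenword",)  # hypothetical stand-in for the compliance rules


def find_violations(text: str) -> List[str]:
    """Return the blocked terms that occur in the text (case-insensitive)."""
    lowered = text.lower()
    return [term for term in BLOCKED_TERMS if term in lowered]


def redact(text: str) -> str:
    """Replace every blocked term with a marker, ignoring case."""
    for term in BLOCKED_TERMS:
        text = re.sub(re.escape(term), "[filtered]", text, flags=re.IGNORECASE)
    return text


def handle_request(request: str, llm: Callable[[str], str]) -> Dict[str, object]:
    # S1: check the request; a violation ends the interaction immediately.
    violations = find_violations(request)
    if violations:
        return {"status": "rejected", "violations": violations, "output": None}
    # S3: pass the request to the model and obtain its result.
    result = llm(request)
    # S4: check the result; violations are filtered out, not rejected.
    if find_violations(result):
        result = redact(result)
    # S5: output the (possibly filtered) result.
    return {"status": "ok", "violations": [], "output": result}
```

Note the asymmetry the steps describe: a violation in the request stops the interaction at S1, whereas a violation in the model output is filtered at S4 and the remaining result is still returned.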
Further, as a preferred or more specific implementation of this embodiment, the user interaction detection process further includes:
S2, judging whether the user request content has a part that needs to access the knowledge base and relies on the knowledge base for supplementary information; if yes, when the knowledge base returns a supplementary result, performing compliance detection and a user access authority check on the supplementary result and judging whether it contains illegal information or unauthorized access information; if it does, filtering out the illegal information or unauthorized access information, returning the filtered result, and proceeding to S3; otherwise, proceeding to S3. This avoids security problems, such as sensitive information leakage or unauthorized access, caused by the supplementary result returned by the knowledge base.
In step S1, a user access authority check and a usage limit check are also performed on the user request content itself to judge whether the request involves unauthorized access or exceeds the usage limit; if yes, the detection result is returned directly and the interaction ends; if not, the next step is performed;
In step S2, a user access authority check is also performed on the supplementary result to judge whether it involves unauthorized access; if yes, the unauthorized access content is filtered out before returning and proceeding to S3; if not, the process proceeds to S3;
In step S3, interface authentication is also performed on calls initiated by the large language model; if the authentication passes, the next step is performed; if it fails, the authentication result is returned;
In step S4, a user access authority check and a content authenticity check are also performed on the processing result to judge whether it involves unauthorized access or contains false content; if yes, the processing result is filtered and then the next step is performed; if not, the next step is performed directly;
Step S5 further includes: pairing the user request content with the processing result and/or the supplementary result, and storing the pair.
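Three of the additions above lend themselves to a short sketch: filtering the knowledge base supplement in S2, the usage limit check in S1, and the pairing and storage in S5. Everything named here is hypothetical (`Passage`, `required_level`, `UsageLimiter`, `AuditLog`, the integer access levels, and the in-memory storage); the specification does not define a permission model, a cost unit, or a storage format. Dropping a whole passage is also just one simple way to filter.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List

BLOCKED_TERMS = ("forbiddenword",)  # hypothetical stand-in for the compliance rules


@dataclass
class Passage:
    text: str
    required_level: int  # assumed integer access level needed to read the passage


def filter_supplement(passages: List[Passage], user_level: int) -> List[Passage]:
    """S2: keep only passages the user may read and that pass the compliance check."""
    kept = []
    for passage in passages:
        if passage.required_level > user_level:
            continue  # unauthorized access information is filtered out
        if any(term in passage.text.lower() for term in BLOCKED_TERMS):
            continue  # illegal information is filtered out
        kept.append(passage)
    return kept


@dataclass
class UsageLimiter:
    """S1 usage limit check: a fixed budget per user, in arbitrary cost units."""
    limit_per_user: int
    used: Dict[str, int] = field(default_factory=dict)

    def try_consume(self, user: str, cost: int) -> bool:
        spent = self.used.get(user, 0)
        if spent + cost > self.limit_per_user:
            return False  # over the limit: the caller should end the interaction
        self.used[user] = spent + cost
        return True


@dataclass
class AuditLog:
    """S5: pair each request with its supplement and result, then store the pair."""
    lines: List[str] = field(default_factory=list)

    def store(self, user: str, request: str, supplement: List[str], result: str) -> None:
        record = {"user": user, "request": request,
                  "supplement": supplement, "result": result}
        self.lines.append(json.dumps(record, ensure_ascii=False))
```

Storing each record as one JSON line keeps the request, supplement, and result together, which is what a later audit needs in order to trace a single interaction.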
Further, as a more preferred or specific implementation of this embodiment, the method of this embodiment further includes at least one of the following processes:
The private knowledge base construction or update detection process: when the private knowledge base is constructed or updated, compliance detection and a user access authority check are performed on the input document to judge whether the document contains illegal information or involves unauthorized access. If not, the document content and related parameters are stored in the knowledge base. If yes, a prompt is given that the document carries related risks and the user's authority is confirmed; after the user authorizes, the related processing log is recorded, the document is marked with the document content risk level and the user level information, and the document, the document content, and related parameters are then stored in the knowledge base.
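The branch structure of this ingestion process can be sketched as follows. The names `KnowledgeBase`, `StoredDocument`, `ingest`, and the `confirm` callback are hypothetical, as are the two risk labels. The specification does not say what happens when the user declines to authorize; the sketch assumes the document is simply not stored.

```python
from dataclasses import dataclass, field
from typing import Callable, List

BLOCKED_TERMS = ("forbiddenword",)  # hypothetical stand-in for the compliance rules


@dataclass
class StoredDocument:
    name: str
    content: str
    risk_level: str   # assumed labels: "none" or "flagged"
    owner_level: int  # assumed integer user level recorded with the document


@dataclass
class KnowledgeBase:
    documents: List[StoredDocument] = field(default_factory=list)
    processing_log: List[str] = field(default_factory=list)


def ingest(kb: KnowledgeBase, name: str, content: str, owner_level: int,
           confirm: Callable[[str], bool]) -> bool:
    """Store a document, asking the user to authorize it when it is flagged."""
    risky = any(term in content.lower() for term in BLOCKED_TERMS)
    if not risky:
        kb.documents.append(StoredDocument(name, content, "none", owner_level))
        return True
    # Flagged document: prompt about the risk and wait for authorization.
    if not confirm(f"Document {name!r} contains flagged content. Store anyway?"):
        return False  # assumption: without authorization nothing is stored
    # Authorized: log the decision, mark the risk level, then store.
    kb.processing_log.append(f"{name}: stored after user authorization")
    kb.documents.append(StoredDocument(name, content, "flagged", owner_level))
    return True
```

Recording the risk level and user level with the stored document is what lets the S2 access authority check decide later which users may receive it as supplementary information.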
External data access interaction detection process: when the large language model accesses an external data source, compliance detection, trusted source detection, and content authenticity detection are performed on the external data returned by the interaction to judge whether the external data contains illegal information, comes from an untrusted source, or contains false content; if yes, the external data is filtered. This ensures that the returned external data contains no sensitive information, comes from a reliable source, and is true and reliable.
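A trusted source check of the whitelist kind can be sketched by comparing the host of the data source URL against an allow list. The host name in `TRUSTED_HOSTS` and the function names are hypothetical, and the content authenticity detection is left out because the specification does not say how it is performed.

```python
from typing import Optional
from urllib.parse import urlsplit

TRUSTED_HOSTS = {"data.example.org"}  # hypothetical whitelist of data sources
BLOCKED_TERMS = ("forbiddenword",)    # hypothetical stand-in for the compliance rules


def is_trusted_source(url: str) -> bool:
    """Accept a URL only if its host name is on the whitelist."""
    host = urlsplit(url).hostname  # None when the string has no network location
    return host is not None and host.lower() in TRUSTED_HOSTS


def screen_external_data(url: str, payload: str) -> Optional[str]:
    """Return the payload if it passes both checks, otherwise None (filtered)."""
    if not is_trusted_source(url):
        return None  # untrusted source
    if any(term in payload.lower() for term in BLOCKED_TERMS):
        return None  # illegal information
    return payload
```

Matching on the parsed host name rather than on a substring of the URL means that a trusted name appearing only in the path of another site does not pass the check.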
Function module call interaction detection process: when the large language model calls a functional module, compliance detection, trusted source detection, and content authenticity detection are performed on the functional data returned by the functional module to judge whether the functional data contains illegal information, comes from an untrusted source, or contains false content; if yes, the functional data is filtered. This ensures that the returned functional data contains no sensitive information, comes from a reliable source, and is true and reliable.
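One way to apply these checks to every function module call is to route the calls through a single gateway. The class below is a hypothetical sketch: treating registration as the trusted source check is an assumption, the `[filtered]` marker is illustrative, and the content authenticity detection is again omitted.

```python
import re
from typing import Callable, Dict

BLOCKED_TERMS = ("forbiddenword",)  # hypothetical stand-in for the compliance rules


class FunctionModuleGateway:
    """Routes model-initiated calls to registered modules and screens what they return."""

    def __init__(self) -> None:
        self._modules: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, module: Callable[[str], str]) -> None:
        """Registration stands in for the trusted source check (an assumption)."""
        self._modules[name] = module

    def call(self, name: str, argument: str) -> str:
        if name not in self._modules:
            raise PermissionError(f"function module {name!r} is not registered as trusted")
        data = self._modules[name](argument)
        # Compliance detection on the returned functional data.
        for term in BLOCKED_TERMS:
            data = re.sub(re.escape(term), "[filtered]", data, flags=re.IGNORECASE)
        return data
```

Because the model never calls a module directly, no returned data can reach it without passing through `call`.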
Based on the same inventive concept, the present application also provides an apparatus corresponding to the method of the first embodiment; see the second embodiment for details.
Example 2
As shown in fig. 4, this embodiment provides a compliance detection apparatus for large language model data interaction, which includes a user interaction detection module, where the user interaction detection module is configured to perform the following steps:
S1, performing compliance detection on the user request content and judging whether illegal information exists; if yes, directly returning the detection result and ending the interaction; if not, proceeding to the next step;
S3, inputting the data stream related to the user request into a large language model for processing, and obtaining the processing result returned by the large language model;
S4, performing compliance detection on the processing result and judging whether illegal information exists; if yes, filtering out the illegal information and then proceeding to the next step; if not, proceeding directly to the next step;
S5, outputting the processing result;
wherein the compliance detection comprises a check for sensitive words that do not comply with laws and regulations and a check for sensitive information.
Further, as a preferred or more specific implementation of this embodiment, the user interaction detection module is further configured to perform the following step after performing step S1:
S2, judging whether the user request content has a part that needs to access the knowledge base and relies on the knowledge base for supplementary information; if yes, when the knowledge base returns a supplementary result, performing compliance detection and a user access authority check on the supplementary result and judging whether it contains illegal information or unauthorized access information; if it does, filtering out the illegal information or unauthorized access information, returning the filtered result, and proceeding to S3; otherwise, proceeding to S3.
In step S1, a user access authority check and a usage limit check are also performed on the user request content itself to judge whether the request involves unauthorized access or exceeds the usage limit; if yes, the detection result is returned directly and the interaction ends; if not, the next step is performed;
In step S2, a user access authority check is also performed on the supplementary result to judge whether it involves unauthorized access; if yes, the unauthorized access content is filtered out before returning and proceeding to S3; if not, the process proceeds to S3;
In step S3, interface authentication is also performed on calls initiated by the large language model; if the authentication passes, the next step is performed; if it fails, the authentication result is returned;
In step S4, a user access authority check and a content authenticity check are also performed on the processing result to judge whether it involves unauthorized access or contains false content; if yes, the processing result is filtered and then the next step is performed; if not, the next step is performed directly;
Step S5 further includes: pairing the user request content with the processing result and/or the supplementary result, and storing the pair.
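The specification does not say how the interface authentication in step S3 is carried out. One common approach, shown here purely as an assumption, is to sign each call with an HMAC over the request body using a shared secret and to verify the signature before the call is handled. The names `SHARED_SECRET`, `sign_call`, `authenticate_call`, and `guarded_call` are hypothetical.

```python
import hashlib
import hmac
from typing import Callable, Dict

SHARED_SECRET = b"demo-secret"  # hypothetical; a real secret would not live in code


def sign_call(body: bytes, secret: bytes = SHARED_SECRET) -> str:
    """Compute the hex HMAC-SHA256 signature of a call body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def authenticate_call(body: bytes, signature: str, secret: bytes = SHARED_SECRET) -> bool:
    """Verify a signature using a constant-time comparison."""
    expected = sign_call(body, secret)
    return hmac.compare_digest(expected.encode(), signature.encode())


def guarded_call(body: bytes, signature: str,
                 handler: Callable[[bytes], bytes]) -> Dict[str, object]:
    # Authentication fails: return the authentication result, as step S3 describes.
    if not authenticate_call(body, signature):
        return {"authenticated": False, "result": None}
    # Authentication passes: carry on with the call.
    return {"authenticated": True, "result": handler(body)}
```

`hmac.compare_digest` is used instead of `==` so that the time taken by the comparison does not reveal how much of a forged signature was correct.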
Further, as a more preferred or specific implementation of this embodiment, the apparatus of this embodiment further includes at least one of the following modules:
The private knowledge base construction or update detection module: configured to perform compliance detection and a user access authority check on the input document when the private knowledge base is constructed or updated, and to judge whether the document contains illegal information or involves unauthorized access. If not, the document, the document content, and related parameters are stored in the knowledge base. If yes, a prompt is given that the document carries related risks and the user's authority is confirmed; after the user authorizes, the related processing log is recorded, the document is marked with the document content risk level and the user level information, and the document, the document content, and related parameters are stored in the knowledge base.
External data access interaction detection module: configured to, when the large language model accesses an external data source, perform compliance detection, trusted source detection, and content authenticity detection on the external data returned by the interaction, and to judge whether the external data contains illegal information, comes from an untrusted source, or contains false content; if yes, the external data is filtered. This ensures that the returned external data contains no sensitive information, comes from a reliable source, and is true and reliable.
Function module call interaction detection module: configured to, when the large language model calls a functional module, perform compliance detection, trusted source detection, and content authenticity detection on the functional data returned by the functional module, and to judge whether the functional data contains illegal information, comes from an untrusted source, or contains false content; if yes, the functional data is filtered. This ensures that the returned functional data contains no sensitive information, comes from a reliable source, and is true and reliable.
Since the apparatus described in the second embodiment of the present invention is the apparatus used to implement the method of the first embodiment, a person skilled in the art can, based on the method described in the first embodiment, understand the specific structure and variations of the apparatus, so a detailed description is omitted here. All apparatuses used to implement the method of the first embodiment of the present invention fall within the scope of protection of the present invention.
Based on the same inventive concept, the present application provides an electronic device embodiment corresponding to the first embodiment; see the third embodiment for details.
Example 3
The present embodiment provides an electronic device, as shown in fig. 5, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where any implementation of the first embodiment may be implemented when the processor executes the computer program.
Since the electronic device described in this embodiment is the device used to implement the method of the first embodiment of the present application, a person skilled in the art can, based on the method described in the first embodiment, understand the specific implementation of the electronic device and its various modifications, so how the electronic device implements the method of the first embodiment is not described in detail here. Any device used to implement the method of the embodiments of the present application falls within the intended scope of protection of the present application.
Based on the same inventive concept, the present application provides a storage medium corresponding to the first embodiment; see the fourth embodiment for details.
Example 4
The present embodiment provides a computer readable storage medium, as shown in fig. 6, on which a computer program is stored, which when executed by a processor, can implement any implementation of the first embodiment.
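The technical solutions provided by the embodiments of the present application have at least the following technical effects or advantages: the embodiments of the present application provide a method, apparatus, system, device, and medium that perform compliance detection on data at least in the user request interaction stage, thereby enabling standardized use of the large language model.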
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that the specific embodiments described are illustrative only and not intended to limit the scope of the invention, and that equivalent modifications and variations of the invention in light of the spirit of the invention will be covered by the claims of the present invention.