CN116028881B

CN116028881B - Method and system for predicting recurrence of restrictive behavior based on multidimensional database

Info

Publication number: CN116028881B
Application number: CN202310247910.9A
Authority: CN
Inventors: 汪天翔; 张雨侠; 刘庄
Original assignee: Chinese University of Hong Kong Shenzhen
Current assignee: Chinese University of Hong Kong Shenzhen
Priority date: 2023-03-15
Filing date: 2023-03-15
Publication date: 2023-06-02
Anticipated expiration: 2043-03-15
Also published as: CN116028881A

Abstract

The invention discloses a method, system, computer device, and storage medium for predicting the recurrence of restricted behavior based on a multidimensional database. The method includes: constructing a multidimensional heterogeneous database; generating a sample data set that includes a training data set and a test data set; dividing positive and negative cases, as positive and negative case labels, according to data on whether restricted-activity persons exhibit restricted behavior again within a preset period; constructing an original short-to-medium-term restricted-behavior recurrence risk prediction model; selecting, through an associated query, a group of feature vectors of preset dimensions from the training data set as input data and feeding it to the original model for prediction processing, so as to output a predicted probability of recurrence; and calculating a loss function value and iteratively training the original model according to that value until a preset condition is met, thereby generating the actual short-to-medium-term restricted-behavior recurrence risk prediction model. The method can effectively improve prediction accuracy and the overall efficiency of personnel management in restricted-activity places.

Description

Method and system for predicting recurrence of restrictive behavior based on multidimensional database

Technical field

The present invention relates to the technical field of deep neural networks, and in particular to a method, system, computer device, and storage medium for predicting the recurrence of restricted behavior based on a multidimensional database.

Background

The management and rehabilitation of restricted-activity persons is an important issue both in academia and at the level of national administration. It is particularly important to detect and control the risk of recurring restricted behavior while respecting humanitarian principles. Social networks formed inside restricted-activity places are a significant obstacle to the rehabilitation of the persons held there, and an important risk variable for cross-influence among them. At present, however, the traditional approach relies on supervisory staff to detect anomalies among such persons through video, conversations, and manual inspection before taking precautions. This is inefficient and labor-intensive, cannot make full use of information about high-risk social networks inside restricted-activity places, and, because the probability of recurrence is estimated manually, suffers from low accuracy and a lack of scientific rigor.

Summary of the invention

In view of the above technical problems, it is necessary to provide a method, system, computer device, and storage medium for predicting the recurrence of restricted behavior based on a multidimensional database, so as to solve at least one of the problems in the prior art described above.

In a first aspect, a method for predicting the recurrence of restricted behavior based on a multidimensional database is provided, including:

constructing a multidimensional heterogeneous database, the multidimensional heterogeneous database including previous legal document data, data related to the previous entry into a restricted-activity place, monitoring data from the previous stay in the restricted-activity place, and legal document data on the current restricted behavior;

performing data mining and data cleaning on the previous legal document data, the data related to the previous entry into the restricted-activity place, the monitoring data from the previous stay in the restricted-activity place, and the legal document data on the current restricted behavior, so as to generate a sample data set, the sample data set including a training data set and a test data set;

dividing positive and negative cases, as positive and negative case labels, according to data on whether restricted-activity persons exhibit restricted behavior again within a preset period;

constructing an original short-to-medium-term restricted-behavior recurrence risk prediction model;

selecting, through an associated query, a group of feature vectors of preset dimensions from the training data set as input data, and inputting it into the original short-to-medium-term restricted-behavior recurrence risk prediction model for prediction processing, so as to output a predicted probability of recurrence of restricted behavior;

calculating a loss function value according to the predicted probability of recurrence and the positive and negative case labels, and iteratively training the original short-to-medium-term restricted-behavior recurrence risk prediction model according to the loss function value until a preset condition is met, thereby generating an actual short-to-medium-term restricted-behavior recurrence risk prediction model, so that the actual model can be used to predict the risk that a restricted-activity person will exhibit restricted behavior again within a preset time range.

In one embodiment, after the multidimensional heterogeneous database is constructed, the method includes:

obtaining intimacy parameters between different restricted-activity persons by performing video image analysis on the monitoring data from the previous stay in the restricted-activity place;

calculating, according to the intimacy parameters, the intimacy with other persons who jointly take part in restricted behavior, using it as data to be predicted, and classifying it into the sample data set.

In one embodiment, dividing the corresponding restricted-activity persons in the sample data set into positive and negative cases according to data on whether they exhibit restricted behavior again within the preset period includes:

when a restricted-activity person exhibits restricted behavior again within the preset period, classifying the corresponding person in the sample data set as a negative case;

when a restricted-activity person does not exhibit restricted behavior again within the preset period, classifying the corresponding person in the sample data set as a positive case.
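The labeling rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `label_person`, the idea of measuring the preset period from a reference `start_date`, and the 365-day default are all hypothetical and not taken from the source, which only says that the period is preset.

```python
from datetime import date, timedelta
from typing import Optional


def label_person(start_date: date,
                 recurrence_date: Optional[date],
                 window_days: int = 365) -> str:
    """Label one person following the convention stated in the text.

    A recurrence inside the preset period gives a "negative" case;
    no recurrence inside the period gives a "positive" case.
    start_date and window_days are illustrative assumptions.
    """
    if recurrence_date is not None:
        deadline = start_date + timedelta(days=window_days)
        if start_date <= recurrence_date <= deadline:
            return "negative"
    return "positive"


# Recurrence five months after the reference date falls inside the window.
example_label = label_person(date(2022, 1, 1), date(2022, 6, 1))
```

A recurrence recorded after the window closes is treated the same as no recurrence, which matches the wording "within the preset period".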

In one embodiment, calculating the loss function value according to the predicted probability of recurrence and the positive and negative case labels includes:

calculating the loss function value from the predicted probability of recurrence and the positive and negative case labels by means of a preset loss function and a backpropagation algorithm.
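The source does not name the preset loss function or the stopping condition. As a minimal sketch, assuming binary cross-entropy as the loss and "loss change below a tolerance, or a maximum number of epochs" as the preset condition, the iterative training could look as follows. A single logistic unit stands in for the deep network so that the gradient step stays readable; the numeric targets (1.0 for a recorded recurrence, 0.0 otherwise) and all names are assumptions made for illustration.

```python
import math


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce_loss(probs, targets, eps=1e-12):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(probs)


def train(features, targets, lr=0.5, max_epochs=2000, tol=1e-6):
    """Gradient descent on a single logistic unit (a stand-in for the network).

    Stops when the loss changes by less than tol or after max_epochs;
    both are assumed forms of the "preset condition".
    """
    n, dim = len(features), len(features[0])
    weights, bias = [0.0] * dim, 0.0
    prev_loss = float("inf")
    for _ in range(max_epochs):
        probs = [sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
                 for row in features]
        loss = bce_loss(probs, targets)
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
        # For sigmoid + cross-entropy, the gradient w.r.t. the logit is (p - y).
        grad_w, grad_b = [0.0] * dim, 0.0
        for row, p, y in zip(features, probs, targets):
            err = p - y
            for j, x in enumerate(row):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias, loss


# Toy data: one feature, larger values associated with target 1.0.
weights, bias, final_loss = train([[0.0], [1.0], [2.0], [3.0]],
                                  [0.0, 0.0, 1.0, 1.0])
```

In a multi-layer network the same error term is propagated backward through each hidden layer, which is the role the text assigns to backpropagation.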

In one embodiment, after the actual short-to-medium-term restricted-behavior recurrence risk prediction model is generated, the method includes:

obtaining, for a restricted-activity person to be predicted, legal document data on the current restricted behavior, data related to entering the restricted-activity place because of the restricted behavior, and daily-life data inside the restricted-activity place;

performing data mining and data cleaning on the legal document data on the current restricted behavior, the data related to entering the restricted-activity place because of the restricted behavior, and the daily-life data inside the restricted-activity place, so as to generate feature data to be predicted;

inputting the feature data to be predicted into the actual short-to-medium-term restricted-behavior recurrence risk prediction model, so as to output the predicted probability of recurrence for the restricted-activity person to be predicted.

In one embodiment, the actual short-to-medium-term restricted-behavior recurrence risk prediction model includes an input layer, a plurality of hidden layers, and an output layer, and inputting the feature data to be predicted into the actual model so as to output the predicted probability of recurrence for the restricted-activity person to be predicted includes:

receiving the feature data to be predicted through the input layer;

inputting the feature data received by the input layer into the plurality of hidden layers in sequence for processing;

outputting, through the output layer, the predicted probability of recurrence for the restricted-activity person to be predicted, obtained after processing by the last hidden layer.
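The layer-by-layer flow described above can be illustrated with a small forward pass. This is a sketch under assumptions: the source does not specify layer widths, activation functions, or initialization, so the ReLU hidden activations, the sigmoid output, the layer sizes, and the random untrained weights below are all hypothetical.

```python
import math
import random


def sigmoid(z: float) -> float:
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random (untrained) parameters, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, hidden_layers, output_layer):
    activations = list(features)                 # input layer receives the features
    for weights, biases in hidden_layers:        # hidden layers applied in sequence
        activations = [max(0.0, v) for v in dense(activations, weights, biases)]
    out_weights, out_biases = output_layer       # output layer: one unit
    logit = dense(activations, out_weights, out_biases)[0]
    return sigmoid(logit)                        # probability in (0, 1)


rng = random.Random(0)
hidden = [init_layer(4, 8, rng), init_layer(8, 8, rng)]
output = init_layer(8, 1, rng)
probability = forward([0.2, 1.0, 0.0, 3.0], hidden, output)
```

With trained parameters in place of the random ones, `probability` would play the role of the predicted probability of recurrence.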

In one embodiment, after the predicted probability of recurrence for the restricted-activity person to be predicted is output by the actual short-to-medium-term restricted-behavior recurrence risk prediction model, the method includes:

when the predicted probability of recurrence for the restricted-activity person to be predicted is greater than a preset threshold, displaying that person's recurrence risk visually and sending a key observation list to the management platform of the restricted-activity place.
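The thresholding step can be sketched as follows. The function name `build_watchlist`, the person identifiers, and the threshold value of 0.7 are hypothetical; the source only states that the threshold is preset. Visualization and delivery to the management platform are outside this sketch.

```python
def build_watchlist(predictions, threshold=0.7):
    """Return (person_id, probability) pairs strictly above the threshold,
    highest risk first. The threshold value is an illustrative assumption."""
    flagged = [(pid, p) for pid, p in predictions.items() if p > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


predictions = {"A": 0.91, "B": 0.35, "C": 0.78}
watchlist = build_watchlist(predictions, threshold=0.7)
```

The comparison is strict, following the wording "greater than the preset threshold".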

In a second aspect, a system for predicting the recurrence of restricted behavior based on a multidimensional database is provided, including:

a database construction unit, configured to construct a multidimensional heterogeneous database, the multidimensional heterogeneous database including previous legal document data, data related to the previous entry into a restricted-activity place, monitoring data from the previous stay in the restricted-activity place, and legal document data on the current restricted behavior;

a data set generation unit, configured to perform data mining and data cleaning on the previous legal document data, the data related to the previous entry into the restricted-activity place, the monitoring data from the previous stay in the restricted-activity place, and the legal document data on the current restricted behavior, so as to generate a sample data set, the sample data set including a training data set and a test data set;

a label generation unit, configured to divide positive and negative cases, as positive and negative case labels, according to data on whether restricted-activity persons exhibit restricted behavior again within a preset period;

a model construction unit, configured to construct an original short-to-medium-term restricted-behavior recurrence risk prediction model;

a training unit, configured to select, through an associated query, a group of feature vectors of preset dimensions from the training data set as input data, and to input it into the original short-to-medium-term restricted-behavior recurrence risk prediction model for prediction processing, so as to output a predicted probability of recurrence of restricted behavior;

a prediction unit, configured to calculate a loss function value according to the predicted probability of recurrence and the positive and negative case labels, and to iteratively train the original short-to-medium-term restricted-behavior recurrence risk prediction model according to the loss function value until a preset condition is met, thereby generating an actual short-to-medium-term restricted-behavior recurrence risk prediction model, so that the actual model can be used to predict the risk that a restricted-activity person will exhibit restricted behavior again within a preset time range.

In a third aspect, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the steps of the above method for predicting the recurrence of restricted behavior based on a multidimensional database.

In a fourth aspect, a readable storage medium is provided, the readable storage medium storing computer-readable instructions which, when executed by a processor, implement the steps of the above method for predicting the recurrence of restricted behavior based on a multidimensional database.

In the above method, system, computer device, and storage medium for predicting the recurrence of restricted behavior based on a multidimensional database, the method includes: constructing a multidimensional heterogeneous database that includes previous legal document data, data related to the previous entry into a restricted-activity place, monitoring data from the previous stay in the restricted-activity place, and legal document data on the current restricted behavior; performing data mining and data cleaning on these four kinds of data to generate a sample data set that includes a training data set and a test data set; dividing positive and negative cases, as positive and negative case labels, according to data on whether restricted-activity persons exhibit restricted behavior again within a preset period; constructing an original short-to-medium-term restricted-behavior recurrence risk prediction model; selecting, through an associated query, a group of feature vectors of preset dimensions from the training data set as input data and inputting it into the original model for prediction processing, so as to output a predicted probability of recurrence; and calculating a loss function value according to the predicted probability and the positive and negative case labels, and iteratively training the original model according to the loss function value until a preset condition is met, thereby generating an actual short-to-medium-term restricted-behavior recurrence risk prediction model that is used to predict the risk that a restricted-activity person will exhibit restricted behavior again within a preset time range. In this application, publicly available legal documents, the personal and family information of restricted-activity persons, and their social-life data inside restricted-activity places are analyzed and combined with a deep neural network, and persons are divided into positive and negative cases over a specified period, in order to build a short-to-medium-term model for predicting the recurrence of restricted behavior. The model predicts the probability that a restricted-activity person will exhibit restricted behavior again within a preset period and, combined with a visualization platform, can provide the managers of restricted-activity places with a key observation list. This improves the overall efficiency of personnel management in restricted-activity places and effectively improves the accuracy and efficiency of prediction.

Description of drawings

In order to explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an application environment of the method for predicting the recurrence of restricted behavior based on a multidimensional database in an embodiment of the present invention;

FIG. 2 is a schematic flowchart of the method for predicting the recurrence of restricted behavior based on a multidimensional database in an embodiment of the present invention;

FIG. 3 is a schematic diagram of the prediction flow of the short-to-medium-term restricted-behavior recurrence risk prediction model in an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of the short-to-medium-term restricted-behavior recurrence risk prediction model in an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of the method for predicting the recurrence of restricted behavior based on a multidimensional database in an embodiment of the present invention;

FIG. 6 is a schematic diagram of a computer device in an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.

The method for predicting the recurrence of restricted behavior based on a multidimensional database provided in this embodiment can be applied in the application environment shown in FIG. 1. The multidimensional heterogeneous database provides the training sample data and test sample data for the short-to-medium-term restricted-behavior recurrence risk prediction model. After being trained and tested on these data, the model can predict the recurrence risk from a restricted-activity person's legal document data on the current restricted behavior, the information collected when the person entered the restricted-activity place because of the restricted behavior, and the person's daily-life information inside the restricted-activity place. The prediction results are output, for example over a network, to a recurrence risk visualization platform for display, so as to improve the accuracy and efficiency of supervision by the managers of restricted-activity places.

The recurrence risk visualization platform includes, but is not limited to, personal computers, laptops, smartphones, tablets, and portable wearable devices. The short-to-medium-term restricted-behavior recurrence risk prediction model can be deployed on a server or on a local monitoring and management platform; the server can be implemented as a stand-alone server or as a cluster of multiple servers.

In one embodiment, as shown in FIG. 2, a method for predicting the recurrence of restricted behavior based on a multidimensional database is provided, including the following steps:

In step S110, a multidimensional heterogeneous database is constructed, the multidimensional heterogeneous database including previous legal document data, data related to the previous entry into a restricted-activity place, monitoring data from the previous stay in the restricted-activity place, and legal document data on the current restricted behavior;

The original criminal judgment document data, the information collected when restricted-activity persons entered the restricted-activity place, the surveillance video data recorded during their stay in the restricted-activity place, and the document data on the recurrence of restricted behavior can be obtained from public document databases or through cooperation with the management platforms of restricted-activity places, legal document management platforms, and the like. Together these form a multi-source heterogeneous database for the restricted-activity place, which provides the sample data for the subsequent construction of the short-to-medium-term restricted-behavior recurrence risk prediction model.

Here, a restricted-activity person is a person whose activities are restricted because of having committed restricted behavior, and restricted behavior can be understood as behavior that violates the law. A restricted-activity place is a place where persons who have committed restricted behavior are managed collectively and their range of activity is limited.

The previous legal document data, the data related to the previous entry into the restricted-activity place, and the monitoring data from the previous stay in the restricted-activity place may specifically be the legal document data generated the last time the person committed restricted behavior, the information collected when the person entered the restricted-activity place because of that behavior, and the monitoring data recorded inside the restricted-activity place.
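To make the four data sources and the associated query concrete, the sketch below stores them as tables in an in-memory SQLite database and joins them on a person identifier to obtain one feature vector. The table names, column names, and sample values are all hypothetical; the source does not describe a schema or a storage engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prior_legal_docs (      -- previous legal document data
    person_id TEXT PRIMARY KEY, multi_person INTEGER, emerging_type INTEGER);
CREATE TABLE intake_records (        -- data collected on previous entry
    person_id TEXT PRIMARY KEY, education_level INTEGER);
CREATE TABLE monitoring_stats (      -- data derived from previous monitoring
    person_id TEXT PRIMARY KEY, conflict_count INTEGER, commendations INTEGER);
CREATE TABLE current_legal_docs (    -- current documents; would feed the labels
    person_id TEXT PRIMARY KEY, multi_person INTEGER);
""")
conn.execute("INSERT INTO prior_legal_docs VALUES ('P001', 1, 0)")
conn.execute("INSERT INTO intake_records VALUES ('P001', 2)")
conn.execute("INSERT INTO monitoring_stats VALUES ('P001', 3, 1)")


def feature_vector(connection, person_id):
    """Join the per-source tables on person_id and return one feature vector,
    or None when the person is missing from any joined table."""
    row = connection.execute(
        """
        SELECT d.multi_person, d.emerging_type, i.education_level,
               m.conflict_count, m.commendations
        FROM prior_legal_docs AS d
        JOIN intake_records AS i ON i.person_id = d.person_id
        JOIN monitoring_stats AS m ON m.person_id = d.person_id
        WHERE d.person_id = ?
        """,
        (person_id,),
    ).fetchone()
    return None if row is None else [float(value) for value in row]


vector = feature_vector(conn, "P001")
```

Keeping each source in its own table mirrors the heterogeneous origin of the data, while the join plays the role of the associated query that selects features of preset dimensions.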

In step S120, data mining and data cleaning are performed on the previous legal document data, the data related to the previous entry into the restricted-activity place, the monitoring data from the previous stay in the restricted-activity place, and the legal document data on the current restricted behavior, so as to generate a sample data set, the sample data set including a training data set and a test data set;
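Once cleaned samples exist, dividing them into a training data set and a test data set could look like the following sketch. The source does not state a split ratio or a splitting strategy, so the shuffled hold-out split, the 20% test ratio, and the fixed seed are assumptions.

```python
import random


def split_samples(samples, test_ratio=0.2, seed=42):
    """Shuffle a copy of the samples and hold out a fraction for testing.

    test_ratio and seed are illustrative assumptions.
    Returns (training_set, test_set).
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


training_set, test_set = split_samples(range(10))
```

A fixed seed keeps the split reproducible, which helps when comparing models trained on the same sample data set.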

In this embodiment of the application, the data in the constructed multidimensional heterogeneous database are complex and include unstructured data. Unstructured text data, such as legal documents and the information collected on entry into the restricted-activity place because of restricted behavior, can therefore be mined with feature engineering or deep learning techniques, for example text-CNN or other NLP techniques, to extract usable data. Examples of the extracted data include whether the restricted behavior was committed by multiple persons, whether it was an emerging type of restricted behavior, the parents' circumstances, education level, and occupation.

Further, video image analysis techniques such as face recognition, image segmentation, and object detection can be applied to the video surveillance data to obtain the required data dimensions, for example the time spent with, and conflicts with, persons who jointly take part in the next restricted behavior.

In this embodiment of the application, when data mining is performed on the previous legal document data, the main content extracted may include the cause of action in the document on the previous restricted behavior, the place where the case occurred, whether the restricted behavior was committed by multiple persons, and whether it was an emerging type of restricted behavior.

In this embodiment of the application, the data related to the previous entry into the restricted-activity place are the data that a restricted-activity person must fill in, or that are collected, on entering the restricted-activity place. When these data are mined, the extractable data dimensions fall into two parts: the first is family information and the second is personal information.

In this embodiment of the application, when data mining is performed on the monitoring data from the previous stay in the restricted-activity place, the minable data fall mainly into two parts. The first part is structured data commonly kept by restricted-activity places, such as the number of dangerous acts by the person, the number of conflicts, and the commendations received during the previous stay. The second part is obtained after a recurrence of restricted behavior has occurred: the previously retained raw video data are analyzed to obtain quantitative data, such as the time the person spent with others who jointly took part in the restricted behavior.

Further, data to be predicted can be obtained from the retained raw video data as follows:

according to the intimacy parameters, the intimacy with other persons who jointly take part in restricted behavior is calculated, used as data to be predicted, and classified into the sample data set.

Raw video files in ordinary restricted-activity places are retained only for a short time, generally three months. To study recurrence risk over a longer period, the monitoring data may therefore be monitoring data retained within a preset time, for example within one year.

Further, by performing video image analysis on the monitoring data from the previous stay in the restricted-activity place, using techniques such as face recognition, object recognition, and image segmentation, the persons appearing in the monitoring data can be identified, and intimacy parameters between a person and other restricted-activity persons can be recorded, for example time spent together, conflicts, and frequency of communication.
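One of the intimacy parameters, time spent together, can be approximated from recognition results. The sketch below assumes that an upstream face recognition step has already produced, for each sampled video frame, the set of person identifiers visible in that frame; co-presence in a frame is used as a rough proxy for spending time together. The function name, the input format, and the sampling interval are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def time_together_hours(frames, seconds_per_frame=1.0):
    """Accumulate pairwise co-presence time.

    frames: iterable of sets of person IDs recognized in the same sampled
    frame (assumed output of a face recognition step).
    Returns {(id_a, id_b): hours}, with each pair in sorted order.
    """
    seconds = defaultdict(float)
    for people in frames:
        for a, b in combinations(sorted(people), 2):
            seconds[(a, b)] += seconds_per_frame
    return {pair: total / 3600.0 for pair, total in seconds.items()}


# Three frames sampled 30 minutes apart.
frames = [{"A", "B"}, {"A", "B", "C"}, {"C"}]
hours = time_together_hours(frames, seconds_per_frame=1800.0)
```

The resulting hours per pair correspond to the time variable that the intimacy calculation takes as input.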

The main variable determining the degree of intimacy is the time t spent in contact with other persons who jointly take part in the next restricted behavior, and the amount of information carried by that time does not grow linearly. The concept of information entropy can therefore be borrowed to define the intimacy by a formula in t and k (the formula itself is not reproduced in this text), where t is the time, in hours, that the person spent with the others who jointly took part in the restricted behavior, and k is a tunable hyperparameter, a bias term used to control the degree of influence of the preceding value, initially set to 0.5. The intimacy with other persons who jointly take part in restricted behavior can thus be calculated and used as data to be predicted. Restricted-activity persons communicate and socialize a great deal inside restricted-activity places, readily form relationships there, and may, after their restriction period expires and they leave, commit restricted behavior again together with others. Video analysis of the monitoring data from the restricted-activity place therefore yields the closeness of the relationships among these persons, so that high-risk social networks can be fully taken into account when the short-to-medium-term restricted-behavior recurrence risk prediction model is applied.

In the embodiment of the present application, when data mining is performed on the document data of a person who currently shows restricted behavior again, the mined data may include the cause of the case, the place where the case occurred, whether multiple persons carried out the restricted behavior, whether the person and the other joint participants were previously held in a restricted activity place together, whether the behavior is an emerging type of restricted behavior, and so on. This part of the data can also be mapped to the persons who appear in the video from the previous restricted activity place and who jointly carried out the restricted behavior, so as to relate the intimacy with other joint participants, arising from the social network inside the restricted activity place, to its influence on the recurrence of restricted behavior.

In step S130, positive and negative cases are divided according to the data on recurrence of restricted behavior by persons with restricted activities within a preset period, to serve as positive and negative case labels;

In the embodiment of the present application, the preset period may be 1 year or another period, for example, 3 years or 3 months, and is preferably 1 year.

In the embodiment of the present application, the data on recurrence of restricted behavior by a person with restricted activities within the preset period can be obtained by continuously monitoring the status of that person throughout the preset period.

In the embodiment of the present application, after the data on recurrence of restricted behavior within the preset period is obtained, it can be compared with the data pre-stored in the multidimensional heterogeneous database to determine which data in the database belongs to that person. That is, it is determined whether a person whose activities were previously restricted has shown restricted behavior again within the preset time range, and on that basis the previously restricted persons are divided into positive and negative cases.

Further, dividing the corresponding persons with restricted activities in the sample data set into positive and negative cases, according to the recurrence of restricted behavior within the preset period, includes:

Specifically, when a previously detained person with restricted activities has a recurrence event within the preset period, for example, within 1 year, the corresponding person in the sample data set is classified as a negative case; when the person has no recurrence event within the preset period, for example, within 1 year, the corresponding person in the sample data set is classified as a positive case. With these positive and negative cases, the model learns from data covering both situations, namely no recurrence within 1 year and recurrence within 1 year, in order to perform the classification task. The short- and medium-term risk prediction model for recurrence of restricted behavior is thus trained by supervised learning, which improves prediction accuracy.
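
The labeling rule above can be illustrated with a minimal sketch. The field names, the toy records, and the choice to start the preset period at the date the person left the restricted activity place are assumptions made for illustration; the description does not fix them. The convention that recurrence yields a negative case follows the text.

```python
from datetime import date, timedelta

# Convention taken from the description: recurrence within the preset
# period -> negative case; no recurrence within the period -> positive case.
NEGATIVE, POSITIVE = "negative", "positive"


def label_person(release_date, recurrence_dates, period_days=365):
    """Return the case label for one person.

    release_date: date the person left the restricted activity place
        (hypothetical field; the period start is an assumption).
    recurrence_dates: dates of later restricted-behavior events (hypothetical).
    period_days: preset period; 365 days approximates the preferred 1 year.
    """
    deadline = release_date + timedelta(days=period_days)
    recurred = any(release_date < d <= deadline for d in recurrence_dates)
    return NEGATIVE if recurred else POSITIVE


# Hypothetical toy records: person id -> (release date, recurrence dates).
records = {
    "A": (date(2021, 1, 10), [date(2021, 6, 2)]),  # recurrence inside the period
    "B": (date(2021, 3, 1), []),                   # no recurrence recorded
    "C": (date(2020, 5, 5), [date(2022, 1, 1)]),   # recurrence after the period
}
labels = {pid: label_person(rel, rec) for pid, (rel, rec) in records.items()}
```

Changing `period_days` corresponds to choosing another preset period, such as 3 months or 3 years.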

In step S140, the original short- and medium-term risk prediction model for recurrence of restricted behavior is constructed;

In the embodiment of the present application, the original short- and medium-term risk prediction model for recurrence of restricted behavior may be built with a DNN (deep neural network). A DNN is a multi-layer perceptron that imitates the way animal nerves work: a received signal is passed to the individual neurons, each neuron decides from its input whether to activate, and the resulting signals are aggregated, which allows the information source to be identified and classified. The structure is simple and can fit highly complex functions.

Referring to Fig. 3, the original short- and medium-term risk prediction model for recurrence of restricted behavior may specifically include one input layer, multiple hidden layers, and one output layer. There may be four hidden layers: hidden layer1 with 32 nodes, hidden layer2 with 64 nodes, hidden layer3 with 32 nodes, and hidden layer4 with 16 nodes. The activation function used between the layers is the sigmoid function.
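
The layer structure described above can be sketched as a plain forward pass. The hidden layer sizes and the sigmoid activation come from the description; the 40-dimensional input and the two-node Softmax output are taken from other passages of this description, while the random weights, zero biases, and all function names are placeholders chosen for illustration only.

```python
import math
import random

# input (40), hidden layers (32, 64, 32, 16), output (2)
LAYER_SIZES = [40, 32, 64, 32, 16, 2]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def build_network(sizes, seed=0):
    """Create one (weights, biases) pair per layer with placeholder values."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        network.append((weights, [0.0] * n_out))
    return network


def forward(network, features):
    """Propagate one feature vector; sigmoid on hidden layers, Softmax at the end."""
    activations = features
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        activations = softmax(pre) if index == last else [sigmoid(z) for z in pre]
    return activations


network = build_network(LAYER_SIZES)
probabilities = forward(network, [1.0, 0.0] * 20)  # hypothetical 40-dim input
```

With untrained placeholder weights the two output values carry no meaning; the sketch only shows how the layer sizes chain together.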

In step S150, a feature vector group of preset dimensions is selected from the training data set through associated queries and input, as input data, into the original short- and medium-term risk prediction model for recurrence of restricted behavior for prediction processing, so as to output the predicted probability of recurrence of restricted behavior;

In the embodiment of the present application, the original short- and medium-term risk prediction model for recurrence of restricted behavior can be connected to the multidimensional heterogeneous database. A feature vector group of preset dimensions is selected from the training data set through associated queries and input, as input data, into the original model for prediction processing.

The feature vector group of preset dimensions may specifically include 40-dimensional feature vectors. Because there is a causal relationship between the selected data dimensions and the prediction result, using 40-dimensional feature vectors as the input for model training can provide a better prediction effect and higher prediction accuracy.

In the embodiment of the present application, associated queries may be used to select multi-dimensional feature vectors from the multi-source heterogeneous database. Within a single data record, the information about the previous entry into a restricted activity place due to restricted behavior, and the social information about the person's last stay with other persons inside that place, can be obtained only through associated queries. For example, when person A shows restricted behavior again and enters a restricted activity place, the information about person A's previous stay in a restricted activity place can be extracted through an associated query. This may include which persons with restricted activities person A communicated with closely, the number of dangerous behaviors during the previous stay, the number of conflicts, and so on.
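
One way to picture the associated query described above is a relational join between the current case record and the records of the previous stay. The sketch below uses an in-memory SQLite database; the table names, column names, and sample values are all hypothetical, since the description does not specify a schema or a database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one table per data source in the heterogeneous database.
conn.executescript("""
CREATE TABLE current_case (person_id TEXT, cause TEXT, multi_person INTEGER);
CREATE TABLE previous_stay (person_id TEXT, dangerous_acts INTEGER, conflicts INTEGER);
CREATE TABLE previous_contacts (person_id TEXT, contact_id TEXT, intimacy REAL);
INSERT INTO current_case VALUES ('A', 'example_cause', 1);
INSERT INTO previous_stay VALUES ('A', 2, 3);
INSERT INTO previous_contacts VALUES ('A', 'B', 0.8), ('A', 'C', 0.4);
""")

# Join the current case with the previous stay and the previous social contacts,
# so that one row carries features drawn from all three sources.
row = conn.execute("""
SELECT c.person_id, c.multi_person, s.dangerous_acts, s.conflicts,
       COUNT(k.contact_id) AS n_contacts, MAX(k.intimacy) AS max_intimacy
FROM current_case AS c
JOIN previous_stay AS s ON s.person_id = c.person_id
LEFT JOIN previous_contacts AS k ON k.person_id = c.person_id
WHERE c.person_id = ?
GROUP BY c.person_id, c.multi_person, s.dangerous_acts, s.conflicts
""", ("A",)).fetchone()

# Drop the identifier and keep the numeric columns as a (short) feature vector.
feature_vector = [float(value) for value in row[1:]]
conn.close()
```

A real feature vector would have the preset number of dimensions (40 in the embodiment); five columns are used here only to keep the example short.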

In step S160, a loss function value is calculated according to the predicted probability of recurrence of restricted behavior and the positive and negative case labels, and the original short- and medium-term risk prediction model for recurrence of restricted behavior is iteratively trained according to the loss function value until a preset condition is met, at which point the actual short- and medium-term risk prediction model for recurrence of restricted behavior is generated. The actual model is then used to predict the risk that a person with restricted activities shows restricted behavior again within a preset time range.

In an embodiment of the present application, calculating the loss function value according to the predicted probability of recurrence of restricted behavior and the positive and negative case labels includes:

The preset loss function may be a cross-entropy function, or it may be mean square error, KL divergence, or the like. It can be set according to the actual situation and is not limited in the present application.

In the embodiment of the present application, the loss function value is calculated from the predicted probability of recurrence of restricted behavior output by the original short- and medium-term risk prediction model and from the positive and negative case labels, and the model is trained with the backpropagation algorithm until the accuracy reaches a preset threshold, for example 98%. At that point the actual short- and medium-term risk prediction model for recurrence of restricted behavior is formed.
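
The iterate-until-threshold training described above can be sketched with a deliberately reduced stand-in. The single sigmoid unit, the toy data, the 0/1 label encoding, the learning rate, and the epoch cap are all assumptions made to keep the example short; they are not the network or data of the embodiment. Only the stopping rule (train until accuracy reaches a preset threshold such as 98%) follows the text.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, x):
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def accuracy(weights, bias, data):
    hits = sum((predict(weights, bias, x) >= 0.5) == bool(y) for x, y in data)
    return hits / len(data)


def train(data, lr=0.5, target_accuracy=0.98, max_epochs=1000):
    """Gradient descent on cross-entropy loss, stopped by an accuracy threshold."""
    weights, bias = [0.0] * len(data[0][0]), 0.0
    for epoch in range(1, max_epochs + 1):
        for x, y in data:
            # For a sigmoid unit with cross-entropy loss, the gradient with
            # respect to the pre-activation is (prediction - label).
            error = predict(weights, bias, x) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
        if accuracy(weights, bias, data) >= target_accuracy:
            break  # preset condition met
    return weights, bias, epoch


# Hypothetical toy data: the label is 1 exactly when the first feature is 1.
data = [([1.0, 0.0], 1), ([1.0, 1.0], 1), ([0.0, 1.0], 0), ([0.0, 0.0], 0)]
weights, bias, epochs_used = train(data)
```

In practice the accuracy would be measured on held-out data such as the test data set rather than on the training samples themselves.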

Further, after the actual short- and medium-term risk prediction model for recurrence of restricted behavior is generated, the method includes:

Specifically, once the original short- and medium-term risk prediction model for recurrence of restricted behavior has been trained on the sample data in the multi-source heterogeneous database, the model actually used for prediction is formed. At this point, the legal document data of the current restricted behavior, the restricted behavior data, and the life data inside the restricted activity place can be obtained for the person to be predicted. After data mining and data cleaning turn them into data usable for prediction, they are input into the trained actual model for risk prediction, which yields the risk probability that the person shows restricted behavior again within the preset time range, for example, within 1 year.

Referring to Fig. 4, the actual short- and medium-term risk prediction model for recurrence of restricted behavior may include one input layer, multiple hidden layers, and one output layer. The first layer is the input layer. Each connection between a node of one layer and a node of the next layer carries a weight parameter, denoted w, and each node of the next layer is computed from the nodes of the previous layer multiplied by their corresponding w. The parameters w are corrected through the loss function and the gradient descent algorithm, finally yielding the best-fitting model.

Inputting the feature data to be predicted into the actual short- and medium-term risk prediction model for recurrence of restricted behavior, so as to output the predicted probability of recurrence of restricted behavior for the person to be predicted, includes:

Specifically, the input layer receives the pre-extracted feature data to be predicted for the person with restricted activities, which then passes through fully connected layers in turn: hidden layer1 with 32 nodes, hidden layer2 with 64 nodes, hidden layer3 with 32 nodes, and hidden layer4 with 16 nodes. The activation function used between the layers is the sigmoid function, where x is the input value and S(x) is the result after the nonlinear transformation:

$$S(x) = \frac{1}{1 + e^{-x}}$$

Further, the backpropagation algorithm may be used, with the cross-entropy function as the loss function for the error calculation. The formula is as follows, where the true distribution is $y_i$, the network output distribution is $\hat{y}_i$, and the total number of classes is $n$:

$$H(y, \hat{y}) = -\sum_{i=1}^{n} y_i \log \hat{y}_i$$
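
The cross-entropy loss named above can be computed directly from its standard definition, the negative sum over classes of the true probability times the logarithm of the predicted probability. In the sketch below, the function name, the small clamping constant `eps`, and the example distributions are illustrative assumptions.

```python
import math


def cross_entropy(true_dist, predicted_dist, eps=1e-12):
    """Cross-entropy between a true distribution and a predicted one.

    eps clamps predicted probabilities away from zero so that log() is defined.
    """
    return -sum(y * math.log(max(p, eps))
                for y, p in zip(true_dist, predicted_dist))


# One-hot true distribution over n = 2 classes (hypothetical example).
true_dist = [1.0, 0.0]
loss_close = cross_entropy(true_dist, [0.7, 0.3])  # prediction near the truth
loss_far = cross_entropy(true_dist, [0.3, 0.7])    # prediction far from the truth
```

The loss is smaller when the predicted distribution puts more mass on the true class, which is what gradient-based training exploits.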

For example, if the feature vector to be predicted for a person with restricted activities is [1,0,1...,0,1] and the corresponding true result is that restricted behavior did not recur (0), then that true result is $y_i$. The network output distribution is the prediction obtained after the feature vector is fed into the network: for example, an output of 0 is consistent with the true result, whereas an output of 1 is inconsistent with it.

The 16 nodes of the last hidden layer finally pass through the Softmax function to give an output layer of two nodes. The Softmax function is given below, where $z_i$ is the output value of the i-th node and $C$ is the number of output nodes, that is, the number of classes:

$$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}}$$

The output of these two nodes is a two-dimensional probability vector that gives, for the multidimensional feature vector to be predicted, the probability that restricted behavior recurs within 1 year. The model can thus fit the risk of restricted behavior recurring within 1 year in real scenarios. Here e denotes the natural constant.

Further, after the data has passed through the successive hidden layers, the last hidden layer has 16 nodes. Each node corresponds to a parameter and is connected to each of the two final result nodes, and the parameters are corrected through the loss function and the backpropagation algorithm to obtain the final parameters. For example, if the node for recurrence of restricted behavior finally evaluates to 0.3 and the node for no recurrence evaluates to 0.7 (the two predictions sum to 1), the predicted result is no recurrence of restricted behavior.
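
The 0.3 / 0.7 example above can be reproduced with a short Softmax sketch. The ordering of the two output nodes and the pre-Softmax values (chosen so that the probabilities come out as 0.3 and 0.7) are assumptions for illustration.

```python
import math


def softmax(values):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed node order: index 0 = recurrence, index 1 = no recurrence.
CLASSES = ("recurrence", "no recurrence")

# Hypothetical pre-Softmax outputs of the two result nodes.
node_outputs = [math.log(0.3), math.log(0.7)]

probabilities = softmax(node_outputs)
predicted = CLASSES[probabilities.index(max(probabilities))]
```

The class with the larger probability is taken as the prediction, so this input yields "no recurrence".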

In an embodiment of the present application, after the actual short- and medium-term risk prediction model for recurrence of restricted behavior outputs the predicted probability of recurrence for the person to be predicted, the method includes:

When the predicted probability of recurrence of restricted behavior for the person to be predicted is greater than a preset threshold, the person's risk of recurrence is displayed visually, and a key observation list is sent to the management platform of the restricted activity place.

In the embodiment of the present application, the actual short- and medium-term risk prediction model for recurrence of restricted behavior can be used to monitor the persons with restricted activities in a restricted activity place periodically or in real time. If the risk probability of recurrence for any person is found to be high, for example, higher than 0.7, it can be shown on the recurrence risk visualization platform inside the restricted activity place, so that the managers of the place can view it at any time and formulate corresponding strategies. A key observation list can also be sent to the management platform of the restricted activity place. For example, a person whose predicted probability of recurrence is higher than 0.5 can be treated as a person under key observation.

Further, the persons with restricted activities can be ranked by risk according to the predicted probability of recurrence of restricted behavior, and key display and observation can be carried out according to that ranking.
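
The thresholding and ranking described above can be sketched as follows. The example thresholds 0.7 (visual display) and 0.5 (key observation list) come from the description; the function name, the person identifiers, and the probabilities are hypothetical.

```python
def build_watch_list(risk_by_person, display_threshold=0.7, watch_threshold=0.5):
    """Rank persons by predicted recurrence probability and apply both thresholds."""
    ranked = sorted(risk_by_person.items(), key=lambda item: item[1], reverse=True)
    to_display = [pid for pid, p in ranked if p > display_threshold]
    watch_list = [pid for pid, p in ranked if p > watch_threshold]
    return ranked, to_display, watch_list


# Hypothetical model outputs: person id -> predicted recurrence probability.
risks = {"A": 0.82, "B": 0.35, "C": 0.61, "D": 0.74}
ranked, to_display, watch_list = build_watch_list(risks)
```

Both lists inherit the descending risk order, so the highest-risk persons appear first on the display and on the observation list.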

The embodiment of the present application analyzes public legal documents, the personal and family information of persons who entered a restricted activity place due to restricted behavior, their social life data inside the restricted activity place, and so on. Combined with a deep neural network and a division of persons into positive and negative cases over a specified period, this builds a short- and medium-term prediction model for recurrence of restricted behavior, which predicts the probability that a person with restricted activities shows restricted behavior again within the preset period. Together with a visualization platform, it can provide the managers of the restricted activity place with a key observation list, improving the overall efficiency of managing persons with restricted activities in such places. The accuracy and efficiency of prediction are effectively improved.

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.

In an embodiment, a system for predicting recurrence of restricted behavior based on a multidimensional database is provided, which corresponds one-to-one to the prediction method in the above embodiments. As shown in Fig. 5, the system includes a database construction unit 10, a data set generation unit 20, a label generation unit 30, a model construction unit 40, a training unit 50, and a prediction unit 60. The functional modules are described in detail as follows:

The database construction unit 10 is configured to construct a multidimensional heterogeneous database, which includes previous legal document data, data related to the previous entry into a restricted activity place, monitoring data from the previous restricted activity place, and legal document data of the current restricted behavior;

The data set generation unit 20 is configured to perform data mining and data cleaning on the previous legal document data, the data related to the previous entry into a restricted activity place, the monitoring data from the previous restricted activity place, and the legal document data of the current restricted behavior, so as to generate a sample data set that includes a training data set and a test data set;

The label generation unit 30 is configured to divide positive and negative cases according to the data on recurrence of restricted behavior by persons with restricted activities within a preset period, to serve as positive and negative case labels;

The model construction unit 40 is configured to construct the original short- and medium-term risk prediction model for recurrence of restricted behavior;

The training unit 50 is configured to select a feature vector group of preset dimensions from the training data set through associated queries and input it, as input data, into the original short- and medium-term risk prediction model for recurrence of restricted behavior for prediction processing, so as to output the predicted probability of recurrence of restricted behavior;

The prediction unit 60 is configured to calculate a loss function value according to the predicted probability of recurrence of restricted behavior and the positive and negative case labels, and to iteratively train the original short- and medium-term risk prediction model for recurrence of restricted behavior according to the loss function value until a preset condition is met, at which point the actual model is generated, so that the actual model can predict the risk that a person with restricted activities shows restricted behavior again within a preset time range.

In an embodiment, the data set generation unit 20 is further configured to:

In an embodiment, the label generation unit 30 is further configured to:

In an embodiment, the prediction unit 60 is further configured to:

In an embodiment, the system further includes an actual prediction unit configured to:

obtain the legal document data of the current restricted behavior, the restricted behavior data, and the life data inside the restricted activity place for the person to be predicted;

perform data mining and data cleaning on the legal document data of the current restricted behavior, the restricted behavior data, and the life data inside the restricted activity place, so as to generate the feature data to be predicted;

In an embodiment, the system further includes a visual display unit configured to:

when the predicted probability of recurrence of restricted behavior for the person to be predicted is greater than a preset threshold, display the person's risk of recurrence visually and send a key observation list to the management platform of the restricted activity place.

The embodiment of the present application analyzes public legal documents, the personal and family information of persons who entered a restricted activity place due to restricted behavior, their social life data inside the restricted activity place, and so on. Combined with a deep neural network and a division of persons into positive and negative cases over a specified period, this builds a short- and medium-term prediction model for recurrence of restricted behavior, which predicts the probability that a person with restricted activities shows restricted behavior again within the preset period. Together with a visualization platform, it can provide the managers of the restricted activity place with a key observation list, improving the overall efficiency of personnel management in such places. The accuracy and efficiency of prediction are effectively improved.

For the specific limitations of the system for predicting recurrence of restricted behavior based on a multidimensional database, refer to the limitations of the corresponding prediction method above, which are not repeated here. Each module in the above system may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.

In an embodiment, a computer device is provided. The computer device may be a terminal device, and its internal structure may be as shown in Fig. 6. The computer device includes a processor, a memory, and a network interface connected through a system bus. The processor provides computing and control capabilities. The memory includes a readable storage medium that stores computer-readable instructions. The network interface is used to communicate with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement a method for predicting recurrence of restricted behavior based on a multidimensional database. The readable storage media provided in this embodiment include non-volatile readable storage media and volatile readable storage media.

In an embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, the steps of the above method for predicting recurrence of restricted behavior based on a multidimensional database are implemented.

In an embodiment, a readable storage medium is provided. The readable storage medium stores computer-readable instructions which, when executed by a processor, implement the steps of the above method for predicting recurrence of restricted behavior based on a multidimensional database.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions that direct the relevant hardware. The computer-readable instructions may be stored in a non-volatile readable storage medium or a volatile readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and direct memory bus dynamic RAM (DRDRAM).

Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules above is only an example. In practical applications, the above functions may be assigned to different functional units or modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above.

The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.

Claims

1. A method for predicting reappearance limiting behavior based on a multidimensional database, the method comprising:

constructing a multidimensional heterogeneous database, wherein the multidimensional heterogeneous database comprises previous legal document data, related data of previous access to a restricted activity place, monitoring data in the previous restricted activity place and legal document data of current restricted behavior;

performing data mining and data cleaning on the previous legal document data, the related data of the previous access to the restricted activity place, the monitoring data in the previous restricted activity place and the legal document data of the current restricted behavior to generate a sample data set, wherein the sample data set comprises a training data set and a test data set;

dividing positive examples and negative examples according to the reappearance limiting behavior data of restricted activity personnel within a preset period to serve as positive example labels and negative example labels;

constructing an original medium-short term reappearance limiting behavior risk prediction model;

selecting a feature vector group with preset dimensionality in the training data set through associated query, and inputting the feature vector group serving as input data into the original medium-short term reappearance limiting behavior risk prediction model for prediction processing so as to output reappearance limiting behavior prediction probability;

and calculating a loss function value according to the reappearance limiting behavior prediction probability, the positive example labels and the negative example labels, and performing iterative training on the original medium-short term reappearance limiting behavior risk prediction model according to the loss function value until a preset condition is met, thereby generating an actual medium-short term reappearance limiting behavior risk prediction model, so as to predict, through the actual medium-short term reappearance limiting behavior risk prediction model, the reappearance limiting behavior risk of restricted activity personnel within a preset time range.
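The iterative training step of claim 1 can be illustrated with a minimal sketch. It is not the patented model: a single-layer logistic model stands in for the risk prediction model, binary cross-entropy is assumed as the loss function, and the "preset condition" is assumed to be either a target mean loss or a maximum number of epochs. The names `train`, `predict`, `lr`, `max_epochs` and `target_loss` are hypothetical, and a numeric label of 1 is assumed to mean that a recurrence was observed.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one sample (assumed loss; the claim does not name one)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def predict(w, b, x):
    """Predicted probability for one feature vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train(features, labels, lr=0.5, max_epochs=2000, target_loss=0.1):
    """Gradient descent until the mean loss <= target_loss or max_epochs is reached.

    Both stopping rules are hypothetical stand-ins for the 'preset condition'.
    Returns the weights, the bias and the last mean loss that was computed.
    """
    n = len(features)
    n_dim = len(features[0])
    w = [0.0] * n_dim
    b = 0.0
    loss = float("inf")
    for _ in range(max_epochs):
        grad_w = [0.0] * n_dim
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(features, labels):
            p = predict(w, b, x)
            loss += bce(p, y)
            err = p - y  # gradient of the loss with respect to the pre-activation
            for j in range(n_dim):
                grad_w[j] += err * x[j]
            grad_b += err
        loss /= n
        if loss <= target_loss:
            break  # preset condition met, stop before updating again
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b, loss
```

A multi-layer network as recited in claim 6 would replace `predict` and the hand-written gradient with back propagation through the hidden layers; the loop structure (predict, compute loss, update, check the stopping condition) stays the same.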

2. The method for predicting reappearance limiting behavior based on a multidimensional database according to claim 1, wherein, after constructing the multidimensional heterogeneous database, the method further comprises:

performing video image analysis on the monitoring data in the previous restricted activity place to obtain affinity parameters among different restricted activity personnel;

and calculating, according to the affinity parameters, the affinity with which a person participates in restricted behavior together with other persons, taking the affinity as data to be predicted, and including the data in the sample data set.

3. The method for predicting reappearance limiting behavior based on a multidimensional database according to claim 1, wherein dividing positive examples and negative examples, for the restricted activity personnel corresponding to the sample data set, according to the reappearance limiting behavior data of the restricted activity personnel within a preset period comprises:

when a restricted activity person generates a restricted behavior event again within the preset period, dividing the corresponding restricted activity person in the sample data set into the negative examples;

and when the restricted activity person does not generate a restricted behavior event again within the preset period, dividing the corresponding restricted activity person in the sample data set into the positive examples.
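The labelling rule of claim 3 can be sketched as follows. The claim does not state from which date the preset period is measured, so the reference date `start_date`, the day-based `period_days` and the function name `label_person` are all assumptions made for illustration. The label convention follows the claim as worded: a recurrence within the period yields a negative example.

```python
from datetime import date, timedelta


def label_person(start_date, event_dates, period_days):
    """Return 'negative' if any restricted behavior event falls within
    period_days after start_date (hypothetical reference date), else 'positive'."""
    deadline = start_date + timedelta(days=period_days)
    recurred = any(start_date < d <= deadline for d in event_dates)
    return "negative" if recurred else "positive"
```

For example, with a one-year period starting on 1 January 2020, an event on 1 June 2020 gives a negative example, while a person with no later events, or only events after the period, gives a positive example.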

4. The method for predicting reappearance limiting behavior based on a multidimensional database according to claim 1, wherein calculating the loss function value according to the reappearance limiting behavior prediction probability, the positive example labels and the negative example labels comprises:

calculating the loss function value from the reappearance limiting behavior prediction probability, the positive example labels and the negative example labels through a preset loss function and a back propagation algorithm.

5. The method for predicting reappearance limiting behavior based on a multidimensional database according to claim 1, wherein, after generating the actual medium-short term reappearance limiting behavior risk prediction model, the method further comprises:

acquiring, for a restricted activity person to be predicted, legal document data of the current restricted behavior, related data of entering a restricted activity place due to the restricted behavior, and life data in the restricted activity place;

performing data mining and data cleaning on the legal document data of the current restricted behavior, the related data of entering the restricted activity place due to the restricted behavior, and the life data in the restricted activity place to generate feature data to be predicted;

and inputting the feature data to be predicted into the actual medium-short term reappearance limiting behavior risk prediction model so as to output the reappearance limiting behavior prediction probability of the restricted activity person to be predicted.

6. The method for predicting reappearance limiting behavior based on a multidimensional database according to claim 5, wherein the actual medium-short term reappearance limiting behavior risk prediction model comprises an input layer, a plurality of hidden layers and an output layer, and wherein inputting the feature data to be predicted into the actual medium-short term reappearance limiting behavior risk prediction model so as to output the reappearance limiting behavior prediction probability of the restricted activity person to be predicted comprises:

receiving the feature data to be predicted through the input layer;

inputting the feature data to be predicted received by the input layer into the plurality of hidden layers in sequence for processing;

and outputting, through the output layer, the reappearance limiting behavior prediction probability of the restricted activity person to be predicted that is obtained after processing by the last hidden layer.
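The layer-by-layer processing of claim 6 can be sketched as a plain forward pass. The claim fixes only the structure (input layer, several hidden layers, output layer); fully connected layers, ReLU hidden activations and a sigmoid output unit are assumptions chosen for illustration, as are the function names `dense`, `relu` and `forward`.

```python
import math


def dense(x, weights, biases):
    """Fully connected layer: one output value per row of weights."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise ReLU (assumed hidden activation)."""
    return [max(0.0, v) for v in values]


def forward(x, hidden_layers, output_layer):
    """hidden_layers: list of (weights, biases) pairs, applied in order.
    output_layer: a (weights, biases) pair with exactly one output unit.
    Returns a probability in (0, 1)."""
    h = list(x)  # input layer: receive the feature vector unchanged
    for weights, biases in hidden_layers:
        h = relu(dense(h, weights, biases))
    (z,) = dense(h, *output_layer)
    # Numerically stable sigmoid (assumed output activation)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

With one hidden layer of two units, weights `[[1.0, -1.0], [0.5, 0.5]]`, zero biases, output weights `[[1.0, -1.0]]` and input `[2.0, 1.0]`, the hidden activations are `[1.0, 1.5]`, the output pre-activation is `-0.5`, and the returned probability is about 0.3775.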

7. The method for predicting reappearance limiting behavior based on a multidimensional database according to claim 6, wherein, after outputting the reappearance limiting behavior prediction probability of the restricted activity person to be predicted according to the actual medium-short term reappearance limiting behavior risk prediction model, the method further comprises:

when the reappearance limiting behavior prediction probability of the restricted activity person to be predicted is greater than a preset threshold, visually displaying the reappearance limiting behavior risk of the restricted activity person to be predicted, and sending a key observation list to a restricted activity place management platform.
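The thresholding step of claim 7 can be sketched as a simple filter. The function name `build_key_observation_list`, the dictionary input keyed by a person identifier, and the ordering by descending risk are assumptions; the visual display and the transmission to the management platform are outside the scope of this sketch.

```python
def build_key_observation_list(probabilities, threshold):
    """probabilities: dict mapping a person identifier to a predicted probability.
    Returns the identifiers whose probability is strictly greater than
    threshold, ordered from highest to lowest predicted risk."""
    flagged = [(pid, p) for pid, p in probabilities.items() if p > threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    return [pid for pid, _ in flagged]
```

The strict comparison mirrors the claim wording ("greater than a preset threshold"), so a person whose probability equals the threshold is not placed on the list.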

8. A system for predicting reappearance limiting behavior based on a multidimensional database, the system comprising:

a database construction unit, configured to construct a multidimensional heterogeneous database, wherein the multidimensional heterogeneous database comprises previous legal document data, related data of previous access to a restricted activity place, monitoring data in the previous restricted activity place and legal document data of current restricted behavior;

a data set generation unit, configured to perform data mining and data cleaning on the previous legal document data, the related data of the previous access to the restricted activity place, the monitoring data in the previous restricted activity place and the legal document data of the current restricted behavior to generate a sample data set, wherein the sample data set comprises a training data set and a test data set;

a label generation unit, configured to divide positive examples and negative examples according to the reappearance limiting behavior data of restricted activity personnel within a preset period to serve as positive example labels and negative example labels;

a model construction unit, configured to construct an original medium-short term reappearance limiting behavior risk prediction model;

a training unit, configured to select, through associated query, a feature vector group with preset dimensionality in the training data set, and to input the feature vector group as input data into the original medium-short term reappearance limiting behavior risk prediction model for prediction processing so as to output a reappearance limiting behavior prediction probability;

and a prediction unit, configured to calculate a loss function value according to the reappearance limiting behavior prediction probability, the positive example labels and the negative example labels, and to perform iterative training on the original medium-short term reappearance limiting behavior risk prediction model according to the loss function value until a preset condition is met, thereby generating an actual medium-short term reappearance limiting behavior risk prediction model, so as to predict, through the actual medium-short term reappearance limiting behavior risk prediction model, the reappearance limiting behavior risk of restricted activity personnel within a preset time range.

9. A computer device comprising a memory, a processor and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the steps of the method for predicting reappearance limiting behavior based on a multidimensional database according to any one of claims 1 to 7.

10. A readable storage medium storing computer-readable instructions which, when executed by a processor, implement the steps of the method for predicting reappearance limiting behavior based on a multidimensional database according to any one of claims 1 to 7.