CN118035800A

CN118035800A - Model training method, device, equipment and storage medium

Info

Publication number: CN118035800A
Application number: CN202211362661.XA
Authority: CN
Inventors: 樊鹏
Original assignee: Tencent Cloud Computing Beijing Co Ltd
Current assignee: Tencent Cloud Computing Beijing Co Ltd
Priority date: 2022-11-02
Filing date: 2022-11-02
Publication date: 2024-05-14

Abstract

The embodiments of the present application relate to the field of computer technology, and disclose a model training method, device, equipment and storage medium, which can be applied to various scenarios such as cloud technology, artificial intelligence, smart transportation, and assisted driving. The method includes: screening out reference samples from the i-th feature sample group of the first feature sample set and the second feature sample set, and determining the pseudo-label corresponding to the reference sample. Then, the first recognition module of the target recognition model is trained using the reference sample containing the pseudo-label and the feature sample in the i+1-th feature sample group. In the process of training the first recognition module, adversarial transfer training is performed on the second recognition module of the target recognition model based on the first feature sample set and the second feature sample set at the same time, until a trained target recognition model is obtained. By using the embodiments of the present application, the model performance of the neural network model can be optimized to improve the recognition accuracy of the model.

Description

Model training method, device, equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a model training method, apparatus, device, and storage medium.

Background

And the willingness of the platform such as a website and an application program to use the object is identified, so that the platform is favorable for accurately recommending and accurately serving the object. The existing method for identifying the willingness of the object mainly comprises the steps of setting an identification rule according to manual experience, and judging historical behavior data of each object through the identification rule so as to identify the willingness of the object; and extracting preset object features which are favorable for identifying the willingness of the object according to the historical behavior data of the object, and classifying the object features through the shallow neural network model so as to identify the willingness of the object. The number of the identification rules set by manual experience is limited, the optimal identification rules are difficult to determine, and the shallow neural network model is too dependent on object characteristics, so that data islands are easy to form, and the identification accuracy of the existing mode is not high. Therefore, how to obtain a neural network model with better model performance through model training so as to improve the identification accuracy of the willingness of the object is a problem to be solved at present.

Disclosure of Invention

The embodiment of the application provides a model training method, device, equipment and storage medium, which can optimize the model performance of a neural network model so as to improve the recognition accuracy of the model.

In one aspect, an embodiment of the present application provides a model training method, where the method includes:

Acquiring a first characteristic sample set and a second characteristic sample set, wherein the first characteristic sample set comprises at least one characteristic sample with a labeling label, the second characteristic sample set comprises T characteristic sample groups, and one characteristic sample group comprises at least one characteristic sample without the labeling label; t is an integer greater than or equal to 1;

screening one or more reference samples from the ith characteristic sample group of the first characteristic sample set and the second characteristic sample set, and determining a pseudo tag corresponding to each reference sample; the labeling label and the pseudo label are used for indicating the probability that the object indicated by the corresponding sample is the target business transfer electronic resource; i is an integer of 1 or more and T or less;

Training a first recognition module of the target recognition model by adopting a reference sample containing a pseudo tag and a feature sample in the (i+1) th feature sample group;

and in the process of training the first recognition module, performing anti-migration training on the second recognition module of the target recognition model based on the feature sample with the labeling label and the feature sample without the labeling label and the feature sample with the second feature sample set, wherein the feature sample is contained in the first feature sample set, and the feature sample is not contained in the second feature sample set, until the trained target recognition model is obtained.

In one aspect, an embodiment of the present application provides a model training apparatus, which includes an acquisition unit and a training unit, wherein:

the acquisition unit is used for acquiring a first characteristic sample set and a second characteristic sample set, wherein the first characteristic sample set comprises at least one characteristic sample with a labeling label, the second characteristic sample set comprises T characteristic sample groups, and one characteristic sample group comprises at least one characteristic sample without the labeling label; t is an integer greater than or equal to 1;

The training unit is used for screening one or more reference samples from the ith characteristic sample group of the first characteristic sample set and the second characteristic sample set, and determining a pseudo tag corresponding to each reference sample; the labeling label and the pseudo label are used for indicating the probability that the object indicated by the corresponding sample is the target business transfer electronic resource; i is an integer of 1 or more and T or less;

The training unit is further used for training the first recognition module of the target recognition model by adopting a reference sample containing the pseudo tag and a characteristic sample in the (i+1) th characteristic sample group;

The training unit is further configured to perform an anti-migration training on the second recognition module of the target recognition model based on the feature sample with the labeling label and the feature sample without the labeling label, where the feature sample is included in the first feature sample set, and the feature sample is included in the second feature sample set, in the training process of the first recognition module, until a trained target recognition model is obtained.

In another aspect, an embodiment of the present application provides a computer device, including an input interface and an output interface, the computer device further including:

A processor adapted to implement one or more computer programs; and

A computer storage medium storing one or more computer programs adapted to be loaded by the processor and to perform the model training method described above.

In another aspect, embodiments of the present application provide a computer storage medium storing one or more computer programs adapted to be loaded by a processor and to perform the above-described model training method.

In another aspect, embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the model training method described above.

In the embodiment of the application, the target recognition model consists of a first recognition module and a second recognition module, and training of the target recognition model is that the first recognition module and the second recognition module are trained. In the training process, the first recognition module continuously screens out a reference sample from the first characteristic sample set and the second characteristic sample set, and determines a pseudo tag of the reference sample, so that the first recognition module is continuously trained by means of the unused characteristic samples in the reference sample and the second characteristic sample set of the pseudo tag, and semi-supervised continuous training of the first recognition module is realized; meanwhile, the second recognition module performs countermeasure migration training through the first feature sample set with the label and the second feature sample set without the label. Because the first recognition module and the second recognition module can further extract the characteristics of the input characteristic samples, classification recognition processing is carried out through the deep characteristics of each extracted characteristic sample. That is, the two recognition modules in the target recognition model perform further feature extraction on the feature sample in the training process, so that the dependence on the feature sample is low, and therefore, the influence of noise in the feature sample (i.e. the feature sample is wrongly represented) on the target recognition model is small, which is beneficial to improving the recognition accuracy of the target recognition model. Meanwhile, in the embodiment of the application, two recognition modules of the target recognition model process and recognize the same characteristic sample, and the processing and recognition of the two recognition modules are mutually independent. That is, the target recognition model recognizes the same feature sample from two different dimensions, so that two recognition results of the same feature sample can be mutually verified to correct errors, which is beneficial to further improving the recognition accuracy of the target recognition model.

In addition, in the embodiment of the application, the repeated use of the feature samples is realized by continuously selecting the feature sample group used in the training and selecting the reference sample from the feature sample set containing the labeling label. The training is repeatedly performed by using the used characteristic samples, so that the problem of forgetting data can be effectively relieved. In addition, the semi-supervised continuous learning and the anti-migration learning in the embodiment of the application are both semi-supervised learning, and training can be performed without a large number of characteristic samples with labeling labels, so that the scheme has lower requirement on the magnitude of the labeling samples, is suitable for real scenes with fewer labeling samples, and has higher popularity. Finally, the embodiment of the application reduces the difference of the characteristic samples from different data distributions from the characteristic aspect by means of anti-migration training, and can effectively avoid the problem of data island.

Drawings

In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of a simple semi-supervised continuous learning process provided by an embodiment of the present application;

FIG. 2 is a schematic diagram of a model training system according to an embodiment of the present application;

FIG. 3 is a flow chart of a model training method provided by an embodiment of the present application;

FIG. 4 is a flow chart of another model training method according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a coding process according to an embodiment of the present application;

FIG. 6 is a schematic diagram of a semi-supervised continuous training process provided by an embodiment of the present application;

Fig. 7 is a schematic structural diagram of a second identification module according to an embodiment of the present application;

FIG. 8 is a schematic diagram of a training process of a target recognition model according to an embodiment of the present application;

FIG. 9 is a schematic structural diagram of a model training device according to an embodiment of the present application;

Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

With the continuous development of internet technology, artificial intelligence (ARTIFICIAL INTELLIGENCE, AI) technology has also been developed better. By artificial intelligence techniques is meant the theory, method, technique and application of simulating, extending and expanding human intelligence, sensing the environment, obtaining knowledge and using knowledge to obtain optimal results using a digital computer or a machine controlled by a digital computer. In other words, artificial intelligence is a comprehensive technique of computer science; the intelligent machine is mainly used for producing a novel intelligent machine which can react in a similar way of human intelligence by knowing the essence of the intelligence, so that the intelligent machine has multiple functions of sensing, reasoning, decision making and the like. Accordingly, AI technology is a comprehensive discipline that mainly includes Computer Vision (CV), speech processing, natural language processing, and machine learning (MACHINE LEARNING, ML)/deep learning.

The machine learning is a multi-field interdisciplinary, and relates to a plurality of disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and the like. It is specially studied how a computer simulates or implements learning behavior of a human to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve own performance. Machine learning is the core of AI and is a fundamental approach for computer devices to have intelligence; the machine learning is a multi-field interdisciplinary, and relates to a plurality of disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and the like; the learning behavior of the computer equipment is specially researched to simulate or realize the learning behavior of human beings so as to acquire new knowledge or skills, and the existing knowledge structure is reorganized to continuously improve the performance of the computer equipment. Deep learning is a technique for machine learning by using a deep neural network system; machine learning/deep learning may generally include a variety of techniques including artificial neural networks, reinforcement learning (Reinforcement Learning, RL), supervised learning, unsupervised learning, anti-migration learning, and the like; the supervised learning is a process of model training using training samples with known classes (with labeled classes), and the unsupervised learning is a process of model training using training samples with unknown classes (without labels).

Here, supervised learning refers to a process of model training using training samples (i.e., labeling samples) of known categories. Semi-supervised learning refers to a process of model training using training samples with known partial categories (i.e., labeled samples) and training samples with unknown partial categories (i.e., unlabeled samples). Unsupervised learning refers to a process of model training using training samples of unknown class (i.e., unlabeled samples).

In addition, for the real requirement of letting models learn and update continuously in dynamic environment, continuous learning frameworks are proposed. The continuous learning framework embodies the continuous learning and continuous improvement process of the model, and specifically, continuous learning refers to that after the model completes the training task of a part of data, the model is helped to train the new data according to the prior experience of the part of data which has completed training when the new data arrives.

In an actual application scene, the number of marked samples in semi-supervised learning is relatively small, and the number of unmarked samples is larger. Thus, semi-supervised learning is typically combined with a continuous learning framework to form semi-supervised continuous learning. For example, referring to fig. 1, a schematic diagram of a simple semi-supervised continuous learning process is shown. The method comprises the steps that a marked sample set and P batches of unmarked sample sets can be obtained first, wherein the marked sample set comprises one or more marked samples and labels of all marked samples; the unlabeled exemplar set of any batch includes one or more unlabeled exemplars, P being a positive integer. In fig. 1, m is a positive integer, P is equal to or smaller than 1.

The model M _m-1 obtained in the previous training may be first subjected to supervised training by using the labeling sample set, so as to obtain the model M _m in the current training. And then, inputting each unlabeled sample in the unlabeled sample set of the mth batch into the model M _m to obtain a prediction label of each unlabeled sample in the mth unlabeled sample set. Then, unlabeled samples with high confidence of the predictive label need to be screened from the unlabeled sample set of the mth batch. In particular, the labels generally indicate a probability, and then the probability interval that the label of most of the labeling samples in the labeling sample set indicates falls into can be determined according to the probability; if the probability indicated by the predictive tag falls within the aforementioned probability interval, then it may be determined that the confidence of the predictive tag is high.

After the unlabeled samples with high confidence of the predictive labels are screened, the screened unlabeled samples and the predictive labels thereof can be added into the labeled sample set, and the updated labeled sample set is obtained. After the next training is started, the updated labeling sample set may be used to perform supervised training on the M _m obtained in the current training, and the training is repeated until the model M converges. Therefore, the semi-supervised continuous learning has multiple rounds of training, each round of training can adopt unused unlabeled samples, and unlabeled samples with high confidence level can be continuously screened from the unused unlabeled samples of each round of training, so that the unlabeled samples can be used as labeled samples in subsequent training.

Further, the challenge shift learning refers to the application of the generated challenge network to shift learning, and also belongs to semi-supervised learning. In the migration learning, there are two fields, namely a source field and a target field, wherein the source field data is labeled sample data, and the target field data is unlabeled sample data. The goal of the migration learning is to utilize a model trained in the source domain to perform effective migration of knowledge, and finally, the model can be applied to the target domain. The generation of the countermeasure network mainly includes a generator and a discriminator. The model used in the anti-migration learning mainly includes a feature extractor (may also be referred to as a generator), a discriminator, and a classifier. The discriminator is used for discriminating whether the output characteristics of the characteristic extractor belong to a source domain or a target domain; the feature extractor continuously reduces the difference between the extracted features of the source domain data and the features of the target domain data in the countermeasure learning with the discriminator; the classifier is based on supervised training of the features of the source domain data and the labels of the source domain data extracted by the feature extractor.

Based on the above, the embodiment of the application provides a model training scheme, and a target recognition model to be trained comprises two recognition modules, wherein the two recognition modules are simultaneously trained based on a feature sample set containing at least one feature sample with a labeling label and a plurality of feature sample groups until the trained target recognition model is obtained. Wherein a set of feature samples comprises at least one feature sample for which no labeling label is present. One recognition module in the target recognition model is trained in a semi-supervised continuous learning mode, one or more reference samples are screened from a characteristic sample set and a characteristic sample set used in the training, and a pseudo label corresponding to each reference sample is determined so as to be convenient for training the recognition module by adopting the reference sample containing the pseudo label and the characteristic sample in the characteristic sample set used in the next training. Another recognition module in the target recognition model is trained by means of anti-migration learning, and the recognition module performs anti-migration training based on feature samples with labeled labels contained in the feature sample set and feature samples without labeled labels contained in the feature sample sets.

It is easy to see that the target recognition model in the scheme is composed of two recognition modules, and the two recognition modules can be trained in a semi-supervised continuous learning and anti-migration learning mode respectively. The semi-supervised continuous learning and the anti-migration learning are deep learning algorithms, and the deep learning algorithms further extract features of the initially input features to obtain deep features of each feature sample. That is, the two recognition modules in the target recognition model perform further feature extraction on the feature sample in the training process, so that the dependence on the feature sample is low, and therefore, the influence of noise in the feature sample (i.e. the feature sample is wrongly represented) on the target recognition model is small, which is beneficial to improving the recognition accuracy of the target recognition model. Meanwhile, in the scheme, two recognition modules of the target recognition model can process and recognize the same characteristic sample, and the recognition of the two recognition modules is independent. That is, the target recognition model recognizes the same feature sample from two different dimensions, so that two recognition results of the same feature sample can be mutually verified to correct errors, which is beneficial to further improving the recognition accuracy of the target recognition model.

In addition, since the model continuously updates its model parameters according to the training conditions of the training samples during the training process, when the training time is long or the training samples are too many, the influence of the training samples adopted during the initial training on the model is weaker, and the so-called problem of data forgetting is generated. The labeled feature samples used in each semi-supervised training are selected from the feature sample group used in the training, and the labeled feature samples are also selected from the feature sample set containing labeling labels. The recognition module is trained by repeatedly adopting the used characteristic samples, so that the problem of forgetting data can be effectively relieved.

In addition, the semi-supervised continuous learning and the anti-migration learning in the scheme are both semi-supervised learning, and training can be achieved without a large number of characteristic samples with labeling labels, so that the scheme has low requirement on the magnitude of the labeling samples, is suitable for real scenes with fewer labeling samples, and has high popularity. Meanwhile, the characteristic samples with the labels and the characteristic samples without the labels are often from different data distributions, so that the characteristic difference is large, the characteristic samples without the labels are difficult to accurately identify by a model obtained by performing supervised training on the characteristic samples with the labels, and only the characteristic samples from the same data distribution with the characteristic samples with the labels can be accurately identified, so that the problem of data island is solved. The target recognition model also reduces the difference of characteristic samples from different data distributions in characteristic aspect by means of anti-migration training, and can effectively avoid the problem of data island.

The recognition modules in the target recognition model are deep neural networks (namely, the neural networks consisting of multiple layers of neural networks), so that the recognition modules in the target recognition model can be trained by using a deep learning algorithm such as semi-supervised continuous learning and anti-migration learning. Furthermore, a feature sample refers to object features of an object; where object features refer to feature vectors that can characterize the behavior of an object. Since the reference sample is a sample selected from each feature sample in the feature sample set and each feature sample in the feature sample group, the reference sample is also a feature sample, and also refers to an object feature of one object.

Meanwhile, any label is used for indicating the probability that the object indicated by the corresponding characteristic sample is the target business transfer electronic resource, and any pseudo label is used for indicating the probability that the object indicated by the corresponding reference sample is the target business transfer electronic resource. In particular, the target business may be a physical or virtual product or service. For example, the target service may be a product of an entity such as a subwoofer, and an actual service such as a top-up nail; the target business may also be a virtual product such as game skin, and a virtual service such as video member. The electronic resource may be resource data capable of serving as a general equivalent, or may be an item capable of serving as a general equivalent that is paid in electronic form by payment software, which is not limited herein. Then, the probability that the object transfers the electronic resource for the target service can also be understood as the probability that the object purchases or uses the target service.

In addition, the pseudo tag of the reference sample in the scheme can be a tag obtained by manually labeling the feature sample without the labeling tag, or can be a tag determined by an identification tag obtained by carrying out identification processing on the feature sample without the labeling tag according to the identification module.

Based on the above model training method, the embodiment of the present application provides a model training system, which can refer to fig. 2, and the model training system shown in fig. 2 may include a plurality of terminal devices 201 and a plurality of servers 202, where a communication connection is established between any one of the terminal devices and any one of the servers. Terminal device 201 may include any one or more of a smart phone, tablet, notebook, desktop, smart car, and smart wearable device. A wide variety of Applications (APP) may be running within the terminal device 201, such as office clients, online conference clients, multimedia play clients, social clients, browser clients, information flow clients, educational clients, and so on. The server 202 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server for providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (Content Delivery Network, CDN), basic cloud computing services such as big data and an artificial intelligence platform. The terminal device 201 and the server 202 may be directly or indirectly connected through wired or wireless communication, which is not limited herein.

In one embodiment, the above model training method may be executed only by the terminal device 201 in the model training system shown in fig. 2, and specifically the execution process is: the terminal device 201 firstly obtains a feature sample set containing at least one feature sample with a label tag and a plurality of feature sample groups; wherein a set of feature samples comprises at least one feature sample for which no labeling label is present. Then, the terminal device 201 performs semi-supervised continuous training on one recognition module in the target recognition model, and in the process of each round of semi-supervised training, the terminal device 201 screens out one or more reference samples from the feature sample set and the feature sample set used in the round of training, and determines a pseudo tag corresponding to each reference sample. While the terminal device 201 performs semi-supervised continuous learning on one recognition module in the target recognition model, the terminal device 201 performs countermeasure migration training on another recognition module in the target recognition model according to the feature sample set and the plurality of feature sample sets until a trained target recognition model is obtained.

Alternatively, the above model training method may be performed only by the server 202 in the model training system shown in fig. 2, and the specific execution process of the above terminal device 201 during model training may be referred to as the specific execution process, which is not described herein.

In another embodiment, the model training method may be run in a model training system, which may include a terminal device and a server. Specifically, the model training method may be completed by the terminal device 201 and the server 202 included in the model training system shown in fig. 2, and the specific implementation process is as follows: in response to the labeling operation on each feature sample, the terminal device 201 obtains a label of each feature sample to construct a feature sample set including at least one feature sample having a label. Note that, the labeling operation for each feature sample may be initiated by the use object of the terminal device 201. The terminal device 201 may then send the collected feature sample set to the server 202.

After receiving the feature sample set, the server 202 may obtain a plurality of feature sample sets; wherein a set of feature samples comprises at least one feature sample for which no labeling label is present. After that, the server 202 performs semi-supervised continuous training on one recognition module in the target recognition model, and in the process of each round of training, the terminal device 201 screens out one or more reference samples from the feature sample set and the feature sample set used in the round of training, and determines a pseudo tag corresponding to each reference sample. While the terminal device 201 performs semi-supervised continuous training on one recognition module in the target recognition model, the terminal device 201 performs anti-migration training on another recognition module in the target recognition model according to the feature sample set and the plurality of feature sample sets until a trained target recognition model is obtained.

Based on the model training scheme and the model training system, the embodiment of the application provides a model training method. Referring to fig. 3, a flow chart of a model training method according to an embodiment of the present application is shown. The model training method shown in fig. 3 may be performed by a server or a terminal device. The model training method shown in fig. 3 may include steps S301 to S304:

S301, a first characteristic sample set and a second characteristic sample set are acquired.

In an embodiment of the present application, the first feature sample set includes at least one feature sample having a label tag. The second feature sample set comprises T feature sample groups; wherein, a feature sample group contains at least one feature sample without label, and T is an integer greater than or equal to 1. It can be seen that the first feature sample set is equivalent to a labeled sample set, while the second feature sample set is equivalent to an unlabeled sample set.

Furthermore, a feature sample refers to object features of the respective object; and the object features may be feature vectors that characterize the behavior of the object. Any label is used for indicating the probability that the object indicated by the corresponding characteristic sample is the target business transfer electronic resource. The labeling label of any feature sample is a label obtained by manually labeling the corresponding feature sample according to experience.

In addition, since the above-mentioned target business may be an entity or a virtual product or service, and may be resource data capable of serving as a general equivalent, or may be an item capable of serving as a general equivalent that is paid in electronic form by payment software; the probability of the object transferring the electronic resource for the target service is thus equivalent to the probability of the object paying for the target service, i.e. the willingness or likelihood of the object purchasing the target service.

S302, one or more reference samples are screened from the ith characteristic sample group of the first characteristic sample set and the second characteristic sample set, and a pseudo tag corresponding to each reference sample is determined.

In the embodiment of the present application, i is an integer of 1 or more and T or less. Since the reference sample is a sample selected from each feature sample in the feature sample set and each feature sample in the feature sample group, the reference sample is also a feature sample, and also refers to an object feature of one object. Meanwhile, the one or more reference samples are selected for training the first recognition module of the target recognition model later, and the first recognition module of the target recognition model is trained in a semi-supervised continuous learning mode. Then, the specific embodiment of screening the reference sample can be referred to as the specific embodiment of unlabeled sample with high screening confidence in the simple semi-supervised continuous learning.

The reference sample screening method may be: according to the probability indicated by the labeling labels of all the feature samples in the first feature sample set, determining a probability interval in which most of the probability indicated by the labeling labels falls. And then, carrying out identification processing on each feature sample in the ith feature sample group through a first identification module to obtain an identification tag corresponding to each feature sample. And if the probability indicated by the identification tag is in the determined probability interval, determining the corresponding characteristic sample as a reference sample. For example, there are 100 feature samples in the first feature sample set, wherein the probability of labeling of 78 feature samples is between 82% and 92%, the probability of labeling of 7 feature samples is between 93% and 100%, the probability of labeling of 6 feature samples is between 72% and 81%, and the probability of labeling of 4 feature samples is between 60% and 71%. Then, the probability interval can be determined to be 82% to 92%.

S303, training a first recognition module of the target recognition model by adopting a reference sample containing the pseudo tag and a characteristic sample in the (i+1) th characteristic sample group.

In the embodiment of the application, the target recognition model comprises a first recognition module and a second recognition module, and the first recognition module and the second recognition module are deep neural networks (namely, the neural networks consisting of multiple layers of neural networks), so that the first recognition module and the second recognition module can be suitable for training through a deep learning algorithm such as semi-supervised continuous learning and anti-migration learning.

In addition, the specific way of training the first recognition module of the target recognition model by using the reference sample containing the pseudo tag and the feature sample in the (i+1) th feature sample group may be: and adopting a first identification module to identify any reference sample containing the pseudo tag, and obtaining the identification tag of the corresponding reference sample. The identification process comprises the following steps: performing feature extraction on any reference sample containing the pseudo tag to obtain reference features of the corresponding reference sample; and performing classification recognition processing on any reference sample to obtain the recognition tag of the corresponding reference sample. Then, determining the matching degree of the identification label of each reference sample and the pseudo label of the corresponding reference sample, and determining that the training of the first identification module is completed when the matching degree meets the target matching degree; and when the matching degree does not meet the target matching degree, training the first recognition module of the target recognition model by adopting the feature samples in the (i+1) th feature sample group until the training of the first recognition module is completed.

It should be noted that, the first recognition module is a deep neural network, and in the process of extracting features through the deep neural network, features extracted by each network layer in the deep neural network are a process from shallow layer to deep layer, wherein shallow layer features generally refer to features of local details; further processing of the shallow features may result in middle layer features, which are typically referred to as features of a partial local structure; further processing of the mid-level features may result in deep features, which are often referred to as features of the overall abstract information. Because the shallow layer features and the middle layer features are local features and the deep layer features are integral features, the features extracted by the first recognition module can be called network layer or global features of the deep layer features.

The identification tag is used for indicating the probability that the object indicated by the corresponding reference sample is the target business transfer electronic resource. Since the pseudo tag is also used for indicating the probability that the object indicated by the corresponding reference sample is the target service transfer electronic resource, the obtaining the matching degree of the identification tag of each reference sample and the pseudo tag of the corresponding reference sample may be: and calculating the difference between the identification label of each reference sample and the pseudo label of the corresponding reference sample, and obtaining the matching degree of the identification label of each reference sample and the pseudo label of the corresponding reference sample according to the calculated difference. The target matching degree and the matching degree may be specific values capable of indicating the degree, such as 89%, 0.8, etc., or words indicating the degree, such as matching, very matching, comparatively matching, mismatch, etc., and are not limited herein.

For example, the target matching degree may be set to 80% in advance; and when the difference is less than or equal to 0.02, the matching degree is 95%; when the difference is less than or equal to 0.05 and greater than 0.02, the matching degree is 80%; when the difference is less than or equal to 0.1 and greater than 0.05, the matching degree is 65%; when the difference is greater than 0.1, the degree of matching is 0%. The probability of the identification tag indication of the reference sample a is 0.85, the probability of the pseudo tag indication is 0.79, and the matching degree of the identification tag and the pseudo tag of the reference sample a is 65% because the difference between the identification tag and the pseudo tag of the reference sample a is 0.6, so the matching degree of the reference sample a does not satisfy the target matching degree.

In addition, since the above-mentioned semi-supervised continuous learning has multiple rounds of training, each round of training adds new unlabeled samples, the ith feature sample set of the second feature sample set used in step S302 corresponds to the ith round of training representing the semi-supervised continuous learning, and the (i+1) th feature sample set used in step S303 corresponds to the (i+1) th round of training representing the semi-supervised continuous learning. Then, training the first recognition module of the target recognition model by using the feature samples in the (i+1) th feature sample group until the training of the first recognition module is completed may be: taking the (i+1) th characteristic sample set as the (i) th characteristic sample set, and executing steps S302 to S303 until training of the first recognition module is completed.

S304, in the process of training the first recognition module, performing anti-migration training on the second recognition module of the target recognition model based on the feature sample with the labeling label and the feature sample without the labeling label and the feature sample with the second feature sample set, wherein the feature sample is contained in the first feature sample set, and the feature sample is not contained in the second feature sample set, until the trained target recognition model is obtained.

In an embodiment of the present application, the second recognition module of the object recognition model may include a feature extractor (may also be referred to as a generator), a classifier, and a discriminator. Then the specific process of the anti-migration training may be: firstly, extracting features of each feature sample in a first feature sample set by adopting a feature extractor in a second recognition module to obtain target features corresponding to each feature sample in the first feature sample set, and determining that a target source of the target features corresponding to each feature sample in the first feature sample set is the first feature sample set; and processing each feature sample in the second feature sample set to obtain target features corresponding to each feature sample in the second feature sample set, and determining target sources of the target features corresponding to each feature sample in the second feature sample set as the second feature sample set.

Then, a discriminator in the second recognition module is adopted to discriminate the source of each target feature, and a label discrimination result corresponding to each target feature is obtained; the label discrimination result corresponding to any target feature is used for indicating that the source of any target feature is a first feature sample set or a second feature sample set. Meanwhile, a classifier in the second recognition module is adopted to classify each feature sample in the first feature sample set, so that recognition tags corresponding to each feature sample in the first feature sample set are obtained; the classification identification label corresponding to any one of the first feature samples is used for indicating the probability that the object indicated by the corresponding feature sample is the target business transfer electronic resource. And finally, training the second recognition module through the target source of each target feature, the label discrimination result corresponding to each target feature, the recognition label and the labeling label corresponding to each feature sample in the first feature sample set, and obtaining a trained second recognition module.

Based on the model training scheme and the model training system, the embodiment of the application provides another model training method. Referring to fig. 4, a flow chart of another model training method according to an embodiment of the present application is shown. The model training method shown in fig. 4 may be performed by the server or the terminal device shown in fig. 1. The model training method shown in fig. 4 may include the steps of:

s401, a first characteristic sample set and a second characteristic sample set are acquired.

In an embodiment of the present application, the first feature sample set includes at least one feature sample having a label tag. The second feature sample set comprises T feature sample groups; wherein, a feature sample group contains at least one feature sample without label, and T is an integer greater than or equal to 1. The specific meaning of the feature sample and the label may be referred to the description of step S301, which is not repeated here.

Since one feature sample refers to the object feature of the corresponding object, the first feature set and the second feature set are both composed of feature samples, except that the feature samples in the first feature set contain labeling tags. Thus, acquiring any feature sample set is acquiring a feature sample, that is, acquiring a feature of the object; then the specific way to obtain either feature set may be: firstly, acquiring image data of a target object, and judging the target object based on the image data of the target object to obtain an object judging result; then, when the object discrimination result indicates that the target object is a normal object, historical behavior data of the target object is obtained, and portrait features and business features of the target object are built based on the historical behavior data; and finally, carrying out coding processing and splicing processing on the portrait features and the business features to obtain target object features of the target object, and taking the target object features as a feature sample.

The target object may include one or more of an object that uses or is using a target service, an object that uses or is using an associated service of the target service, and an object that uses or is using another service of the same type as the target service, and the like, which is strongly related to the target service.

For example, assuming that the target service is a dedicated conference room in an online conference APP, the dedicated conference room can provide a better audio-video experience when the online conference, accommodate more people and conferences, etc. The associated service of the exclusive meeting room can be other products or services in the online meeting APP or products or services in other APP which are the same as the developer of the online meeting APP; other services of the same type as the service of the dedicated conference room may be services in other APPs of which the service type is office or web conference. Then the target object may be an object that purchased or experienced a dedicated conference room service of the online conference APP; the method can also be an object of products or services such as a common conference room and one-to-one video in the online conference APP, or can be an object of products or services in other APP developed by a developer who uses or is using the online conference APP; but also the object of the service in other APPs that have used or are using the type of service as office or web conference. The target object may be obtained by screening according to human experience, or may be obtained by screening according to a service logic rule set by human, which is not limited.

Meanwhile, the portrait data comprises non-privacy behavior data of the target object in the target service or the related service of the target service. The portrait data is composed of data of a plurality of portrait types. For example, when the target service is an online conference APP, the portrait data may include whether the target object installs the online conference APP, whether the target object transfers electronic resources for other services in the online conference APP except for dedicated conference rooms (i.e., whether the target object purchases other products or services in the online conference APP), and whether the target object installs three portrait types of data of an office APP developed by a developer T of the online conference APP. Wherein, the data of whether the portrait type target object installs the online conference APP may be 1 (for indicating that the target object installs the online conference APP) or 0 (for indicating that the target object does not install the online conference APP). The other image types are the same and are not described in detail herein.

Further, the object discrimination result is used to indicate that the target object is a normal object or an abnormal object. In a real service scenario, there may be a case where an object using a service is a false object (i.e., an abnormal object), for example, a code may be used to manipulate a terminal device to download an APP, play a game, and the like. Therefore, in order to avoid that the object features of the abnormal object affect the recognition accuracy of the trained target recognition model, the abnormal object can be removed when the feature sample is acquired.

Specifically, the specific way to obtain the object discrimination result may be to perform discrimination processing on the object based on the image data of the object: acquiring normal data intervals of various image types; if the error between the data of any image type in the image data of the target object and the normal data interval of the corresponding image type is larger than or equal to a preset error, generating an object judging result for indicating that the target object is an abnormal object; if the error between any image type data in the image data of the target object and the normal data interval of the corresponding image type is smaller than the preset error, or the data of any image type in the image data of the target object is in the normal data interval of the corresponding image type, generating an object judging result for indicating that the target object is a normal object.

Specifically, the normal data interval and the preset error of any image type may be set manually according to data statistics experience, or may be set by an electronic device in the model training system, which is not limited herein. In a specific implementation, the process of obtaining the object discrimination result adopts the Laida criterion. The specific process of the 'Laida criterion' comprises the following steps: it is assumed that a group of detection data only contains random errors, standard deviation is obtained by calculating the detection data, a section is determined according to a certain probability, and the error exceeding the section is considered to be not random errors but coarse errors, and the data containing the error should be removed.

For example, when the traffic usage time of the target object is 23 hours and the image type is the traffic usage time, the traffic usage time of the plurality of normal objects of the online conference APP may be processed, and the time interval of the calculated normal traffic usage time is 0.5 hours to 4 hours, and the preset error (i.e., standard deviation) of the traffic usage time is 0.2 hours. Since the error between 23 hours and 4 hours is much greater than 0.2 hours, it can be determined that the target object is an abnormal object.

In addition, the historical behavior data of the target object may include one or more data related to the behavior of the target object, such as object attribute data of the target object, and business interaction data of the target object and the target business, an associated business of the target business, or other businesses of the same business type of the target business. Wherein the object attribute data refers to data related to basic information and usage habit of the object.

Specifically, the specific way to construct the portrait features and business features of the target object based on the historical behavior data may be: constructing portrait features of the target object based on object attribute data of the target object; and constructing the business characteristics of the target object based on the business interaction data of the target object and the target business, the business interaction data of the related business of the target business and the business interaction data of other businesses with the same business type as the target business.

Specifically, the portrait features are obtained by analyzing object attribute data, and the business features are obtained by analyzing related business interaction data. The portrait characteristics may include one or more of object base attributes, device base attributes, network connection attributes, and the like. For example, the object basic attribute may include an attribute related to the object's own information such as the sex of the object, the age of the object, and the like; the device base attributes may include the brand, model, price, etc. of the terminal device used by the object, and the attributes related to the terminal device used by the object; the network connection attribute may include an attribute related to a terminal device connection network of the object, such as the number of times the terminal device of the object is connected to a certain WiFi, the network speed of the network to which the terminal device of the object is connected, and the like.

The service characteristics refer to the service behavior characteristics of the object in the target service, the associated service of the target service and other services with the same service type as the target service. For example, the business characteristics may be click-through rate and conversion rate of the target object to advertisements for other businesses of the business type with the target business.

Meanwhile, the specific process of encoding and splicing the portrait features and the business features to obtain the object features of the target object can be as follows: respectively carrying out coding processing on the portrait features and the service features to obtain coded portrait features and coded service features; and then, splicing the coded image features and the coded service features to obtain object features of the target object.

In particular, specific modes of the Encoding process may include single-value Encoding (One-Hot Encoding), count Encoding (Count Encoding), and type-induction Encoding (Consolidation Encoding).

For some features which can be distinguished by numerical values, preferential total energy coding can be performed by a single-value coding mode. For example, for the portrait feature of the sex attribute, since the sex of the object is both male and female, the sex attribute may be encoded by a single value encoding method, and when the sex of the object is male, the sex attribute of the object may be encoded as (1, 0); when the sex of the object is female, the sex attribute of the object may be encoded as (0, 1).

For the features requiring the calculation times, the features requiring the calculation times are coded in a counting coding mode so as to reflect the interest degree of the object in something. For example, for a wireless network connection interest feature (i.e., wi-Fi POI), since the more times an object connects to a wireless network, the more interest and attention the object has in the wireless network, the feature may be encoded in a number encoding manner to reflect the object's interest in connecting to the wireless network. For example, the feature of "food-a country dish" may be encoded as 3 since the target object has eaten 3 canteen dishes within a week.

As for the features that can be generalized into one category, they can be generalized into one feature and encoded. Illustratively, the system version features of the android mobile phone include "4.2", "4.4", and "5.0", and these three features can be generalized to one system feature of "low-version android system" based on experience; meanwhile, the system characteristic of the 'low-version android system' can be set to be used for indicating '4.2' when the value is 1, 4.4 when the value is 2 and 5.0 when the value is 3. In a specific implementation, many features are too fine, which brings difficulty to fusion and processing of the features, and the Consolidation Encoding coding mode brings more forward benefits than direct single-value coding of the features in some cases.

Optionally, due to the above mentioned feature of wireless network connection interest, feature of equipment brands and the like, and the feature of the target object such as click rate, conversion rate and the like of advertisements of other services of the service type of the target service, the feature values in different time periods are different. In order to more fully analyze the portrait features and business features of the object, the portrait features and business features may be defined from different time dimensions, and thus both portrait features and business features may be features that include one or more time dimensions. Then, the specific way to obtain the object features of the target object by performing encoding processing and splicing processing on the portrait features and the service features may be: respectively carrying out coding processing on the portrait features and the business features in different time dimensions to obtain coding features; and performing splicing treatment on the coding features, and taking the spliced coding features as target object features of the target object.

In particular, different time dimensions refer to different time periods, such as 4 days, one week, half month, four different time dimensions for one year. The specific way of respectively coding the portrait features and the business features in different time dimensions to obtain the coding features can be as follows: carrying out aggregation treatment on feature values of the portrait features of each time dimension to obtain a first aggregation feature; and carrying out aggregation treatment on the business characteristics of each time dimension to obtain a second aggregation characteristic. And respectively encoding the first aggregation feature and the second aggregation feature to obtain encoding features. The specific mode of the polymerization treatment can be as follows: the feature values of the portrait features or the business features of each time dimension are spliced, and one or more of the sum, the median or the standard deviation of the feature values of the portrait features of each time dimension are calculated to obtain the aggregation features of the corresponding portrait features or business features.

In a specific implementation, referring to fig. 5, a schematic diagram of a coding process is shown. The image feature a is defined from four time dimensions of half a year from the current time, 3 months from the current time, 1 month from the current time, and 7 days from the current time, with the same color boxes representing the same time dimension. As shown by vector 501, feature values of portrait feature a in the time dimension half a year from the current time include 1, 2, 3, 0; the characteristic values in the time dimension 3 months from the current time include 0,3, 7, 2; the characteristic values in the time dimension of 1 month from the current time include 0,3, 0, 4; the feature values in the time dimension 7 days from the current time include 5, 1, 0, 4. Then, vector 501 is averaged and pooled, i.e., an average of all feature values in each time dimension is calculated, and finally, a first aggregate feature of portrait feature a, i.e., vector 502, is obtained.

Optionally, after the object features of the target object are obtained, the object features of the target object may be stored offline in the HDFS (The Hadoop Distributed FILE SYSTEM), so that the object features may be quickly accessed and obtained during subsequent model training. In a specific implementation, since the object features are encoded features obtained by encoding a plurality of portrait features and business features, for each object, the object features of the object may be a multi-dimensional numeric vector, such as (1,0,31,4,0.2,9.3,8.8, …,0,0,1,2,34).

S402, obtaining the feature value of each feature sample in the ith feature sample group of the second feature sample set, and screening out a representative sample from the ith feature sample group based on the feature value.

In the embodiment of the application, the characteristic value refers to the contribution value of the characteristic sample to the training target. Specifically, the training target of the first recognition module is the recognition label obtained by performing recognition processing on the feature sample, and the difference between the recognition label and the labeling label of the feature sample is as small as possible. Then, the first recognition module can be assisted in correcting errors by repeatedly recognizing the feature samples which are easy to recognize the errors by the first recognition module, so that the recognition accuracy is improved; and the first recognition module is helped to distinguish similar characteristic samples by repeatedly recognizing the samples with smaller differences from other characteristic samples, so that the recognition accuracy is improved. In a specific implementation, the feature value may be represented by information entropy and a distance between each feature sample and other feature samples (i.e., a clustering degree of the feature samples).

Specifically, based on the feature value, the manner of screening the representative sample from the ith feature sample group may be: and carrying out information entropy calculation on each characteristic sample in the ith characteristic sample group of the second characteristic sample set to obtain information entropy corresponding to each characteristic sample, and taking one or more characteristic samples with the corresponding information entropy larger than a target entropy threshold value in the ith characteristic sample group as initial reference samples. And then, performing distance calculation on each selected initial reference sample, and performing clustering operation on one or more obtained initial reference samples based on the calculated distances. And finally, determining a representative sample of the ith characteristic sample group from one or more initial reference samples according to the result of the clustering operation.

Wherein, the lower the information entropy, the higher the information purity (i.e., the higher the accuracy of the information), and the lower the information purity (i.e., the lower the accuracy of the information). Therefore, the higher the feature value of the feature sample with higher information entropy is, the more the initial reference sample can be screened out by comparing whether the corresponding information entropy is larger than the target entropy threshold value.

In addition, according to the shannon formula, it can be known that the information entropy needs to be obtained through probability calculation, so that a first recognition module needs to be adopted to perform recognition processing on each feature sample in the ith feature sample group of the second feature sample set, and a recognition tag corresponding to each feature sample in the ith feature sample group (namely, the probability that an object indicated by the corresponding feature sample is a target business transfer electronic resource) is obtained. Then, the specific way to calculate the information entropy of each feature sample in the ith feature sample group of the second feature sample set to obtain the information entropy corresponding to each feature sample may be: and obtaining a shannon formula, and respectively solving the shannon formula by adopting identification labels corresponding to the characteristic samples in the ith characteristic sample group to obtain information entropy corresponding to the characteristic samples in the ith characteristic sample group. The solution process of shannon formula is a technical means commonly used by those skilled in the art, and is not described herein.

In addition, since the recognition processing of the first recognition module in step S303 includes feature extraction and classification recognition processing, global features corresponding to each feature sample are obtained during feature extraction, and the distance between the samples can be represented by the distance between the features.

Then, performing distance calculation on each selected initial reference sample, and performing clustering operation on one or more obtained initial reference samples based on the calculated distances, where a specific manner of obtaining a clustering operation result may be: and calculating the distance between the global features corresponding to any two initial reference samples. The cut-off distance is obtained by continuously updating the initial cut-off distance in the clustering operation process, and the initial cut-off distance can be a manually set distance. Then, calculating the local density and the relative distance of each initial reference sample according to the cut-off distance; finally, clustering is performed according to the local density and the relative distance of each initial reference sample, so as to obtain one or more clustering centers (namely, the result of clustering operation can be also called as local cluster center). Thus, the specific manner of determining the representative sample of the ith feature sample group from one or more initial reference samples based on the result of the clustering operation may be: the initial reference sample indicated by the cluster center is taken as a representative sample.

Wherein the local density of any one of the initial reference samples is obtained from the truncated distance and the distance between any one of the initial reference samples and each of the initial reference samples other than any one of the initial reference samples; the relative distance of any initial reference sample refers to the minimum distance between any initial reference sample and a target reference sample, and the target reference sample refers to an initial reference sample having an initial local density that is greater than the local density of any initial reference sample. In a specific implementation, the specific algorithm for screening representative samples according to the information entropy of each feature sample in the feature sample set and the result of the clustering operation of the feature sample set may be an entropy density peak clustering algorithm (Entropyand DENSITY PEAKS Cluster, EDPC), which is a technical means commonly used by those skilled in the art and is not described herein.

S403, acquiring an alternative sample set.

In the embodiment of the application, the alternative sample set comprises the feature samples without labels in the first i feature sample groups. Wherein the first i feature sample groups at least comprise the i-th feature sample group. That is, the first i feature sample groups may include only the i-th feature sample group. Alternatively, the first i feature sample groups may also include the ith feature sample group, and other feature sample groups of the T feature sample groups for which reference sample screening has been completed.

Wherein, the representative sample screened in step S402 is subsequently used as a reference sample, and the corresponding pseudo tag is determined. Thus, when the i-th feature sample group includes only the i-th feature sample group, the feature samples in the i-th feature sample group, for which no label exists, refer to feature samples in the i-th feature sample group other than the feature samples screened as representative samples. When the i-th feature sample group includes the i-th feature sample group and other feature sample groups of the T feature sample groups for which reference sample screening has been completed, the feature samples of the i-th feature sample group for which no tag exists refer to feature samples other than the feature samples of the i-th feature sample group that are screened as representative samples.

S404, screening similar samples of the representative samples from the characteristic samples in the alternative sample set according to the characteristic samples in the first characteristic sample set and the sample distances between the characteristic samples in the alternative sample set and the representative samples.

In the embodiment of the present application, according to each feature sample in the first feature sample set and the sample distance between each feature sample in the candidate sample set and the representative sample, a specific manner of screening similar samples of the representative sample from each feature sample in the candidate sample set may be: taking each characteristic sample in the first characteristic sample set and the representative sample screened from the ith characteristic sample set as a new labeling sample set; selecting a neighbor sample set of any characteristic sample in the alternative sample set from the new labeling sample set; when the representative sample exists in the neighbor sample set, any characteristic sample is taken as a similar sample of the corresponding representative sample.

The specific way to select the neighbor sample set of any feature sample in the candidate sample set from the new labeling sample set may be:

1) A target sample set is determined. The target sample set comprises each characteristic sample in the new labeling sample set and each characteristic sample in the alternative sample set.

2) And performing distance calculation on each characteristic sample in the target sample set, and performing clustering operation on each characteristic sample in the target sample set based on the obtained distance to obtain the local density and the local cluster center corresponding to each characteristic sample in the target sample set.

Wherein, the local cluster center of any characteristic sample refers to the cluster center. It should be noted that, the specific embodiment of obtaining the local density and the clustering center corresponding to each feature sample through the clustering operation may refer to the specific embodiment of obtaining the local density and the clustering center through the clustering operation in step S402, which is not described herein.

3) And determining the characteristic samples in the new labeling sample set, which have the same local cluster center as the characteristic samples in the candidate sample set, as candidate neighbor samples of the corresponding characteristic samples in the candidate sample set, so as to obtain candidate neighbor sample sets corresponding to the characteristic samples in the candidate sample set.

Optionally, if the number of feature samples in the new labeling sample set, which have the same local cluster center as any feature sample in the candidate sample set, is less than the preset number, the feature sample in the new labeling sample set, which is not selected as a candidate neighbor sample, and the feature sample closest to the any feature sample in the candidate sample set, is added as a candidate neighbor sample to the candidate neighbor sample set of the any feature sample in the candidate sample set, until the number of candidate neighbor samples in the candidate neighbor sample set of the any feature sample in the candidate sample set reaches the preset number. The preset number may be set manually or may be set by an electronic device in the model training system, which is not limited herein. Illustratively, the preset number is a positive integer, which may be 20, 85, 100, etc.

4) Acquiring a sparse representation coefficient of each candidate neighbor sample in any candidate neighbor sample set and a sparse representation threshold of any candidate neighbor sample set; and determining that the candidate neighbor samples with the sparse representation coefficients larger than the corresponding sparse representation threshold value in any candidate neighbor sample set are neighbor samples, so as to obtain a neighbor sample set of corresponding feature samples in the candidate sample set.

The sparse representation coefficient is used for indicating the association degree between the corresponding candidate neighbor sample and the corresponding characteristic sample in the candidate sample set. The sparse representation threshold of any candidate neighbor sample set refers to a mean value of sparse representation coefficients calculated according to the sparse representation coefficient of each candidate neighbor sample in any candidate neighbor sample set.

Specifically, the specific way to obtain the sparse representation coefficient of each neighbor sample in the neighbor sample set may be: acquiring a feature vector corresponding to each neighbor sample in any candidate neighbor sample set, and determining a target expression for carrying out vector representation on the corresponding feature sample in the candidate sample set by adopting the feature vector corresponding to each neighbor sample in any candidate neighbor sample set; the target expression comprises sparse representation coefficients to be solved of each neighbor sample. And then, obtaining a target loss function for solving the target expression, and solving the target expression by adopting the target loss function to obtain a sparse representation coefficient of each neighbor sample. In a specific implementation, a neighbor sample set of each feature sample may be screened by a neighbor search algorithm.

For example, candidate neighbor sample set N _i of any of the feature samples x _i of the candidate sample set includes K candidate neighbor samples N _i,j; wherein n _i,j∈N_i, j is less than or equal to [1, K ]. Then, x _i may perform sparse representation through each candidate neighbor sample in the candidate neighbor sample set N _i and the corresponding sparse representation coefficient, where the sparse representation formula (i.e. the above-mentioned target expression) is shown in formula (1-1):

x_i＝X_iβ_i (1-1)

Wherein ,X_i＝[n_i,1,n_i,2,…,n_i,K],β_i＝[β_i,1,β_i,2,…,β_i,K]^T, is the sparse representation coefficient of n _ij in any candidate neighbor sample is β _i,j. Since both the feature sample x _i and the candidate neighbor sample n _i,j are feature vectors, x _i＝X_iβ_i refers to that the candidate neighbor sample n _i,j can be converted into the feature sample x _i by multiplying it by its sparse representation coefficient β _i,j.

Meanwhile, the formula of the target loss function for solving the target expression is shown as formula (1-2):

Where λ is a hyper-parameter (i.e., a parameter that needs to be manually configured, a parameter that is not learned by model training). After obtaining the sparse representation coefficient β _i,j of each candidate neighbor sample N _i,j in the candidate neighbor sample set N _i, the sparse representation threshold R of the candidate neighbor sample set N _i may be calculated according to the sparse representation coefficient of each candidate neighbor sample in the candidate neighbor sample set N _i, where the calculation formula (1-3) is as follows:

Wherein, Β _i,j refers to the sparse representation coefficients corresponding to each candidate neighbor sample N _i,j in the candidate neighbor sample set N _i, and β _i refers to the set of sparse representation coefficients corresponding to all candidate neighbor samples N _i,j in the candidate neighbor sample set N _i, thus β _i,j∈β_i. If the sparse representation coefficient β _i,j > R of a certain candidate neighbor sample, the candidate neighbor sample may be determined as a neighbor sample, so as to obtain a neighbor sample set T _i.

S405, taking the representative sample and the similar sample as screened reference samples, and determining a pseudo tag corresponding to each reference sample.

In the embodiment of the present application, when the reference samples are representative samples, a specific manner of determining the pseudo tag corresponding to each reference sample may be: and responding to the labeling operation of each representative sample, and acquiring the pseudo label of each representative sample. That is, the pseudo tag representing the sample is a tag obtained by manually labeling the representative sample.

When the reference samples are similar samples, the specific manner of determining the pseudo tag corresponding to each reference sample may be: acquiring a sparse representation coefficient of each neighbor sample in a neighbor sample set; the sparse representation coefficient is used for indicating the association degree between the corresponding neighbor sample and the corresponding characteristic sample in the alternative sample set. And then, weighting processing is carried out by adopting sparse representation coefficients corresponding to each neighbor sample in the neighbor sample set, and a pseudo tag of any candidate characteristic sample which is used as a similar sample is generated according to the result of the weighting processing.

The specific embodiment of acquiring the sparse representation coefficient of each neighbor sample in the neighbor sample set may refer to the specific embodiment of acquiring the sparse representation coefficient of each candidate neighbor sample mentioned in step S404, which is not described herein.

In addition, because each neighbor sample in the neighbor sample set is screened from the new labeled sample set, each neighbor sample is a feature sample that contains a label. Then, the specific manner of performing weighting processing by using the sparse representation coefficient corresponding to each neighbor sample in the neighbor sample set and generating the pseudo tag of any feature sample serving as the similar sample according to the result of the weighting processing may be: obtaining labels corresponding to all neighbor samples in a neighbor sample set; multiplying the probability indicated by the label corresponding to the corresponding neighbor sample according to the sparse representation coefficient corresponding to each neighbor sample to obtain the weighted probability corresponding to each neighbor sample; then, the weighted probabilities corresponding to the neighbor samples in the neighbor sample set are added to obtain a total probability (i.e., a result of the weighting process). Finally, a pseudo tag (i.e., a pseudo tag of any feature sample that is taken as a similar sample) is generated that indicates the total probability. In particular implementations, the algorithm to screen out similar samples representing samples from the set of candidate samples and to determine similar samples may be an active learning enhancement algorithm (Local Representation Coefficient Based ACTIVE LEARNING ENHANCEMENT, LRCBALE) based on local identification coefficients.

For example, referring to the example in step S404, the pseudo tag p_y _i of any of the feature samples that are taken as similar samples can be obtained by the formula (2-1):

Wherein c refers to the feature sample selected as the representative sample in the ith feature sample group, Refers to the sparse representation coefficients of each neighbor sample in the neighbor sample set corresponding to the feature sample of the similar sample selected as c in the candidate feature set.

S406, training a first recognition module of the target recognition model by adopting a reference sample containing the pseudo tag and a characteristic sample in the (i+1) th characteristic sample group.

In the embodiment of the present application, the specific way to train the first recognition module of the target recognition model by using the reference sample including the pseudo tag and the feature sample in the (i+1) th feature sample group may be: a first identification module of the target identification model is adopted to carry out identification processing on any similar sample containing the pseudo tag, and an identification tag of the corresponding similar sample is obtained; acquiring the matching degree of the identification label of each similar sample and the pseudo label of the corresponding similar sample, and determining to finish training of the first identification module when the matching degree meets the target matching degree; and when the matching degree does not meet the target matching degree, training the first recognition module of the target recognition model by adopting a representative sample containing the pseudo tag and a characteristic sample in the (i+1) th characteristic sample group until the training of the first recognition module is completed.

Wherein, when the number of the similar samples is plural, the target matching degree refers to the target duty ratio of the similar sample in which the identification tag matches the pseudo tag among the plural similar samples. The specific way to obtain the matching degree of the identification tag of each similar sample and the pseudo tag of the corresponding similar sample may be: the method comprises the steps of obtaining the duty ratio of a similar sample matched with the pseudo tag by the identification tag in a plurality of similar samples. Meanwhile, the specific mode for determining whether the matching degree meets the target matching degree is as follows: when the obtained duty ratio is greater than or equal to the target duty ratio, determining that the matching degree meets the target matching degree; and when the obtained duty ratio is smaller than the target duty ratio, determining that the matching degree does not meet the target matching degree.

In a specific implementation, referring to fig. 6, a schematic diagram of a semi-supervised continuous training process is shown. Acquiring a first feature sample set(I.e., a labeled sample set consisting of a plurality of feature samples containing labeling labels), and a second feature sample set D _U (i.e., an unlabeled sample set) comprising T feature sample sets, any feature sample set comprising a plurality of feature samples. At the same time, the number of active learning samples C (i.e., the number of representative samples described above) and the training threshold H are also required to be acquired. Wherein, C and H are both superparameters.

In training round 1 (i=1 at this time), as shown in fig. 6, a first feature sample set needs to be used firstThe first recognition module M ₁ is obtained by performing supervised training on the initialized first recognition module M ₀ (i.e., the untrained deep neural network). Since M ₀ needs to pair/>, with supervised trainingExtracting the characteristics of each characteristic sample to obtainGlobal features of each feature sample in (a); meanwhile, it can be represented byGlobal features of each feature sample in the training set constitute a labeling feature set/>, of the 1 st training setThen, a feature sample group is selected from the second feature sample set D _U as the 1 st feature sample group D ₁, and the first recognition module M ₁ is used to perform recognition processing on each feature sample in D ₁, so as to obtain global features and recognition tags of each feature sample in D ₁. Wherein the global features of each feature sample in D ₁ constitute the 1 st unlabeled feature set F ₁.

After obtaining F ₁, an entropy density peak clustering algorithm may be used to process F ₁ to screen C active learning samples from the first feature sample set D ₁ corresponding to F ₁ (i.e., the representative samples described above, since the representative samples containing pseudo labels later may be reused as labeling samples added to the history labeling sample set, the representative samples may also be referred to as active learning samples), thereby obtaining an active learning sample set for the first trainingWherein, actively learn sample setGlobal features of each active learning sample in the model (1) active learning feature setAt the time of obtainingAfterwards, the/> -can be obtained by means of manual labelingPseudo tags of the active learning samples.

Then, can be used forEach feature sample containing a label in the system and the method, and an active learning sample setEach active learning sample containing the pseudo tag forms a new labeling sample set of the 1 st training; meanwhile, each feature sample in the first feature sample set D ₁ may be subtracted by the active learning sample setThe residual characteristic samples obtained by the active learning samples in the training process form an alternative sample set of the 1 st round of training. Wherein, the feature set corresponding to the new labeling sample set of the 1 st training is thatFeature set corresponding to alternative sample set of 1 st training is

Thereafter, as shown in FIG. 6, an active learning enhancement algorithm based on local identification coefficients may be employed to perform the following Processing to screen/>, from each feature sample in the candidate sample setActively enhanced learning samples (i.e., similar samples representing the samples) and determining pseudo tags for the actively enhanced learning samples. The process can be as follows: and determining a neighbor sample set (namely the neighbor sample set) of each feature sample in the alternative sample set of the 1 st training from the feature samples of the new labeling sample set of the 1 st training, and a sparse representation coefficient corresponding to each neighbor sample (namely the neighbor sample) in the neighbor sample set. If the neighbor sample set of any of the feature samples in the candidate sample set includesAny of the feature samples in the candidate sample set may be determined to be an active reinforcement learning sample. Meanwhile, the pseudo tag of the active enhancement learning sample can be obtained according to the sparse representation coefficient and the tag corresponding to each neighbor sample in the neighbor sample set of the feature sample selected as the active enhancement sample. Note that the active reinforcement learning samples containing pseudo tags constitute the active reinforcement sample set/>, of the first training round

After obtainingThereafter, the first recognition module M ₁ pair/>, may be employedAnd (3) performing identification processing on each active reinforcement learning sample to obtain an identification tag of each active reinforcement learning sample in D ₁. If the number of the active reinforcement learning samples, which are matched with the corresponding identification tags and the corresponding pseudo tags, in the D ₁ is greater than or equal to the training threshold H, training of the first identification module is determined to be completed, and the first identification module M ₁ obtained in the round of training is taken as the first identification module of output. If the number of active reinforcement learning samples in D ₁ for which the corresponding identification tag matches the corresponding pseudo tag is less than the training threshold H, then as shown in fig. 6, the And adding the training data to the historical characteristic sample set, and starting the training of the 2 nd round. Because All of the samples containing the labels, so the feature samples in the history feature sample set are all feature samples containing the labels.

In the ith training (where i is 2 or more), as shown in FIG. 6, a plurality of feature samples are selected from the historical feature sample set to obtain a subset of the ith trainingThen, active learning sample set/>, screened from the previous training round (i.e. the i-1 training round) of the i-th training roundSubsetLabeling sample set/>, constituting ith training round

By usingPerforming supervised training on the first identification module M _i-1 obtained by the i-1 th round training to obtain a first identification module M _i, andGlobal features of each feature sample in (a); meanwhile, it can be represented byGlobal features of each feature sample in the training set constitute a labeling feature set/>, of the ith training setThen, a feature sample group is selected from the second feature sample set D _U to be used as an ith feature sample group D _i, and the first recognition module M _i is adopted to perform recognition processing on each feature sample in D _i, so as to obtain global features and recognition tags of each feature sample in D _i. Wherein the global features of each feature sample in D _i constitute the i-th unlabeled feature set F _i.

After F _i is obtained, an entropy density peak clustering algorithm can be adopted to process F _i to screen C active learning samples from D _i, thereby obtaining an active learning sample set of the ith trainingWherein, actively learn sample setGlobal features of the active learning sample constitute the i active learning feature setAt the time of obtainingAfterwards, the/> -can be obtained by means of manual labelingPseudo-labels for each feature sample in (a).

The samples in the history labeling sample set containing the labels can then be combined with the active learning sample setEach active learning sample containing pseudo tag in the training set constitutes a new labeling sample set/>, of the ith training round At the same time, each characteristic sample in each alternative sample set in the previous i-1 round of training can be processed, andSubtracting the active learning sample set/>, from each feature sample in (a)The residual characteristic samples obtained by each active learning sample in the training set form an alternative sample set of the ith training, namelyWherein, the feature set corresponding to the new labeling sample set of the ith training isThe feature set corresponding to the alternative sample set of the ith training is as follows

Thereafter, as shown in fig. 6, an active learning enhancement algorithm based on the local identification coefficient may be used to process the feature set corresponding to the new labeling sample set of the ith training round and the feature set corresponding to the candidate sample set of the ith training round, so as to screen out the active enhancement learning samples (i.e. similar samples representing samples) of each active learning sample in the new labeling sample set of the ith training round from each feature sample in the candidate sample set of the ith training round, and determine the pseudo labels of the active enhancement learning samples. Wherein the active enhancement learning samples containing the pseudo tag form an active enhancement sample set of the ith training round

After obtainingThereafter, the first recognition module M _i pair/>, may be employedAnd (3) performing identification processing on each active reinforcement learning sample to obtain an identification tag of each active reinforcement learning sample in D _i. If the number of the active reinforcement learning samples, which are matched with the corresponding identification tags and the corresponding pseudo tags, in the D _i is greater than or equal to the training threshold H, training of the first identification module is determined to be completed, and the first identification module M _i obtained in the round of training is taken as the first identification module of output. If the number of active reinforcement learning samples in D ₁ for which the corresponding identification tag matches the corresponding pseudo tag is less than the training threshold H, then as shown in fig. 6, the And adding the training data to the historical characteristic sample set, and starting the (i+1) th training.

S407, in the process of training the first recognition module, performing anti-migration training on the second recognition module of the target recognition model based on the feature sample with the labeling label and the feature sample without the labeling label and the feature sample with the second feature sample set, wherein the feature sample is contained in the first feature sample set, and the feature sample is not contained in the second feature sample set, until the trained target recognition model is obtained.

In the embodiment of the present application, the specific process of performing the anti-migration training on the second recognition module of the target recognition model may be: firstly, acquiring a first attention characteristic of a first characteristic sample set and a second attention characteristic corresponding to a second characteristic sample of a second characteristic sample set; then, a second recognition module is adopted to determine the recognition tag of the first feature sample based on the first attention feature, and the difference between the recognition tag and the labeling tag of the first feature sample is determined; and finally, acquiring label discrimination results of the second recognition module on the first attention characteristic and the second attention characteristic, and training the second recognition module based on the difference between the label discrimination results and the determined labels.

In particular, the second recognition module may include a feature extractor, an attention enhancer, a classifier, and a arbiter. Then, a specific way to obtain the attention feature of any feature sample may be: carrying out feature extraction on any feature sample by adopting a feature extractor in the second recognition module to obtain global features of any feature sample; and adopting an attention enhancer in the second identification module to conduct feature enhancement on any feature sample to obtain the attention feature of any feature sample.

Wherein, since any feature sample is a feature vector, how many elements in each feature vector is equivalent to how many feature channels in each feature vector. Thus, a specific way to perform feature enhancement on any feature sample may be: and weighting the element values of the corresponding characteristic channels in the attention characteristic of any characteristic sample according to the channel weight of each characteristic channel to obtain the attention characteristic of any characteristic sample. The channel weights of the characteristic channels are set manually at the beginning, but the initial channel weights of the characteristic channels are updated continuously in the training process of the second recognition module. Since the attention enhancement of the features is a technical means commonly used by those skilled in the art, it is not described in detail herein.

In a specific implementation, referring to fig. 7, a schematic structural diagram of a second identification module is shown. The second recognition module includes a feature extractor (i.e., G in fig. 7), an attention enhancer (i.e., E in fig. 7), a classifier (i.e., C in fig. 7), and a arbiter (i.e., D in fig. 7). Meanwhile, an initialization parameter theta _c of the classifier, an initialization parameter theta _G of the feature extractor and an initialization parameter theta _D of the discriminator are obtained, and the number of feature samples used in each training round of the migration resistant training is batch_size. Wherein θ _c、θ_G、θ_D and batch_size are both hyper-parameters.

In each round of training of the migration countermeasure training, a batch_size feature sample containing labeling labels is selected from the first feature sample setSelecting batch_size feature samples/>, from the second feature sample setWherein i is less than or equal to batch_size, i is a positive integer, s is used for indicating that the feature sample is from a first feature sample set, and t is used for indicating that the feature sample is from a second feature sample set;

then, as shown in FIG. 7, the feature samples are taken And feature sampleRespectively input into the feature extractor G, and the feature extractor G is adopted to carry out the method on the feature samplesExtracting features to obtain features f _s; feature sample/>, using feature extractor GAnd extracting the characteristics to obtain the characteristics f _t.

After obtaining the feature f _s and the feature f _t, performing feature enhancement on the feature f _s by using an attention enhancer E to obtain a first attention feature E (f _s); and feature enhancement of feature f _t with a attention enhancer E to obtain a second attention feature E (f _t). The first attention characteristic E (f _s) is then input to the classifier C and the arbiter D, and the second attention characteristic E (f _t) is input to the arbiter D. The first attention feature E (f _s) is classified by the classifier C to obtain an identification tag (i.e. a feature sample) of the first attention feature E (f _s)Is a tag for identification). Meanwhile, the first attention feature E (f _s) and the second attention feature E (f _t) are respectively processed by adopting a discriminator D to obtain a label discriminating result (namely a feature sample) of the first attention feature E (f _s)Label discrimination results of (2).

Can pass through characteristic samplesThe difference between the labeling tag and the identification tag of the classifier C is determined, and the classifier gradient Δ _C of the classifier C is calculated as shown in formula (3-1):

Wherein, θ _c refers to the classifier parameters obtained after the last round of training is finished, and can be updated continuously along with the training of each round, θ _c＝θ_c+Δ_C.θ_G refers to the feature extractor parameters obtained after the last round of training is finished, and θ _G＝θ_G+Δ_G can be updated continuously along with the training of each round. G refers to the feature extractor and, Refers to a feature extractor versus feature sampleThe result obtained after the treatment (i.e., feature f _s). E refers to an attention enhancer,Refers to the result of the attention enhancer processing feature f _s (i.e., first attention feature E (f _s)). C refers to the classifier(s) and,Refers to the result obtained by processing E (f _s) by the classifier (i.e. feature sampleAn identification tag of (d)). /(I)Refers to the characteristic sampleIs a label. L _y refers to a cross entropy loss function for computing the feature samplesThe difference between the identification tag and the labeling tag. n=batch_size.

Then, through the feature sampleTag discrimination results of (a) feature sampleThe label discrimination result of the (3) is determined, the discriminator gradient delta _D of the discriminator D is determined, and the calculation formula is shown as the formula (3-2):

Wherein, θ _D refers to the parameters of the arbiter obtained after the last round of training is finished, and θ _D＝θ_D+Δ_D is updated continuously with each round of training. Refers to feature extractor vs. feature sampleThe result obtained after the treatment (i.e., feature f _t). /(I)Refers to the result of the attention enhancer processing feature f _t (i.e., the second attention feature E (f _t)). /(I)Refers to the result (i.e. characteristic sample) obtained by the processing of E (f _s) by the discriminatorA label discrimination result of (f)). /(I)Refers to the result obtained by the processing of E (f _t) by the discriminator (i.e. feature sampleA label discrimination result of (f)). L _s is used to indicate the feature sampleFrom the first set of feature samples (which may also be referred to as feature samplesTarget source of (a))l _t is used to indicate a feature sampleFrom a second set of feature samples (which may also be referred to as feature samplesTarget source of (c). L _d refers to the cross entropy loss function, and two L _d are used to calculate the feature samples/>, respectivelyLabel discrimination results and feature samplesDifferences in target sources of (a) and feature samplesLabel discrimination result and feature sample of (a)Is a difference in target source of (a).

Finally, the feature extractor gradient Δ _G of the feature extractor G can be calculated by the classifier gradient Δ _C of the classifier C and the discriminator gradient Δ _D of the discriminator D, the calculation formula of which is shown in formula (3-3), λ being the super parameter:

In one possible implementation, after obtaining the trained target recognition model, it is also possible to: acquiring the reference object characteristics of the reference object under the target service, and adopting a first recognition module in the trained target recognition model to recognize the reference object characteristics so as to obtain a first probability that the reference object is a target service transfer electronic resource; and identifying the characteristics of the reference object by adopting a second identification module in the trained target identification model to obtain a second probability that the reference object is the target business transfer electronic resource. And finally, carrying out weighting processing on the first probability and the second probability, and generating an object label of the reference object under the target service based on the weighted target probability.

The specific way of weighting the first probability and the second probability and generating the object label of the reference object under the target service based on the weighted target probability may be: acquiring a first weight of the first probability and a second weight of the second probability; multiplying the first probability with the first weight to obtain a first weighted probability, and multiplying the second probability with the second weight to obtain a second weighted probability; and finally, adding the first weighted probability and the second weighted probability to obtain the target probability.

The first weight and the second weight may be manually set according to the recognition accuracy of each recognition module in the trained target recognition model, or the model training system may be set according to the recognition accuracy of each recognition module in the trained target recognition model, which is not limited herein. The sum of the first weight and the second weight is equal to 1. For example, the first weight is 0.65, then the second weight is 0.35. Similarly, the first weight is 0.5, and then the second weight is 0.5. In addition, the specific meanings of the reference object and the reference object feature may refer to the specific meanings of the target object and the object feature in step S401, which are not described herein.

For example, referring to fig. 8, a schematic diagram of a training process for a target recognition model is shown. The training process of the target recognition model mainly includes four processes of sample preparation (i.e., the portion included in ① in fig. 8), feature construction (i.e., the portion included in ② in fig. 8), semi-supervised continuous training (i.e., the portion included in ③ in fig. 8), and anti-migration training (i.e., the portion included in ④ in fig. 8). Wherein the semi-supervised continuous training and the anti-migration training are performed simultaneously.

In the sample preparation stage, the labeling label and the image data of the target object are required to be acquired firstly, then the abnormal evaluation index of the target object is determined according to the image data, and finally the target object is subjected to discrimination processing according to the abnormal evaluation index of the target object, so that a discrimination result is obtained. According to the image data, an abnormal evaluation index of the target object is determined, and according to the abnormal evaluation index of the target object, the determination process is performed on the target object, that is, in step S401, whether the error between the data of any image type and the normal data interval of the corresponding image type is greater than or equal to the preset error is determined, so as to generate a specific process for indicating whether the target object is the object determination result of the abnormal object, which is not described herein. And when the judging result indicates that the target object is a normal object, entering a characteristic construction stage.

In the characteristic construction stage, firstly, historical behavior data of a target object are obtained, and portrait characteristic and business characteristic of the target object are constructed according to the historical behavior data. Then, the feature values of the portrait features in each time dimension are subjected to aggregation treatment to obtain a first aggregation feature; and carrying out aggregation treatment on the business characteristics of each time dimension to obtain a second aggregation characteristic. And finally, carrying out coding processing and splicing processing on the first aggregation feature and the second aggregation feature to obtain target object features of the target object, taking the target object features as a feature sample, and storing the feature sample in an HDFS in an off-line manner. The target object features are stored in the HDFS offline in advance, so that the target object features can be conveniently and quickly taken during model training. It should be noted that, part of the target objects can acquire the labeling labels, and the object features of such target objects can be added into the first feature sample set as feature samples; part of the target objects cannot acquire the labeling labels, and object features of the target objects can be added into the second feature sample set as feature samples.

During semi-supervised continuous training, a first feature sample set can be acquired from the HDFSAnd a second set of feature samples D _U comprising T feature sample groups. At the same time, the number of active learning samples C (i.e., the number of representative samples described above) and the training threshold H are also required to be acquired. Wherein, C and H are both superparameters. Then, byThe first recognition module M _i-1 is trained to obtain a first recognition module M _i. Meanwhile, the ith feature sample group D _i in D _U is ready, and the first recognition module M _i performs recognition processing on D _i to obtain the recognition tag corresponding to each feature sample in D _i and the unlabeled feature set F _i corresponding to D _i. Thereafter, C representative samples are selected from D _i using EDPC algorithm and the sample set/>, is addedAnd selecting/>, from the set of candidate samples, using LRCBAL algorithmEach representing a similar sample of the sample, and adding a sample setFinally, whenOutputting a first recognition module M _i as a first recognition module for completing training when the number of the recognition labels of the similar samples and the pseudo-labels of the corresponding similar samples are matched is greater than or equal to H; whenWhen the number of the matching of the identification labels of the similar samples and the pseudo labels of the corresponding similar samples is smaller than H, a history labeling sample set is added, and the next training round is started. Note that the characters mentioned in the semi-supervised continuous training phase have the same meaning as the characters exemplified in fig. 6.

In the anti-migration training phase, the initialization parameters (i.e., θ _c、θ_G、θ_D and batch_size described above) need to be acquired first. Then, firstly selecting a batch_size number of feature samples containing label tags from the first feature sample setSelecting batch_size feature samples/>, from the second feature sample setThen, the feature extractor G is adopted to sample the featuresExtracting features to obtain features f _s; feature sample/>, using feature extractor GAnd extracting the characteristics to obtain the characteristics f _t.

After obtaining the feature f _s and the feature f _t, performing feature enhancement on the feature f _s by using an attention enhancer E to obtain a first attention feature E (f _s); and feature enhancement of feature f _t with a attention enhancer E to obtain a second attention feature E (f _t). Then, the first attention feature E (f _s) and the second attention feature E (f _t) are respectively processed by a discriminator D to obtain a label discriminating result (namely a feature sample) of the first attention feature E (f _s)Label discrimination results of (2). And inputting the first attention characteristic E (f _s) to the classifier C and the discriminator D, and inputting the second attention characteristic E (f _t) to the discriminator D. The first attention feature E (f _s) is classified by the classifier C to obtain an identification tag (namely a feature sample/>) of the first attention feature E (f _s)Is a tag for identification). Finally, according to the characteristic sampleIdentification tag and labelling tag of,Label discrimination results ofAnd training the second recognition module according to the label discrimination result. Note that the character mentioned in the challenge migration training phase has the same meaning as the character illustrated in fig. 7.

After training of the first recognition module and the second recognition module is completed, a trained target recognition model may be obtained. In practical application, if the reference object feature of a certain reference object is obtained, a first recognition module for completing training in a target recognition model after training is performed, and recognition processing is performed on the reference object feature of the reference object through the first recognition module for completing training, so as to obtain a first probability P ₁; and a second recognition module which can finish training in the target recognition model after finishing training, and the second probability P ₂ is obtained by performing recognition processing on the reference object characteristics of the reference object through the second recognition module which finishes training. Finally, the target probability, namely the probability of transferring the electronic resource to the target service by the reference object, is obtained by calculating the average value of the P ₁ and the P ₂. Optionally, a label of the reference object may also be generated according to the target probability.

In one embodiment, when identifying the willingness of a corresponding object to transfer electronic resources to a certain conference service (i.e., a target service) of the online conference APP, the model that the computer device can train may be a target identification model for identifying the willingness of electronic resources to transfer to the online conference APP. When training the target recognition model, the computer device may obtain a feature sample with a label tag based on the object that has performed electronic resource transfer for the online conference APP, historical behavior data of the electronic resource transfer performed for the online conference APP under historical conditions, and the label tag of the object under the condition that the electronic resource transfer is known to be performed, so as to construct and obtain a first feature sample set. In addition, in the case where the historical behavior data of the corresponding object can be obtained, but the corresponding label tag cannot be obtained, the computer device can take the historical behavior data as a feature sample without a label, and form a second feature sample set. Wherein the historical behavior data comprises: and the corresponding object is based on the behavior data of the online conference APP for electronic resource transfer at one or more historical time points, such as browsing data of the online conference APP for the corresponding object, wherein the historical time points refer to time points before the current time point, and the current time point refers to the time point for training the target recognition model.

In a gaming scenario, a computer device may be based on a business that transfers electronic resources for a certain game; training to obtain a target identification model under corresponding service. While training the target recognition model, the computer device obtains historical game data for objects that have electronic resource transfers for the game, the historical game data including whether the respective objects are the game transfer electronic resources, and the number of electronic resources transferred. Based on the obtained historical game data of different objects, corresponding labeling labels can be added to the corresponding historical game data which is contained in the historical game data and is determined to have carried out electronic resource transfer for the game, the historical game data with the labeling labels is used as a characteristic sample in a first characteristic sample set, and the historical game data without the labeling labels is used as a characteristic sample in a second characteristic sample set.

In addition, a representative sample with larger recognition deviation in the training round can be selected according to the feature value, so that the recognition deviation can be corrected continuously in the subsequent training, and the recognition accuracy of the first recognition module is improved. And selecting the similar sample representing the sample, so that whether training is completed or not is determined through the pseudo tag and the identification tag of the similar sample later, and the identification accuracy degree of the first identification module can be accurately evaluated in the training process, so that the identification accuracy rate of the first identification module for completing training can reach the expected effect.

Meanwhile, in the embodiment of the application, the repeated use of the characteristic samples is realized by continuously selecting the substitution table samples from the characteristic sample group used in the training and selecting the reference samples from the characteristic sample set containing the labeling labels. The training is repeatedly performed by using the used characteristic samples, so that the problem of forgetting data can be effectively relieved. In addition, the semi-supervised continuous learning and the anti-migration learning in the embodiment of the application are both semi-supervised learning, and training can be performed without a large number of characteristic samples with labeling labels, so that the scheme has lower requirement on the magnitude of the labeling samples, is suitable for real scenes with fewer labeling samples, and has higher popularity. Finally, the embodiment of the application reduces the difference of the characteristic samples from different data distributions from the characteristic aspect by means of anti-migration training, and can effectively avoid the problem of data island. In addition, the embodiment of the application adds the attention enhancer in the anti-migration training to enhance the part of the characteristics which is beneficial to the subsequent accurate classification, weaken the part of the characteristics which is not beneficial to the subsequent accurate classification, and facilitate the improvement of the identification accuracy of the second identification module.

Based on the related description of the model training method, the application also discloses a model training device. The model training means may be a computer program (comprising program code) running on one of the computer devices mentioned above. The model training apparatus may perform the model training method shown in fig. 3 and 4, referring to fig. 9, and the model training apparatus may at least include: an acquisition unit 901 and a training unit 902.

An obtaining unit 901, configured to obtain a first feature sample set and a second feature sample set, where the first feature sample set includes at least one feature sample with a label, the second feature sample set includes T feature sample groups, and one feature sample group includes at least one feature sample without a label; t is an integer greater than or equal to 1;

A training unit 902, configured to screen one or more reference samples from an ith feature sample group of the first feature sample set and the second feature sample set, and determine a pseudo tag corresponding to each reference sample; the labeling label and the pseudo label are used for indicating the probability that the object indicated by the corresponding sample is the target business transfer electronic resource; i is an integer of 1 or more and T or less;

the training unit 902 is further configured to train the first recognition module of the target recognition model by using the reference sample including the pseudo tag and the feature sample in the i+1th feature sample group;

The training unit 902 is further configured to perform an anti-migration training on the second recognition module of the target recognition model based on the feature sample with the labeling label and the feature sample without the labeling label, where the feature sample is included in the first feature sample set, and the feature sample is included in the second feature sample set, in the process of training the first recognition module, until a trained target recognition model is obtained.

In one embodiment, the training unit 902 is further specifically configured to, when screening one or more reference samples from the ith feature sample group of the first feature sample set and the second feature sample set, perform:

acquiring the feature value of each feature sample in the ith feature sample group of the second feature sample set, and screening out a representative sample from the ith feature sample group based on the feature value;

acquiring an alternative sample set, wherein the alternative sample set comprises characteristic samples without labels in the first i characteristic sample groups; the first i feature sample groups at least comprise the ith feature sample group;

Screening similar samples representing samples from the characteristic samples in the alternative sample set according to the characteristic samples in the first characteristic sample set and the sample distances between the characteristic samples in the alternative sample set and the representative samples;

and taking the representative sample and the similar sample as screened reference samples.

In yet another embodiment, the training unit 902, when acquiring the feature value of each feature sample in the ith feature sample group of the second feature sample set and screening the representative sample from the ith feature sample group based on the feature value, is further configured to perform:

Carrying out information entropy calculation on each characteristic sample in the ith characteristic sample group of the second characteristic sample set, and taking one or more characteristic samples with corresponding information entropy larger than a target entropy threshold value in the ith characteristic sample group as initial reference samples;

performing distance calculation on each selected initial reference sample, and performing clustering operation on one or more obtained initial reference samples based on the calculated distances;

And determining a representative sample of the ith characteristic sample group from one or more initial reference samples according to the result of the clustering operation.

In yet another embodiment, the training unit 902 is specifically configured to perform, when selecting a similar sample representing a sample from among the candidate feature samples in the candidate sample set according to each feature sample in the first feature sample set and a sample distance between each feature sample in the candidate sample set and the representative sample:

Taking each characteristic sample in the first characteristic sample set and the representative sample screened from the ith characteristic sample set as a new labeling sample set;

selecting a neighbor sample set of any characteristic sample in the alternative sample set from the new labeling sample set;

When the representative sample exists in the neighbor sample set, any characteristic sample is taken as a similar sample of the corresponding representative sample.

In yet another embodiment, the obtaining unit 901, when determining the pseudo tag corresponding to the similar sample, may be specifically configured to perform:

Acquiring a sparse representation coefficient of each neighbor sample in a neighbor sample set; the sparse representation coefficient is used for indicating the association degree between the corresponding neighbor sample and any characteristic sample;

and weighting the sparse representation coefficients corresponding to each neighbor sample in the neighbor sample set, and generating a pseudo tag of any characteristic sample serving as a similar sample according to the result of the weighting.

In yet another embodiment, the training unit 902 may be specifically configured to perform, when acquiring the sparse representation coefficient of each neighbor sample in the set of neighbor samples:

Acquiring a feature vector corresponding to each neighbor sample in a neighbor sample set, and determining a target expression for carrying out vector representation on any feature sample by adopting the feature vector corresponding to each neighbor sample in the neighbor sample set; the target expression comprises sparse representation coefficients to be solved of each neighbor sample;

And obtaining a target loss function for solving the target expression, and solving the target expression by adopting the target loss function to obtain a sparse representation coefficient of each neighbor sample.

In yet another embodiment, the training unit 902 is further operable to perform:

acquiring image data of a target object, and performing discrimination processing on the target object based on the image data of the target object to obtain an object discrimination result;

when the object discrimination result indicates that the target object is a normal object, historical behavior data of the target object is obtained, and portrait features and business features of the target object are built based on the historical behavior data;

And performing coding processing and splicing processing on the portrait features and the business features to obtain target object features of the target object, and taking the target object features as a feature sample.

In yet another embodiment, the portrayal feature and the business feature are each features comprising one or more time dimensions; the training unit 902 may be further configured to perform, when performing encoding processing and stitching processing on the image feature and the service feature to obtain a target object feature of the target object:

Respectively carrying out coding processing on the portrait features and the business features in different time dimensions to obtain coding features;

and splicing the coding features, and taking the spliced coding features as target object features of the target object.

In yet another embodiment, the training unit 902 may be further configured to, when training the first recognition module of the target recognition model using the reference sample including the pseudo tag and the feature samples in the i+1th feature sample group, perform:

a first identification module of the target identification model is adopted to carry out identification processing on any reference sample containing the pseudo tag, so as to obtain an identification tag of the corresponding reference sample;

acquiring the matching degree of the identification label of each reference sample and the pseudo label of the corresponding reference sample, and determining to finish training of the first identification module when the matching degree meets the target matching degree;

And when the matching degree does not meet the target matching degree, training the first recognition module of the target recognition model by adopting the feature samples in the (i+1) th feature sample group until the training of the first recognition module is completed.

In yet another embodiment, the training unit 902 may be further configured to, when performing the anti-migration training on the second recognition module of the target recognition model based on the feature samples with the labeling tag included in the first feature sample set and the feature samples without the labeling tag included in the second feature sample set, perform:

acquiring first attention characteristics of a first characteristic sample set and second attention characteristics corresponding to a second characteristic sample of a second characteristic sample set;

determining an identification tag of the first feature sample based on the first attention feature by adopting a second identification module, and determining a difference between the identification tag and a labeling tag of the first feature sample;

and acquiring label discrimination results of the second recognition module on the first attention characteristic and the second attention characteristic, and training the second recognition module based on the difference between the label discrimination results and the determined labels.

In yet another embodiment, training unit 902 may also be configured to perform:

acquiring the reference object characteristics of the reference object under the target service, and adopting a first recognition module in the trained target recognition model to recognize the reference object characteristics so as to obtain a first probability that the reference object is a target service transfer electronic resource; and

Adopting a second recognition module in the trained target recognition model to recognize the characteristics of the reference object, and obtaining a second probability that the reference object is the target business transfer electronic resource;

and weighting the first probability and the second probability, and generating an object label of the reference object under the target service based on the weighted target probability.

According to one embodiment of the application, the steps involved in the methods shown in fig. 3 and 4 may be performed by the various units in the model training apparatus shown in fig. 9. For example, step S301 shown in fig. 3 may be performed by the acquisition unit 901 in the model training apparatus shown in fig. 9; steps S302 to S304 may be performed by the training unit 902 in the model training apparatus shown in fig. 9. For another example, step S401 shown in fig. 4 may be performed by the acquisition unit 901 in the model training apparatus shown in fig. 9; steps S402 to S407 may be performed by the training unit 902 in the model training apparatus shown in fig. 9.

According to another embodiment of the present application, each unit in the model training apparatus shown in fig. 9 is divided based on a logic function, and each unit may be separately or completely combined into one or several other units to form the model training apparatus, or some unit(s) thereof may be further split into a plurality of units with smaller functions to form the model training apparatus, which may achieve the same operation without affecting the implementation of the technical effects of the embodiments of the present application. In other embodiments of the present application, the model-based training apparatus may also include other units, and in practical applications, these functions may also be implemented with assistance from other units, and may be implemented by cooperation of multiple units.

According to another embodiment of the present application, a model training apparatus as shown in fig. 9 may be constructed by running a computer program (including program code) capable of executing the steps involved in the method as shown in fig. 3 or fig. 4 on a general-purpose computing device such as a computer device including a Central Processing Unit (CPU), a random access storage medium (RAM), a read only storage medium (ROM), etc., processing elements and storage elements, and a model training method of an embodiment of the present application may be implemented. The computer program may be recorded on, for example, a computer storage medium, and loaded into and run in the above-described computer apparatus through the computer storage medium.

Based on the method embodiment and the device embodiment, the application further provides electronic equipment. Referring to fig. 10, a schematic structural diagram of an electronic device according to an embodiment of the present application is provided. The electronic device shown in fig. 10 may include at least a processor 1001, an input interface 1002, an output interface 1003, and a computer storage medium 1004. Wherein the processor 1001, input interface 1002, output interface 1003, and computer storage medium 1004 may be connected by a bus or other means.

The computer storage medium 1004 may be stored in a memory of the electronic device, the computer storage medium 1004 for storing a computer program comprising program instructions, and the processor 1001 for executing the program instructions stored by the computer storage medium 1004. The processor 1001 (or CPU (Central Processing Unit, central processing unit)) is a computing core and a control core of the electronic device, which are adapted to implement one or more instructions, in particular to load and execute one or more instructions to implement the model training method flow or corresponding functions described above.

The embodiment of the application also provides a computer storage medium (Memory), which is a Memory device in the electronic device and is used for storing programs and data. It will be appreciated that the computer storage medium herein may include both a built-in storage medium in the terminal and an extended storage medium supported by the terminal. The computer storage medium provides a storage space that stores an operating system of the terminal. Also stored in this memory space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor 1001. It should be noted that, the computer storage medium herein may be a high-speed random access memory (random access memory, RAM) memory, or may be a non-volatile memory (non-volatile memory), such as at least one disk memory; optionally, at least one computer storage medium remote from the processor may be present.

In one embodiment, one or more instructions stored in a computer storage medium may be loaded and executed by the processor 1001 to implement the corresponding steps of the method described above in relation to the model training method embodiments of fig. 3 and 5, and in a specific implementation, the one or more instructions in the computer storage medium are loaded and executed by the processor 1001 to:

The processor 1001 obtains a first feature sample set and a second feature sample set, the first feature sample set including at least one feature sample with a label tag, the second feature sample set including T feature sample sets, one feature sample set including at least one feature sample without a label tag; t is an integer greater than or equal to 1;

The processor 1001 screens out one or more reference samples from the ith feature sample group of the first feature sample set and the second feature sample set, and determines a pseudo tag corresponding to each reference sample; the labeling label and the pseudo label are used for indicating the probability that the object indicated by the corresponding sample is the target business transfer electronic resource; i is an integer of 1 or more and T or less;

the processor 1001 trains the first recognition module of the target recognition model using the reference sample containing the pseudo tag and the feature sample in the i+1th feature sample group;

During the training of the first recognition module, the processor 1001 performs an anti-migration training on the second recognition module of the target recognition model based on the feature sample with the labeling tag included in the first feature sample set and the feature sample without the labeling tag included in the second feature sample set until a trained target recognition model is obtained.

In one embodiment, the processor 1001 may be specifically configured to, when screening one or more reference samples from the ith feature sample group of the first feature sample set and the second feature sample set, perform:

In one embodiment, the processor 1001 may be further configured to, when acquiring the feature value of each feature sample in the ith feature sample group of the second feature sample set, and screening the representative sample from the ith feature sample group based on the feature value, perform:

In one embodiment, the processor 1001 is further specifically configured to, when selecting a similar sample representing a sample from among the candidate feature samples in the candidate sample set according to each feature sample in the first feature sample set and a sample distance between each feature sample in the candidate sample set and the representative sample, perform:

In one embodiment, the processor 1001 may be specifically configured to, when determining the pseudo tag corresponding to the similar sample, perform:

In one embodiment, the processor 1001, when acquiring the sparse representation coefficients for each neighbor sample in the set of neighbor samples, is specifically operable to perform:

In one embodiment, the processor 1001 is further operable to perform:

In one embodiment, the portrayal feature and the business feature are each features that comprise one or more time dimensions; the processor 1001 may be further configured to perform, when performing encoding processing and stitching processing on the portrait feature and the service feature to obtain a target object feature of the target object:

In one embodiment, the processor 1001 may be further configured to, when training the first recognition module of the target recognition model using the reference sample including the pseudo tag and the feature samples in the i+1th feature sample group, perform:

In one embodiment, the processor 1001 may be further configured to perform, when performing the anti-migration training on the second recognition module of the target recognition model based on the feature samples with the labeling tag included in the first feature sample set and the feature samples without the labeling tag included in the second feature sample set:

In one embodiment, the processor 1001 may also be configured to perform:

Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the electronic device reads the computer instructions from the computer readable storage medium and executes the computer instructions to cause the electronic device to perform the method embodiments described above and illustrated in fig. 3 and 4. The computer readable storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), or the like.

The model training method in the embodiment of the invention can be applied to other scenes such as cloud technology, artificial intelligence, intelligent driving and the like, and is not limited herein.

It will be appreciated that, in the specific embodiment of the present application, some examples relate to data related to privacy of a use object, such as historical behavior data; thus, when the above embodiments of the present application are applied to a specific product or technology, a use subject license or agreement needs to be obtained, and the collection, use and processing of relevant data needs to comply with relevant laws and regulations and standards of relevant countries and regions.

The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the application is subject to the protection scope of the claims.

Claims

1. A method of model training, comprising:

2. The method of claim 1, wherein the screening one or more reference samples from an ith feature sample group of the first feature sample set and the second feature sample set comprises:

acquiring the feature value of each feature sample in the ith feature sample group of the second feature sample set, and screening a representative sample from the ith feature sample group based on the feature value;

Screening similar samples of the representative samples from the characteristic samples in the alternative sample set according to the characteristic samples in the first characteristic sample set and the sample distances between the characteristic samples in the alternative sample set and the representative samples;

3. The method of claim 2, wherein the obtaining the feature value of each feature sample in the ith feature sample group of the second feature sample set and the screening representative samples from the ith feature sample group based on the feature value comprises:

Performing information entropy calculation on each characteristic sample in an ith characteristic sample group of the second characteristic sample set, and taking one or more characteristic samples with corresponding information entropy larger than a target entropy threshold value in the ith characteristic sample group as initial reference samples;

And determining the representative sample of the ith characteristic sample group from the one or more initial reference samples according to the result of the clustering operation.

4. The method of claim 2, wherein the screening similar samples of the representative sample from among the feature samples in the candidate sample set based on the feature samples in the first feature sample set and the sample distances between the feature samples in the candidate sample set and the representative sample comprises:

And when the representative sample exists in the neighbor sample set, taking any characteristic sample as a similar sample of the corresponding representative sample.

5. The method of claim 4, wherein determining the manner in which the pseudo tags corresponding to similar samples are determined comprises:

acquiring a sparse representation coefficient of each neighbor sample in the neighbor sample set; the sparse representation coefficient is used for indicating the association degree between the corresponding neighbor sample and any characteristic sample;

and weighting the sparse representation coefficients corresponding to each neighbor sample in the neighbor sample set, and generating a pseudo tag of any characteristic sample serving as a similar sample according to the weighting result.

6. The method of claim 5, wherein the obtaining sparse representation coefficients for each neighbor sample in the set of neighbor samples comprises:

Acquiring a feature vector corresponding to each neighbor sample in the neighbor sample set, and determining a target expression for vector representation of any feature sample by adopting the feature vector corresponding to each neighbor sample in the neighbor sample set; the target expression comprises sparse representation coefficients to be solved of each neighbor sample;

and obtaining a target loss function for solving the target expression, and solving the target expression by adopting the target loss function to obtain the sparse representation coefficient of each neighbor sample.

7. The method of claim 1, wherein the method further comprises:

Acquiring image data of a target object, and judging the target object based on the image data of the target object to obtain an object judging result;

When the object discrimination result indicates that the target object is a normal object, historical behavior data of the target object is obtained, and portrait features and business features of the target object are constructed based on the historical behavior data;

and carrying out coding processing and splicing processing on the portrait features and the service features to obtain target object features of the target object, and taking the target object features as a feature sample.

8. The method of claim 7, wherein the portrait feature and the business feature are each features that include one or more time dimensions; the encoding processing and the splicing processing are carried out on the portrait features and the business features to obtain target object features of the target object, and the method comprises the following steps:

9. The method of claim 1, wherein training the first recognition module of the object recognition model using the reference sample including the pseudo tag and the feature samples in the i+1th feature sample group comprises:

10. The method of claim 1, wherein the performing the anti-migration training on the second recognition module of the target recognition model based on the feature samples with the presence tag contained in the first feature sample set and the feature samples without the presence tag contained in the second feature sample set comprises:

acquiring first attention characteristics of a first characteristic sample of the first characteristic sample set and second attention characteristics corresponding to a second characteristic sample of the second characteristic sample set;

determining an identification tag of the first feature sample based on the first attention feature using the second identification module, and determining a difference between the identification tag and a labeling tag of the first feature sample;

11. The method of claim 1, wherein the method further comprises:

acquiring a reference object characteristic of a reference object under a target service, and adopting a first recognition module in a trained target recognition model to recognize the reference object characteristic to obtain a first probability that the reference object is a target service transfer electronic resource; and

Adopting a second recognition module in the trained target recognition model to recognize the characteristics of the reference object to obtain a second probability that the reference object is a target business transfer electronic resource;

12. A model training device, characterized in that the model training device comprises an acquisition unit and a training unit, wherein:

13. A computer device, comprising:

a processor adapted to implement one or more computer programs;

Computer storage medium storing one or more computer programs adapted to be loaded by the processor and to perform the model training method according to any of claims 1-11.

14. A computer storage medium, characterized in that it stores one or more computer programs adapted to be loaded by a processor and to perform the model training method according to any of the claims 1-11.