CN112422413A - Click feedback method and device for push message

Info

Publication number: CN112422413A
Application number: CN202011245574.7A
Authority: CN
Inventors: 欧锦华; 袁乾烽; 国兴旺; 周德勇; 万炀
Original assignee: GUANGZHOU QISHENG INFORMATION TECHNOLOGY CO LTD
Current assignee: GUANGZHOU QISHENG INFORMATION TECHNOLOGY CO LTD
Priority date: 2020-11-10
Filing date: 2020-11-10
Publication date: 2021-02-26
Anticipated expiration: 2040-11-10
Also published as: CN112422413B

Abstract

The invention discloses a click feedback method and device for push messages. The click feedback method comprises: collecting click feedback information of a user on a push message, wherein the click feedback information comprises click time information, click content type information, and feature information associated with the click time and/or the click content type; generating training data from the click feedback information, and training a prediction system on the generated training data; and retrieving a plurality of users from a database, obtaining the user feature data corresponding to those users, and inputting the user feature data into the prediction system as data to be tested to obtain a prediction result, wherein the prediction result comprises the probability of the click time enumeration and the probability of the click content type enumeration. By taking into account the user's usage habits and experience, as well as the time and occasion at which messages are opened, the method targets push messages at the right users and delivers each message in the time period in which the user expects to receive it.

Description

Click feedback method and device for push message

Technical Field

The invention relates to the field of internet, in particular to a click feedback method and device for push messages.

Background

Information push is a new technology that reduces information overload by periodically transmitting the information users need over the internet according to a given technical standard or protocol. In operating a business, an enterprise has two main tasks: acquiring new users and retaining existing users. To accomplish both, in addition to developing new products, the enterprise also needs to actively push messages about new products to users at regular times and in fixed quantities.

However, from the user's perspective, passively receiving messages is more likely to cause annoyance than actively seeking information, and thus degrades the user's experience of the product.

Therefore, the active message push mechanisms of the related art still lack a feedback mechanism that takes into account the user's usage habits and experience, and the time and occasion at which messages are opened.

Disclosure of Invention

The main purpose of the invention is to disclose a click feedback method and device for push messages, so as to at least solve the problem that the active message push mechanisms adopted in the related art lack a feedback mechanism that takes into account the user's usage habits and experience and the time and occasion at which messages are opened.

According to one aspect of the invention, a click feedback method for pushing a message is provided.

The click feedback method for push messages comprises the following steps: collecting click feedback information of a user on a push message, wherein the click feedback information comprises: click time information, click content type information, and feature information associated with the click time and/or the click content type; generating training data from the click feedback information, and training a prediction system on the generated training data; and retrieving a plurality of users from a database, obtaining the user feature data corresponding to those users, and inputting the user feature data into the prediction system as data to be tested to obtain a prediction result, wherein the prediction result comprises: the probability of the click time enumeration and the probability of the click content type enumeration.

According to another aspect of the present invention, a click feedback device for pushing a message is provided.

The click feedback device for push messages comprises: an information collection module, configured to collect click feedback information of a user on a push message, wherein the click feedback information comprises: click time information, click content type information, and feature information associated with the click time and/or the click content type; a system training module, configured to generate training data from the click feedback information and to train a prediction system on the generated training data; and a result obtaining module, configured to retrieve a plurality of users from a database, obtain the user feature data corresponding to those users, and input the user feature data into the prediction system as data to be tested to obtain a prediction result, wherein the prediction result comprises: the probability of the click time enumeration and the probability of the click content type enumeration.

To address the active message push mechanisms adopted in the related art, the invention provides a feedback mechanism that, taking into account the user's usage habits and experience and the time and occasion at which messages are opened, pushes the messages a user actually needs at a suitable time, thereby targeting push messages at the right users and delivering each message in the time period in which the user expects to receive it.

Drawings

FIG. 1 is a flow chart of a click feedback method of a push message according to an embodiment of the present invention;

FIG. 2 is a flow chart of a user characterization method in accordance with a preferred embodiment of the present invention;

FIG. 3 is a diagram of click time enumeration one-hot feature intervals for push content in accordance with a preferred embodiment of the present invention;

FIG. 4 is a diagram of a click content type enumeration one-hot feature interval for push content in accordance with a preferred embodiment of the present invention;

FIG. 5 is a diagram of the user feature one-hot interval for push content in accordance with a preferred embodiment of the present invention;

FIG. 6 is a flowchart of a method of generating training data to train a predictive system in accordance with a preferred embodiment of the present invention;

FIG. 7 is a diagram of the training operation of a predictive system in accordance with a preferred embodiment of the present invention;

FIG. 8 is a flow chart of generating a prediction result using a prediction system in accordance with a preferred embodiment of the present invention;

FIG. 9 is a predictive operational schematic of a predictive system in accordance with a preferred embodiment of the invention;

FIG. 10 is a general flowchart of a Push system click prediction method according to the preferred embodiment of the present invention;

FIG. 11 is a block diagram of a click feedback device for push messages according to an embodiment of the present invention;

FIG. 12 is a block diagram of a click feedback device for push messages according to a preferred embodiment of the present invention.

Detailed Description

The following detailed description of specific embodiments of the present invention is provided in conjunction with the accompanying drawings.

Fig. 1 is a flowchart of a click feedback method of a push message according to an embodiment of the present invention. As shown in fig. 1, the click feedback method for push messages includes:

step S101: collecting click feedback information of a user on a push message, wherein the click feedback information comprises: click time information, click content type information, characteristic information associated with click time and/or click content type;

step S103: generating training data by adopting the click feedback information, and training a prediction system according to the generated training data;

step S105: retrieving a plurality of users from a database, obtaining the user feature data corresponding to those users, and inputting the user feature data into the prediction system as data to be tested to obtain a prediction result, wherein the prediction result comprises: the probability of the click time enumeration and the probability of the click content type enumeration.

The embodiment of the invention provides a feedback mechanism that collects click feedback information of a user on a push message, generates training data from the click feedback information, and trains a prediction system on the generated training data. The user feature data corresponding to a plurality of users retrieved from a database is then input into the prediction system to obtain a prediction result, which comprises: the probability of the click time enumeration and the probability of the click content type enumeration. Taking into account the user's usage habits and experience and the time and occasion at which messages are opened, the messages a user actually needs are pushed at a suitable time according to the prediction result, so that push messages are targeted at the right users and each message arrives in the time period in which the user expects to receive it.

Preferably, the feature information associated with the click time and/or the click content type may include, but is not limited to, at least one of the following: information on the region where the user is located, device screen information, browser type information, operating system type information, device manufacturer information, device type information, device name information, and push channel information.

Feature engineering is described first. Feature engineering refers to selecting better data features from the raw data through a series of engineering techniques in order to improve the training effect of the model. Data and features determine the upper limit of machine learning, while models and algorithms merely approach this upper limit. It follows that good data and features are a prerequisite for models and algorithms to be effective. Feature engineering generally comprises data preprocessing, feature selection, dimensionality reduction, and similar steps. The quality of the features extracted from the data directly affects the performance of the model.

Preferably, the generating of the training data by using the click feedback information includes: converting the click time in the log data into click time enumeration, and then coding the click time enumeration by adopting a fixed position one-hot coding mode; converting the click content type in the log data into click content type enumeration, and then coding the click content type enumeration by adopting a fixed position one-hot coding mode; and converting the characteristic information associated with the click time and/or the click content type into characteristic data, and then reducing the dimension of the user characteristic data to a dynamic one-hot interval by adopting a dynamic one-hot coding mode.

Feature engineering is described below in conjunction with fig. 2.

As shown in FIG. 2, user fields are first selected from the user profile database; that is, the feature information associated with the click time and/or the click content type (e.g., information on the region where the user is located, device screen information, browser type information, operating system type information, device manufacturer information, device type information, device name information, push channel information, etc.) is converted into feature data. Feature processing is then performed, namely Embedding-based feature representation. Embedding, also called distributed representation, learns a low-dimensional dense real-valued vector, in contrast to the high-dimensional sparse one-hot representation; that is, it compresses sparse data with many bits into a space with fewer bits. In other words, the one-hot encoded data is processed into an embedding vector representation of fixed dimension.

Wherein, the Embedding representation includes: a dynamic one-hot coding mode and a fixed position one-hot coding mode. These two encoding schemes are described in detail below:

The first encoding method: dynamic one-hot coding mode

The field name and the field value are concatenated into a character string, the int64 hash value of the string is computed, and the hash value is then taken modulo the length of the dynamic one-hot interval. The formula is as follows:

int64_Hash(field name_field value) % length of the dynamic one-hot interval

The advantage of dynamic one-hot is that sparse features are reduced in dimension and compressed into a feature interval of fixed length, while both feature fields and feature field values can be added dynamically. As long as the interval length is chosen appropriately, collisions are rare enough to be ignored.
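The hash-and-modulo formula above can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the text does not name a hash function, so taking the first 8 bytes of a SHA-256 digest is an assumption, as are the function names `int64_hash` and `dynamic_one_hot_index` and the interval length of 65536.

```python
import hashlib


def int64_hash(text: str) -> int:
    """Stable 64-bit hash of a string (assumption: first 8 bytes of SHA-256)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def dynamic_one_hot_index(field_name: str, field_value: str, interval_length: int) -> int:
    """Map a (field name, field value) pair to a slot in the dynamic one-hot interval."""
    key = f"{field_name}_{field_value}"  # concatenate field name and field value
    return int64_hash(key) % interval_length


# Hypothetical interval length; new fields or values need no change to the interval.
INTERVAL_LENGTH = 1 << 16
idx_android = dynamic_one_hot_index("os_name", "Android", INTERVAL_LENGTH)
idx_ios = dynamic_one_hot_index("os_name", "iOS", INTERVAL_LENGTH)
```

A hash that is stable across processes is used here so that a feature maps to the same slot at training time and at prediction time.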

The second encoding method: fixed position one-hot coding mode

Each enumeration number is mapped to a fixed one-hot position.

The advantage of a fixed one-hot is that the length of the characteristic interval is fixed and the characteristic position can be known in advance.

In a preferred implementation, the features may further include: item characteristics and user characteristics (i.e., characteristic information associated with click time and/or click content type).

First type of feature data: item characteristic data

(1) Click time enumeration for push content

That is, the click time of the push content. For example, 24 enumerations (0 to 23) represent the 24 one-hour periods of a day. Processing mode: fixed position one-hot.

For example:

hour 0 = 000000000000000000000001

hour 1 = 000000000000000000000010

...

hour 23 = 100000000000000000000000

The click time enumeration of the push content is converted to a time one-hot, as can be seen in the example of fig. 3.
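The fixed position one-hot mapping shown in the examples above can be sketched as follows. The helper name `fixed_one_hot` is hypothetical, and the bit position is counted from the right only because that is how the example strings are written.

```python
def fixed_one_hot(enum_number: int, length: int) -> str:
    """Return a bit string of the given length with a single 1.

    The position is counted from the right, matching the examples in the text.
    """
    if not 0 <= enum_number < length:
        raise ValueError("enumeration number out of range")
    bits = ["0"] * length
    bits[length - 1 - enum_number] = "1"
    return "".join(bits)


hour_0 = fixed_one_hot(0, 24)
hour_1 = fixed_one_hot(1, 24)
hour_23 = fixed_one_hot(23, 24)
```

Because the interval length is fixed, the position of every enumeration number is known before any data is seen.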

(2) Type enumeration of push content

That is, the type of the content to be pushed. For example, 3 enumerations (a = 0, t = 1, v = 2) represent 3 media types, where a = article, t = question-and-answer post, and v = video. Processing mode: fixed position one-hot.

For example:

a = 0000000001

t = 0000000010

v = 0000000100

...

The type enumeration of the push content is converted into a content type one-hot, as can be seen in the example of FIG. 4.

Second type of characteristic data: user characteristic data (i.e., characteristic information associated with click time and/or click content type)

Because there are generally many user features, there are also many enumeration values, and too many enumeration values lead to sparse features and an oversized model. User features are also open-ended (more enumeration values may be added in the future), so the fixed one-hot method is not suitable for extracting them. Such features can instead be extracted in the dynamic one-hot manner. For example:

City (ip_area), processing mode: dynamic one-hot

Device screen (device_screen_width, device_screen_height), processing mode: dynamic one-hot

Browser type (browser_name), processing mode: dynamic one-hot

System type (os_name), processing mode: dynamic one-hot

Device manufacturer (device_vendor), processing mode: dynamic one-hot

Device type (device_type), processing mode: dynamic one-hot

Device name (device_model), processing mode: dynamic one-hot

Push channel enumeration (Huawei, vivo, OPPO, Xiaomi), processing mode: dynamic one-hot

...

The remaining user features are reduced in dimension and compressed into the dynamic one-hot interval, as can be seen in the example of FIG. 5.

Preferably, in step S101, the collecting click feedback information of the user on the push message may further include: acquiring a click behavior log of a user clicking a push message; extracting log data in the user click behavior log, wherein the log data comprises: click time, click content type, characteristic information associated with click time and/or click content type (e.g., user id, push channel, etc.).

In a preferred implementation, the click feedback information may be obtained by collecting the information in the user's click behavior log. For example, the user's click behavior log records the user's behavior of clicking push content after receiving it on the terminal. After the log passes through a message queue and stream processing, it is aggregated and training data is generated.

The log format is as follows:

click time, user id (identification information), content type, push channel, whether clicked

FIG. 6 is a flowchart of a method for generating training data to train a predictive system in accordance with a preferred embodiment of the present invention. As shown in fig. 6, first, log data in a user click behavior log stream is extracted. The click time needs to be converted into click time enumeration first, and then converted into time enumeration one-hot, as shown in fig. 3; the content type needs to be converted into a content type enumeration one-hot, as shown in fig. 4; and converting the user id and the push channel into user characteristics through characteristic engineering, performing dimension reduction compression to a dynamic one-hot interval through a dynamic one-hot mode, and generating training data, as shown in fig. 5. And then training a prediction system according to the generated training data.
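The first part of this flow, turning one log record into enumeration values, might look like the sketch below. The text gives only the field order of the log, so the comma-separated layout, the ISO 8601 timestamp, the 0/1 click flag, and the names `ClickSample` and `parse_log_line` are all assumptions made for illustration.

```python
from datetime import datetime
from typing import NamedTuple

# Content type enumeration from the text: a = article, t = Q&A post, v = video.
CONTENT_TYPE_ENUM = {"a": 0, "t": 1, "v": 2}


class ClickSample(NamedTuple):
    hour_enum: int          # click time enumeration, 0 to 23
    user_id: str
    content_type_enum: int  # content type enumeration
    push_channel: str
    clicked: bool


def parse_log_line(line: str) -> ClickSample:
    """Parse one log record (assumed comma-separated, in the field order of the text)."""
    click_time, user_id, content_type, channel, clicked = [
        part.strip() for part in line.split(",")
    ]
    hour = datetime.fromisoformat(click_time).hour  # click time -> time enumeration
    return ClickSample(hour, user_id, CONTENT_TYPE_ENUM[content_type], channel, clicked == "1")


# Hypothetical record, not taken from real data.
sample = parse_log_line("2020-11-10T18:05:00, u42, t, vivo, 1")
```

The enumeration values produced here would then be passed to the fixed position and dynamic one-hot encodings described earlier.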

Preferably, the prediction system may further include: two logistic regression classification models connected in series, wherein the first logistic regression classification model is a time prediction model, and the second logistic regression classification model is a content type prediction model.

Preferably, in step S103, the training the prediction system according to the generated training data may further include: inputting coded data for coding the user characteristic data in the training data and coded data for coding click time enumeration in the training data into the time prediction model, and training the time prediction model; and inputting coded data for coding the user characteristic data in the training data, coded data for coding the click content type enumeration in the training data, and coded data for coding the click time enumeration number in the training data into the content type prediction model to train the content type prediction model.

FIG. 7 is a diagram of the training operation of a predictive system in accordance with a preferred embodiment of the present invention. As shown in FIG. 7, the prediction system uses two logistic regression classification models connected in series, named LR-Time (time prediction model) and LR-Type (content type prediction model), respectively.

The logistic regression (LR) classification model builds on the linear regression model: it uses the sigmoid function to compress the output of the linear model $w^{T}x$ into the interval [0, 1], which gives the output a probabilistic meaning. In essence it is still a linear model and is relatively simple to implement. The weights of the independent variables can be obtained through logistic regression analysis, and the LR model is a basic building block of deep learning. The formula is as follows: $f(x) = \frac{1}{1 + e^{-w^{T}x}}$. The logistic regression model continuously corrects the weights through deep learning, and when the weights have been corrected until the error is below a preset value, the logistic regression classification model has finished training.
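As an illustration of the sigmoid output and of training until the error falls below a preset value, the following is a minimal binary logistic regression in plain Python. The text does not specify the optimizer or the error measure, so batch gradient descent, mean log loss, the threshold `max_error`, and the toy data are all assumptions.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable sigmoid, mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    """f(x) = sigmoid(w^T x)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def log_loss(w, samples, labels, eps=1e-12):
    """Mean log loss, used here as the training error (assumption)."""
    total = 0.0
    for x, y in zip(samples, labels):
        p = min(max(predict(w, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(samples)


def train_lr(samples, labels, learning_rate=0.5, max_error=0.1, max_epochs=5000):
    """Correct the weights by gradient descent until the error is below max_error."""
    w = [0.0] * len(samples[0])
    for _ in range(max_epochs):
        if log_loss(w, samples, labels) < max_error:
            break  # error meets the preset requirement: training is finished
        grad = [0.0] * len(w)
        for x, y in zip(samples, labels):
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wj - learning_rate * gj / len(samples) for wj, gj in zip(w, grad)]
    return w


# Toy one-hot inputs (first component is a bias term); not real training data.
samples = [[1, 1, 0], [1, 0, 1], [1, 1, 0], [1, 0, 1]]
labels = [1, 0, 1, 0]
weights = train_lr(samples, labels)
```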

As shown in FIG. 7, the input of LR-Time includes: the encoded user feature data in the training data and the encoded click time enumeration in the training data. The output is the weights of LR-Time; the weights are continuously corrected through deep learning, and when they have been corrected until the error is small enough to meet a preset requirement, the LR-Time model has finished training.

The formula is as follows:

click time enumeration = LR-Time(x) = logistic(linear(user feature))

The input of LR-Type includes: the encoded user feature data + the output of LR-Time (which must first be converted into a click time feature, that is, the time enumeration number is encoded with the fixed position one-hot coding mode to generate encoded data, as shown in FIG. 3 and FIG. 7) + the encoded click content type enumeration in the training data. The output is the weights of LR-Type; the weights are continuously corrected through deep learning, and when they have been corrected until the error is small enough to meet a preset requirement, the LR-Type model has finished training.

The formula is as follows:

content type enumeration = LR-Type(x) = logistic(linear(user feature + click time feature))

Therefore, with the support of sufficient data quantity, a prediction system meeting the requirement can be trained. It should be noted that, in order to train a prediction system with small errors and high accuracy, the prediction system is usually trained using all the acquired training data, but if the amount of training data is large, in order to reduce the amount of computation, part of the training data may be randomly extracted from the entire training data in a certain proportion (for example, 30%) to train the prediction system.
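The random extraction of a fixed proportion of the training data can be sketched as below. The function name `subsample` and the `seed` parameter are hypothetical; the 30% ratio is the example value from the text.

```python
import random


def subsample(training_data, ratio=0.3, seed=None):
    """Randomly draw a fixed proportion of the training data without replacement."""
    rng = random.Random(seed)
    k = min(len(training_data), max(1, round(len(training_data) * ratio)))
    return rng.sample(training_data, k)


# Stand-in for 1000 training records.
all_data = list(range(1000))
subset = subsample(all_data, ratio=0.3, seed=0)
```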

Preferably, in step S105, retrieving the plurality of users from the database may further include: retrieving from the database the plurality of users ranked highest by activity.

Preferably, in step S105, inputting the user feature data into the prediction system as the data to be tested and obtaining the prediction result may further include: reducing the dimension of the user feature data to the dynamic one-hot interval with the dynamic one-hot coding mode to generate the data to be tested; inputting the data to be tested into the time prediction model to obtain the probability of the click time enumeration; encoding at least one of the time enumeration numbers with the fixed position one-hot coding mode to generate encoded data; and inputting the generated encoded data together with the data to be tested into the content type prediction model to obtain the probability of the click content type enumeration.

Preferably, the at least one time enumeration number may be determined as follows: sorting the time enumeration numbers by the probability of the click time enumeration; determining the X largest probabilities among the click time enumeration probabilities, and using the time enumeration numbers corresponding to these X probabilities as the at least one time enumeration number, where X is an integer greater than or equal to 1 and less than the total number of time enumeration numbers.

In a preferred implementation, a model optimization scheme may be adopted. That is, for the logistic regression classification models connected in series, the input of the content type prediction model is not the encoded data of all click time enumerations; instead, only the encoded data of the time enumeration numbers corresponding to the X largest click time enumeration probabilities is input to the content type prediction model.

For example, suppose there are 1 million users, 24 time enumerations, and 3 content types. If every combination is computed once with the model, 1,000,000 × 24 × 3 = 72,000,000 computations are required. Because time prediction and content type prediction are split into two different models, the probability of the click time enumeration is obtained first and serves as the input of the content type prediction model. The time enumeration probabilities can therefore be sorted from largest to smallest, and only the X largest (for example, the time enumerations with the 3 largest probabilities) are taken as the input of the content type prediction model. This greatly reduces the amount of computation, to 1,000,000 × 3 × 3 = 9,000,000, a reduction of 87.5%. That is, the output of the time prediction model is the top three click times of the push content by priority (Top3), and the output of the content type prediction model is the type priority of the content to be pushed.

The above preferred embodiment is further described below in conjunction with fig. 8 and 9.

As shown in fig. 8, first, active and/or sub-active users are retrieved from a user library, then, the retrieved users and push channels are converted into user characteristic data through the above characteristic engineering, and then, the user characteristic data are reduced and compressed to a dynamic one-hot interval through a dynamic one-hot mode, and data to be predicted are generated, as shown in fig. 5.

As shown in fig. 9, data to be predicted by the user (user feature data in the figure) is input into LR-Time (Time prediction model), and the probability of the prediction result click Time enumeration is obtained. And then, the Time enumeration number output by the first-stage LR-Time (Time prediction model) can be coded and converted into click Time characteristic data, and then the click Time characteristic data and the data to be predicted of the user are input into an LR-Type (content Type prediction model) together to obtain the probability of content Type enumeration of a prediction result.

Preferably, the click time enumeration numbers output by the first-stage LR-Time (time prediction model) may be sorted by the probability of the click time enumeration, and the X largest probabilities (for example, the top 3) are determined, where X is an integer greater than or equal to 1 and less than the total number of time enumeration numbers; the time enumeration numbers corresponding to these X probabilities are used as the input of the second-stage content type prediction model. That is, only the time enumeration numbers corresponding to the X largest probabilities are encoded with the fixed position one-hot coding mode and converted into click time feature data, as shown in FIG. 3. This data is then input into LR-Type (content type prediction model) together with the user's data to be predicted, yielding the probability of the content type enumeration as the prediction result. As noted above, this greatly reduces the amount of computation and improves computational efficiency and system performance.
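The Top-X cascade can be sketched as follows. The second-stage model is replaced by a stub with made-up probabilities, and the names `top_x_hours`, `predict_types`, and `stub_type_model` are hypothetical; only the selection of the X most probable time enumeration numbers follows the text.

```python
import heapq


def top_x_hours(hour_probs, x=3):
    """Return the X time enumeration numbers with the highest probability, best first."""
    if not 1 <= x < len(hour_probs):
        raise ValueError("x must satisfy 1 <= x < number of time enumerations")
    return heapq.nlargest(x, range(len(hour_probs)), key=lambda h: hour_probs[h])


def predict_types(user_features, hour_probs, type_model, x=3):
    """Run the second-stage model only for the top-X hours instead of all 24."""
    return {h: type_model(user_features, h) for h in top_x_hours(hour_probs, x)}


def stub_type_model(user_features, hour):
    """Stand-in for LR-Type; a real model would use both arguments."""
    return {"a": 0.5, "t": 0.4, "v": 0.1}


# Hypothetical first-stage output for one user.
hour_probs = [0.0] * 24
hour_probs[18], hour_probs[21], hour_probs[22] = 0.5, 0.3, 0.2

result = predict_types({"os_name": "Android"}, hour_probs, stub_type_model)
saving = 1 - (3 * 3) / (24 * 3)  # fraction of second-stage computations avoided
```

With X = 3 the second stage is evaluated for 3 hours rather than 24, which is where the 87.5% reduction in the earlier example comes from.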

Preferably, after the user feature data is input into the prediction system as the data to be tested and the prediction result is obtained in step S105, the method may further include: receiving the prediction result from the prediction system, wherein the prediction result comprises: the user identification information of the plurality of users, and the probability of the click time enumeration and the probability of the click content type enumeration corresponding to each piece of user identification information; for each piece of user identification information, determining the M largest probabilities among the click time enumeration probabilities corresponding to that user identification information, randomly selecting one of these M probabilities, and taking the click time enumeration number corresponding to the selected probability as the message push time of the current user, where M is greater than or equal to 1 and less than or equal to the total number of click time enumeration numbers corresponding to the user identification information; and, for each piece of user identification information, determining the N largest probabilities among the click content type enumeration probabilities corresponding to that user identification information, randomly selecting one of these N probabilities, and taking the click content type enumeration number corresponding to the selected probability as the message push content type of the current user, where N is greater than or equal to 1 and less than or equal to the total number of click content type enumeration numbers corresponding to the user identification information.

In a preferred implementation, M and N are generally chosen to be greater than or equal to 2. This avoids always pushing according to the single most probable click time enumeration number and click content type enumeration, which would lack diversity. Randomly selecting one of several high-probability click time enumeration numbers and one of several high-probability click content type enumerations gives the push greater flexibility and diversity, which can effectively improve the user experience.
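A possible sketch of this selection step is shown below. Whether the draw among the M (or N) candidates is uniform or weighted is not stated at this point in the text; weighting each candidate by its probability is an assumption here, and the function name `pick_from_top` and the sample probabilities are hypothetical.

```python
import random


def pick_from_top(probs, m, rng=random):
    """Keep the M most probable enumeration numbers and draw one of them.

    probs maps enumeration number -> probability. The draw is weighted by
    probability (assumption); a uniform draw would use rng.choice(ranked).
    """
    if not 1 <= m <= len(probs):
        raise ValueError("m must satisfy 1 <= m <= number of enumerations")
    ranked = sorted(probs, key=probs.get, reverse=True)[:m]
    weights = [probs[k] for k in ranked]
    return rng.choices(ranked, weights=weights, k=1)[0]


# Hypothetical prediction result for one user.
time_probs = {18: 0.5, 21: 0.3, 22: 0.2}
type_probs = {"a": 0.5, "t": 0.4, "v": 0.1}

rng = random.Random(7)
push_hour = pick_from_top(time_probs, m=2, rng=rng)
push_type = pick_from_top(type_probs, m=2, rng=rng)
```

With M = N = 1 this degenerates to always picking the most probable value, which is the lack of diversity the paragraph above warns against.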

Preferably, the method may further include: creating a task bucket timer, wherein for each click time enumeration number of all click time enumeration numbers, a task bucket corresponding to the click time enumeration number is set; after the click time enumeration number corresponding to the selected probability is used as the message push time of the current user, the method may further include: and placing the identification information of the current user into the task bucket of the click time enumeration number corresponding to the selected probability.

Preferably, after the identifier information of the current user is placed in the task bucket of the click time enumeration number corresponding to the selected probability, the method may further include: configuring a message pushing table according to the created task bucket, the click content type enumeration number corresponding to the probability selected for each user and the pushing channel information corresponding to each user; and pushing the message for each user by the pushing engine according to the click time enumeration number and the click content type enumeration number in the message pushing table.

In a preferred implementation process, the Push task engine obtains the click time priority Top3 of the Push content and the type priority of the content to be pushed from the prediction system. The first parameter determines which time period the content is to be pushed to the user and the second parameter determines the type of content to be pushed to the user. Because the Push is carried out according to time periods, the Push task engine carries out barrel division according to time for the user to be pushed, and the Push task engine needs to carry out three steps:

step 1, a task bucket timer is established.

The results from the prediction system are user list data comprising:

user id, push time Top3, push content type, push channel

Dividing each hour into one task bucket for 24 hours, randomly selecting a user list according to the probability of 3 pushing times, and putting the user into a corresponding time bucket:

for example, three push times for user a are as follows:

when the push time 1 is 18, the probability is 0.5

When the push time 2 is 21, the probability is 0.3

When the push time 3 is 22, the probability is 0.2

When the randomly selected push time according to the probabilistic Random function Random (0.5,0.3,0.2) is 18, the user a will be put into the time bucket at 18 hours for sending:

hour 0 => [user 1, user 2, user 3]

...

hour 18 => [user 4, user 5, user a]

...

hour 23 => [...]
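Step 1 can be illustrated with a minimal Python sketch. It assumes the prediction result is available as a mapping from user id to a list of (hour, probability) pairs, and it uses the standard library's weighted sampling as a stand-in for the Random function above. The names `pick_push_hour`, `build_time_buckets`, and `predictions` are hypothetical and do not come from the original disclosure.

```python
import random

def pick_push_hour(top_times, rng):
    """Draw one hour from [(hour, probability), ...], weighted by probability."""
    hours = [hour for hour, _ in top_times]
    weights = [prob for _, prob in top_times]
    return rng.choices(hours, weights=weights, k=1)[0]

def build_time_buckets(predictions, rng=None):
    """Place each user id into the task bucket of the hour drawn for that user."""
    rng = rng or random.Random()
    buckets = {hour: [] for hour in range(24)}  # one task bucket per hour
    for user_id, top_times in predictions.items():
        buckets[pick_push_hour(top_times, rng)].append(user_id)
    return buckets

# Assumed input shape: user id -> Top3 push times with their probabilities.
predictions = {"user a": [(18, 0.5), (21, 0.3), (22, 0.2)]}
buckets = build_time_buckets(predictions, random.Random(0))
```

Because the draw is random, user a lands in bucket 18 about half the time and in bucket 21 or 22 otherwise; passing a seeded `random.Random` only makes a run repeatable.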

Step 2: match the task content in each bucket. The content to be pushed to each user is obtained from a content engine (by calling a batch interface) according to the content type priority.

For example, the probabilities of the push content types for user a are as follows:

type a has probability 0.5

type t has probability 0.4

type v has probability 0.1

If the push content type selected at random by the probabilistic random function Random(0.5, 0.4, 0.1) is type t, content of type t found in the content pool is sent to user a.
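The content matching in step 2 might look like the following sketch. The content engine and its batch interface are not specified in the text, so an in-memory dictionary `content_pool` stands in for them here; that dictionary, the function names, and the rule of taking the first available item of the chosen type are all assumptions made for illustration.

```python
import random

def pick_content_type(type_probs, rng):
    """Draw one content type from {type: probability}, weighted by probability."""
    types = list(type_probs)
    weights = [type_probs[t] for t in types]
    return rng.choices(types, weights=weights, k=1)[0]

def match_content(bucket_users, type_probs_by_user, content_pool, rng=None):
    """For each user in a bucket, draw a content type and look up one item of it.

    content_pool is a hypothetical stand-in for the content engine: it maps a
    content type to a list of candidate items.
    """
    rng = rng or random.Random()
    matched = {}
    for user_id in bucket_users:
        content_type = pick_content_type(type_probs_by_user[user_id], rng)
        items = content_pool.get(content_type, [])
        matched[user_id] = (content_type, items[0] if items else None)
    return matched

type_probs_by_user = {"user a": {"a": 0.5, "t": 0.4, "v": 0.1}}
content_pool = {"a": ["a-1"], "t": ["t-1"], "v": ["v-1"]}
matched = match_content(["user a"], type_probs_by_user, content_pool, random.Random(1))
```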

Step 3: push the content to the end user through the push channel of the terminal manufacturer.

Through the above steps, a large schedule table (that is, the configured message pushing table) is obtained:

hour 0 => [{id: user 1, content: a, channel: 1}, {id: user 2, content: t, channel: 1}, {id: user 3, content: v, channel: 2}]

...

hour 18 => [{id: user 4, content: a, channel: 1}, {id: user 5, content: a, channel: 3}, {id: user a, content: t, channel: 2}]

...

hour 23 => [...]
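A schedule of this shape could be assembled as in the sketch below, which assumes that the time buckets, the content type chosen for each user, and each user's push channel are already known. The function names and the dictionary layout are illustrative assumptions; the entries reuse the hour-18 values from the example above.

```python
def build_push_table(buckets, content_type_by_user, channel_by_user):
    """Combine time buckets, chosen content types and channels into one schedule."""
    table = {}
    for hour, user_ids in buckets.items():
        table[hour] = [
            {
                "id": user_id,
                "content": content_type_by_user[user_id],
                "channel": channel_by_user[user_id],
            }
            for user_id in user_ids
        ]
    return table

def due_entries(table, hour):
    """Return the entries the push engine would send when the given hour arrives."""
    return table.get(hour, [])

buckets = {18: ["user 4", "user 5", "user a"]}
content_type_by_user = {"user 4": "a", "user 5": "a", "user a": "t"}
channel_by_user = {"user 4": 1, "user 5": 3, "user a": 2}
push_table = build_push_table(buckets, content_type_by_user, channel_by_user)
```

A timer that fires once per hour would then only need to call `due_entries(push_table, hour)` and hand each entry to the corresponding push channel.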

The Push engine pushes the messages that users actually need at appropriate times according to this schedule, which reduces the sense of harassment felt by users and thereby optimizes the message pushing system.

Fig. 10 is a general flowchart of a click feedback method for push messages in a Push system according to a preferred embodiment of the present invention. As shown in Fig. 10, the click feedback method for push messages in the Push system mainly includes the following steps:

S1001: apply feature engineering to the user data, select features that characterize the user, and output user feature data. The feature engineering yields the subsequent training data and data to be tested.

S1003: collect the user's click feedback on push messages, combine it with the user feature data to convert the behavior records into training data for a neural network, and train the constructed prediction system. For example, a click behavior log of a user clicking on a push message may be obtained, and log data may be extracted from the user click behavior log, wherein the log data comprises: click time, click content type, and characteristic information associated with click time and/or click content type. The characteristic information associated with the click time and/or the click content type is converted into user feature data, and the user feature data is then reduced in dimension to a dynamic one-hot interval by a dynamic one-hot coding mode.

S1005: retrieve users according to their activity attributes, convert them into data to be predicted by combining with the user feature data, and input the data into the prediction system to obtain the parameters (time and content type) of the push message. For example, the plurality of users ranked highest in activity are retrieved from a user profile database, their user feature data is obtained, and the user feature data is reduced in dimension and compressed to a dynamic one-hot interval by the dynamic one-hot coding mode to generate the data to be tested; the data to be tested is input into the first-stage time prediction model of the prediction system to obtain the probability of the click time enumeration; the time enumeration numbers are sorted in order of the probability of the click time enumeration; the first X large probabilities in the probability of the click time enumeration are determined, and the time enumeration numbers corresponding to the first X large probabilities are encoded by a fixed position one-hot encoding mode to generate encoded data; and the generated encoded data and the data to be tested are input into the content type prediction model to obtain the probability of the click content type enumeration.

S1007: submit the parameters of the push message to a recommendation system; the recommendation system extracts appropriate content from a content library and pushes the personalized message to the user through a message channel. For example, the Push engine receives the prediction result from the prediction system, determines, for each piece of the user identification information, the first M large probabilities from the enumerated probabilities of the click time corresponding to the user identification information, randomly selects one probability from the first M large probabilities, and takes the enumerated number of the click time corresponding to the selected probability as the message push time of the current user, where M is greater than or equal to 1, and M is less than or equal to the total number of the enumerated numbers of the click time corresponding to the user identification information; for each piece of user identification information, the Push engine determines the first N large probabilities from the enumerated probabilities of the click content types corresponding to the user identification information, randomly selects one probability from the first N large probabilities, and takes the enumerated number of the click content types corresponding to the selected probability as the message push content type of the current user, where N is greater than or equal to 1, and N is less than or equal to the total number of the enumerated numbers of the click content types corresponding to the user identification information.
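The selection rule in S1007 can be sketched as follows. The text only says that one probability is selected at random from the first M large probabilities; this sketch assumes the draw is weighted by those probabilities, in line with the Random(0.5, 0.3, 0.2) example given earlier, and a uniform draw would be an equally valid reading. The function name `select_enum` and the dictionary input are hypothetical.

```python
import random

def select_enum(probs, m, rng=None):
    """Keep the m largest probabilities, then draw one enumeration number.

    probs maps an enumeration number (an hour or a content type) to its
    predicted probability. The draw is weighted by probability, which is an
    assumption; the source only says the selection is random.
    """
    if not 1 <= m <= len(probs):
        raise ValueError("m must satisfy 1 <= m <= number of enumeration numbers")
    rng = rng or random.Random()
    top = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:m]
    enums = [enum for enum, _ in top]
    weights = [prob for _, prob in top]
    return rng.choices(enums, weights=weights, k=1)[0]

time_probs = {9: 0.0, 18: 0.5, 21: 0.3, 22: 0.2}
push_hour = select_enum(time_probs, m=3, rng=random.Random(0))
```

The same function serves for the content type by passing the content type probabilities and N in place of M.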

Fig. 11 is a block diagram illustrating a click feedback device for push messages according to an embodiment of the present invention. As shown in Fig. 11, the click feedback device for push messages includes: an information collecting module 10, configured to collect click feedback information of a user on a push message, where the click feedback information includes: click time information, click content type information, and characteristic information associated with click time and/or click content type; a system training module 12, configured to generate training data by using the click feedback information and to train a prediction system according to the generated training data; and a result obtaining module 14, configured to retrieve multiple users from a database, obtain user feature data corresponding to the multiple users, and input the user feature data as data to be tested into the prediction system to obtain a prediction result, where the prediction result includes: probability of click time enumeration, probability of click content type enumeration.

In the click feedback device for push messages provided by the embodiment of the present invention, the information collecting module 10 collects click feedback information of a user on the push message; the system training module 12 generates training data by using the click feedback information and trains a prediction system according to the generated training data; and the result obtaining module 14 inputs the user feature data corresponding to the plurality of users retrieved from the database into the prediction system to obtain a prediction result, where the prediction result includes: probability of click time enumeration, probability of click content type enumeration. Taking into account the user's usage habits, the user's experience, and the time and occasion at which messages are opened, the messages the user actually needs are pushed at an appropriate time according to the prediction result, so that users are targeted accurately and receive messages in the time period in which they expect to receive them.

In a preferred implementation, the user feature data is input into the prediction system as the data to be tested, and the prediction result is obtained. Specifically, the user feature data is reduced in dimension to a dynamic one-hot interval by a dynamic one-hot coding mode to generate the data to be tested; the data to be tested is input into the time prediction model to obtain the probability of the click time enumeration; at least one time enumeration number among the time enumeration numbers is encoded by a fixed position one-hot encoding mode to generate encoded data; and the generated encoded data and the data to be tested are input into the content type prediction model to obtain the probability of the click content type enumeration.

The at least one time enumeration number is determined as follows: the time enumeration numbers are sorted in order of the probability of the click time enumeration, the first X large probabilities in the probability of the click time enumeration are determined, and the time enumeration numbers corresponding to the first X large probabilities are taken as the at least one time enumeration number, wherein X is an integer greater than or equal to 1 and less than the total number of time enumeration numbers.
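The two-stage flow described above can be sketched in Python under several stated assumptions. The two models are passed in as plain callables that return a list of probabilities, so the stubs below are not the trained models of the disclosure. The top-X hours are encoded here as a single 24-position vector with a 1 at each selected hour, which is one possible reading of the fixed position one-hot encoding. The names `top_x`, `one_hot_fixed`, and `predict_two_stage` are hypothetical.

```python
def top_x(probs, x):
    """Return the indices of the x largest probabilities, largest first."""
    return sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:x]

def one_hot_fixed(indices, size):
    """Fixed-position encoding: position i is 1 if enumeration number i is selected."""
    vec = [0] * size
    for i in indices:
        vec[i] = 1
    return vec

def predict_two_stage(features, time_model, content_model, x=3, n_hours=24):
    """Stage 1 predicts click-time probabilities; stage 2 predicts content types
    from the same features plus the encoded top-x hours."""
    time_probs = time_model(features)
    top_hours = top_x(time_probs, x)
    encoded_hours = one_hot_fixed(top_hours, n_hours)
    content_probs = content_model(features + encoded_hours)
    return time_probs, top_hours, content_probs

# Stub models for illustration only (assumptions, not the trained classifiers).
def stub_time_model(features):
    probs = [0.0] * 24
    probs[18], probs[21], probs[22] = 0.5, 0.3, 0.2
    return probs

def stub_content_model(inputs):
    return [0.5, 0.4, 0.1]  # probabilities for types a, t, v

time_probs, top_hours, content_probs = predict_two_stage(
    [1, 0, 1], stub_time_model, stub_content_model
)
```

Feeding only the encoded top-X hours to the second stage, rather than every hour separately, keeps the second model's input small, which matches the stated aim of reducing the amount of prediction computation.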

Preferably, as shown in fig. 11, the apparatus may further include: a result receiving module 16, configured to receive a prediction result from the prediction system, where the prediction result includes: user identification information corresponding to the plurality of users, probability of click time enumeration corresponding to each user identification information, and probability of click content type enumeration; a first selecting module 18, configured to determine, for each piece of the user identification information, first M large probabilities from the enumerated probabilities of the click time corresponding to the piece of user identification information, randomly select one probability from the first M large probabilities, and use an enumerated number of the click time corresponding to the selected probability as a message pushing time of a current user, where M is greater than or equal to 1, and M is less than or equal to a total amount of the enumerated numbers of the click time corresponding to the piece of user identification information; a second selecting module 20, configured to determine, for each piece of the user identification information, the first N large probabilities from the enumerated probabilities of the click content types corresponding to the user identification information, randomly select one probability from the first N large probabilities, and use the enumerated number of the click content types corresponding to the selected probability as the message push content type of the current user, where N is greater than or equal to 1, and N is less than or equal to the total enumerated number of the click content types corresponding to the user identification information.

It should be noted that, for the preferred manner in which the modules of the click feedback device for push messages work together, reference may be made to the descriptions of Fig. 1 to Fig. 10; details are not repeated here.

In summary, the above embodiments of the present invention provide a click feedback mechanism for push messages, optimize the approach to and implementation of the push channel, and estimate the key parameters of push message content, thereby realizing a strategy for personalized push message content. Moreover, a two-stage serial prediction system is constructed, which makes the prediction system more flexible. Only the coded data of the time enumeration numbers corresponding to the first few large probabilities among the click time enumeration probabilities is input into the content type prediction model, which greatly reduces the amount of prediction computation and improves processing capacity.

The above disclosure covers only a few specific embodiments of the present invention; the present invention is not limited thereto, and any variation that a person skilled in the art can conceive shall fall within the scope of protection of the present invention.

Claims

1. A click feedback method for a push message, characterized by comprising the following steps:

collecting click feedback information of a user on a push message, wherein the click feedback information comprises: click time information, click content type information, characteristic information associated with click time and/or click content type;

generating training data by adopting the click feedback information, and training a prediction system according to the generated training data;

retrieving a plurality of users from a database, obtaining user characteristic data corresponding to the plurality of users, and inputting the user characteristic data into the prediction system as data to be tested to obtain a prediction result, wherein the prediction result comprises: probability of click time enumeration, probability of click content type enumeration.

2. The method of claim 1, wherein collecting user click feedback information for push messages comprises:

acquiring a click behavior log of a user clicking a push message;

extracting log data in the user click behavior log, wherein the log data comprises: click time, click content type, characteristic information associated with click time and/or click content type.

3. The method of claim 1, wherein generating training data using the click feedback information comprises:

converting the click time in the log data into click time enumeration, and then coding the click time enumeration by adopting a fixed position one-hot coding mode;

converting the click content type in the log data into click content type enumeration, and then coding the click content type enumeration by adopting a fixed position one-hot coding mode;

and converting the characteristic information associated with the click time and/or the click content type into user characteristic data, and then reducing the dimension of the user characteristic data to a dynamic one-hot interval by adopting a dynamic one-hot coding mode.

4. The method of claim 1, wherein the prediction system comprises:

two logistic regression classification models connected in series, wherein the first logistic regression classification model is a time prediction model, and the second logistic regression classification model is a content type prediction model.

5. The method of claim 4, wherein training a prediction system based on the generated training data comprises:

inputting coded data for coding the user characteristic data in the training data and coded data for coding click time enumeration in the training data into the time prediction model, and training the time prediction model;

and inputting coded data for coding the user characteristic data in the training data, coded data for coding the click content type enumeration in the training data, and coded data for coding the click time enumeration number in the training data into the content type prediction model to train the content type prediction model.

6. The method of claim 4, wherein inputting the user characteristic data into the prediction system as data to be tested, obtaining a prediction result comprises:

reducing the dimension of the user characteristic data to a dynamic one-hot interval by adopting a dynamic one-hot coding mode, and generating the data to be tested;

inputting the data to be tested into the time prediction model to obtain the probability of the click time enumeration;

encoding at least one time enumeration number in the time enumeration numbers by adopting a fixed position one-hot encoding mode to generate encoded data;

and inputting the generated coded data and the data to be tested into the content type prediction model to obtain the probability of the click content type enumeration.

7. The method of claim 6, wherein at least one of the time enumerations is determined by:

sequencing the time enumeration numbers according to the sequence of the probability of the click time enumeration;

determining the first X big probabilities in the click time enumeration probabilities, and taking the time enumeration numbers corresponding to the first X big probabilities as the at least one time enumeration number, wherein X is an integer which is greater than or equal to 1 and smaller than the total amount of the time enumeration numbers.

8. The method of claim 1, wherein inputting the user characteristic data as data to be tested into the prediction system, and after obtaining the prediction result, further comprising:

receiving a prediction result from the prediction system, wherein the prediction result comprises: user identification information respectively corresponding to the plurality of users, probability of click time enumeration respectively corresponding to each user identification information, and probability of click content type enumeration;

for each piece of user identification information, determining the first M large probabilities from the enumerated probabilities of the click time corresponding to the user identification information, randomly selecting one probability from the first M large probabilities, and taking the enumerated number of the click time corresponding to the selected probability as the message pushing time of the current user, wherein M is greater than or equal to 1 and less than or equal to the total number of the enumerated numbers of the click time corresponding to the user identification information;

for each piece of user identification information, determining the first N large probabilities from the enumerated probabilities of the click content types corresponding to the user identification information, randomly selecting one probability from the first N large probabilities, and taking the enumerated numbers of the click content types corresponding to the selected probabilities as the message push content types of the current user, wherein N is greater than or equal to 1, and N is less than or equal to the total number of the enumerated numbers of the click content types corresponding to the user identification information.

9. The method of claim 8, further comprising: creating a task bucket timer, wherein for each click time enumeration number of all click time enumeration numbers, a task bucket corresponding to the click time enumeration number is set;

after the click time enumeration number corresponding to the selected probability is used as the message pushing time of the current user, the method further comprises the following steps: and placing the identification information of the current user into a task bucket of the click time enumeration number corresponding to the selected probability.

10. The method of claim 9, wherein after placing the identification information of the current user into the task bucket of the click time enumeration number corresponding to the selected probability, further comprising:

configuring a message pushing table according to the created task bucket, the click content type enumeration number corresponding to the probability selected for each user and the pushing channel information corresponding to each user;

and pushing the message for each user by a pushing engine according to the click time enumeration number and the click content type enumeration number in the message pushing table.

11. The method of any one of claims 1 to 10, wherein retrieving a plurality of users from a database comprises: retrieving from the database the plurality of users ranked highest in activity.

12. The method of any of claims 1-10, wherein the characteristic information associated with the click time and/or click content type comprises at least one of:

information on the region where the user is located, device screen information, browser type information, operating system type information, device manufacturer information, device type information, device name information, and push channel information.

13. A click feedback device for a push message, comprising:

the information collection module is used for collecting click feedback information of a user on the push message, wherein the click feedback information comprises: click time information, click content type information, characteristic information associated with click time and/or click content type;

the system training module is used for generating training data by adopting the click feedback information and training a prediction system according to the generated training data;

the result obtaining module is used for retrieving a plurality of users from a database, obtaining user characteristic data corresponding to the plurality of users, and inputting the user characteristic data into the prediction system as data to be tested to obtain a prediction result, wherein the prediction result comprises: probability of click time enumeration, probability of click content type enumeration.

14. The apparatus of claim 13, wherein the prediction system comprises: two logistic regression classification models connected in series, wherein the first logistic regression classification model is a time prediction model, and the second logistic regression classification model is a content type prediction model.

15. The apparatus of claim 13, further comprising:

a result receiving module, configured to receive a prediction result from the prediction system, where the prediction result includes: user identification information respectively corresponding to the plurality of users, probability of click time enumeration respectively corresponding to each user identification information, and probability of click content type enumeration;

a first selection module, configured to determine, for each piece of the user identification information, first M large probabilities from click time enumerated probabilities corresponding to the piece of user identification information, randomly select one probability from the first M large probabilities, and use a click time enumerated number corresponding to the selected probability as a message push time of a current user, where M is greater than or equal to 1, and M is less than or equal to a total amount of click time enumerated numbers corresponding to the piece of user identification information;

and a second selection module, configured to determine, for each piece of the user identification information, the first N large probabilities from the enumerated probabilities of the click content types corresponding to the user identification information, randomly select one probability from the first N large probabilities, and use an enumerated number of the click content types corresponding to the selected probability as a message push content type of the current user, where N is greater than or equal to 1, and N is less than or equal to a total amount of the enumerated numbers of the click content types corresponding to the user identification information.