CN114820005A

CN114820005A - User retention prediction processing method and device, computer equipment and storage medium

Info

Publication number: CN114820005A
Application number: CN202110075537.4A
Authority: CN
Inventors: 李沁妤
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2021-01-20
Filing date: 2021-01-20
Publication date: 2022-07-29
Anticipated expiration: 2041-01-20
Also published as: CN114820005B

Abstract

The present application relates to the field of Internet technologies, and provides a user retention prediction processing method and device, computer equipment, and a storage medium. The method includes: obtaining, according to the user retention data that each historical time node has at its respective subsequent time nodes, the accumulated user retention data corresponding to each historical time node at those subsequent time nodes; obtaining, based on the accumulated user retention data, the change trend of the accumulated user retention data presented between adjacent time nodes; and then obtaining the predicted user retention data corresponding to a historical time node at the prediction time node from the change trend and the accumulated user retention data of that historical time node at the time node preceding the prediction time node. The method can fully mine and utilize the change pattern of retention data represented by historical user retention data to accurately predict future user retention data, improving the accuracy and stability of user retention data prediction.

Description

User retention prediction processing method and device, computer equipment and storage medium

Technical Field

The present application relates to the field of internet technologies, and in particular, to a user retention prediction processing method and apparatus, a computer device, and a storage medium.

Background

With the development of internet technology, analysis and processing technologies such as user access behavior analysis and user retention prediction have emerged. In internet technology, a user who starts to use a certain application or application service within a certain period of time and continues to use it after a certain period of time can be regarded as a retained user. The proportion of such retained users among the users newly added in that period is the retention rate, which is generally counted in time units such as days, weeks, or months.

In the current technology, user retention data is usually predicted by directly fitting the user retention data of the current year to the user retention data of the previous year, but this approach suffers from low prediction accuracy.

Disclosure of Invention

In view of the foregoing, it is desirable to provide a user retention prediction processing method, apparatus, computer device and storage medium for solving the above technical problems.

A user retention prediction processing method, the method comprising:

acquiring user retention data of each historical time node in each subsequent time node;

obtaining accumulated user retention data corresponding to each historical time node at each subsequent time node according to the user retention data of each historical time node at each subsequent time node;

acquiring a change trend which is presented between adjacent time nodes and is related to the accumulated user retention data based on the accumulated user retention data corresponding to each historical time node at each subsequent time node;

and acquiring the predicted user retention data corresponding to the historical time node at the predicted time node according to the change trend and the accumulated user retention data of the historical time node at the previous time node of the predicted time node.

A user retention prediction processing apparatus, the apparatus comprising:

the retention data acquisition module is used for acquiring user retention data of each historical time node in each subsequent time node;

the accumulated data acquisition module is used for acquiring the accumulated user retention data corresponding to each historical time node at each subsequent time node according to the user retention data of each historical time node at each subsequent time node;

the change trend acquisition module is used for acquiring the change trend of the accumulated user retention data presented between adjacent time nodes based on the accumulated user retention data corresponding to each historical time node in each subsequent time node;

and the retention data prediction module is used for acquiring the predicted user retention data corresponding to the historical time node at the prediction time node according to the change trend and the accumulated user retention data of the historical time node at the previous time node of the prediction time node.

A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:

acquiring user retention data of each historical time node in each subsequent time node; obtaining accumulated user retention data corresponding to each historical time node at each subsequent time node according to the user retention data of each historical time node at each subsequent time node; acquiring a change trend which is presented between adjacent time nodes and is related to the accumulated user retention data based on the accumulated user retention data corresponding to each historical time node at each subsequent time node; and acquiring the predicted user retention data corresponding to the historical time node at the predicted time node according to the change trend and the accumulated user retention data of the historical time node at the previous time node of the predicted time node.

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of the user retention prediction processing method described above.

A computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions are read by a processor of the computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps of the method described above.

According to the user retention prediction processing method, the device, the computer equipment and the storage medium, the accumulated user retention data corresponding to each historical time node at each subsequent time node can be obtained according to the user retention data of each historical time node at each subsequent time node, the change trend of the accumulated user retention data presented between adjacent time nodes is obtained based on the accumulated user retention data, and the predicted user retention data corresponding to the historical time node at the predicted time node is obtained according to the change trend and the accumulated user retention data of the corresponding historical time node at the previous time node of the predicted time node. According to the scheme, the accumulated data corresponding to each subsequent time node is obtained by respectively accumulating the user retention data of each historical time node, the change trend or change rule of the accumulated data between adjacent time nodes is further known through the accumulated data, and the corresponding user retention data can be reversely deduced after the accumulated data of the predicted time nodes is predicted by utilizing the change trend or change rule, so that the retention data change rule represented by the user retention data for many years can be fully mined and utilized to accurately predict the future user retention data, and the prediction accuracy and the prediction stability of the user retention data are improved.

Drawings

FIG. 1 is a diagram of an application environment for a user retention prediction processing method in one embodiment;

FIG. 2 is a flow diagram that illustrates a method for user retention prediction processing in one embodiment;

FIG. 3 is a flow diagram that illustrates the step of obtaining trends in changes that are present between neighboring time nodes with respect to accumulated user retention data, in one embodiment;

FIGS. 4(a)-4(d) are schematic diagrams of the handling of user retention data in some embodiments;

FIG. 5 is a flowchart illustrating a step of obtaining an active user number in one embodiment;

FIG. 6 is a flowchart illustrating a user retention prediction processing method according to another embodiment;

FIG. 7 is a block diagram of a user retention prediction processing apparatus in one embodiment;

FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

The user retention prediction processing method provided by the present application can be applied to the application environment shown in fig. 1. The method can be executed by the server 120, which can be implemented as an independent server or as a server cluster composed of a plurality of servers. The server 120 may provide background services for application programs, and different users may download and use the application programs on their respective terminals. In the present application, the server 120 may predict, according to the user retention data that each historical time node 11 has at each subsequent time node, the user retention data corresponding to each historical time node at one or more future predicted time nodes 13. The user retention data may be a user retention rate or a specific number of retained users. Because the data processing is the same in both cases, the remainder of the present application describes the scheme with the user retention rate as the user retention data, for clarity.

In addition, after the server 120 predicts the user retention data corresponding to the predicted time node 13, it may further predict, for example, the number of daily active users (DAU, Daily Active User) and the total duration for a certain future year based on the predicted user retention data. The predicted number of daily active users and total duration may be used to provide reliable data support for related services during annual resource planning.

The following describes the user retention prediction processing method provided by the present application in detail with reference to the embodiments and the accompanying drawings.

In one embodiment, as shown in fig. 2, a user retention prediction processing method is provided, which is exemplified by the application of the method to the server 120 in fig. 1, and the method may include the following steps:

Step S201, user retention data of each historical time node in each subsequent time node is obtained;

Specifically, the time nodes may be divided or set according to the time granularity required by the prediction, where the time granularity may be hours, days, weeks, months, years, and the like. For example, each day counted from the release of the application may be regarded as a time node.

For example, taking months as the time granularity, if the current time node is December 2020, then November 2020, …, January 2020, and so on may be used as historical time nodes. Each historical time node has new users, and as time goes by, the new users of each historical time node may be partially or completely retained at subsequent time nodes. The server 120 may obtain the user retention rate of each historical time node at each subsequent time node according to the number of that node's new users retained at each subsequent time node; that is, the server 120 determines what proportion of the new users of each historical time node is retained at the subsequent time nodes. As for the subsequent time nodes, still taking months as the time granularity, for the historical time node of January 2020, the subsequent time nodes may include subsequent months 0, 1, 2, 3, …, and the user retention data of this historical time node specifically means the retention rates, in subsequent months 0, 1, 2, 3, … after January 2020, of the new users added in January 2020.

In practical applications, the number of daily or monthly active users may be composed of the users retained from each historical time node (for example, each historical month or each historical day). If the historical time nodes are counted from the release day of the application, the number of daily active users of the application is mainly composed of the retained new users of each historical day. The user retention rate of each historical time node at one or more future predicted time nodes may therefore be predicted based on the user retention rate of each historical time node at each subsequent time node, and the number of active users at the predicted time node may be further calculated from the prediction result.

Step S202, according to the user retention data of each historical time node in each subsequent time node, obtaining the accumulated user retention data of each historical time node in each subsequent time node;

The server 120 calculates the accumulated user retention data separately for each historical time node at each subsequent time node, where the accumulated user retention data corresponding to a subsequent time node is the sum of the user retention data at that subsequent time node and at all subsequent time nodes before it.

In one embodiment, step S202 may include:

For each historical time node, the server 120 accumulates the user retention data of the subsequent time nodes in the forward direction of the time sequence, obtaining the accumulated user retention data of each historical time node at each subsequent time node.

For example, for the historical time node of September 2019, suppose its subsequent time nodes include 0, 1 and 2, and the corresponding user retention data are 100%, 57.26% and 51.73%, respectively. The forward direction of the time sequence is the direction from subsequent time node 0 to subsequent time node 2, so the accumulated user retention data of this historical time node is 100% at subsequent time node 0, 157.26% at subsequent time node 1, and 208.99% at subsequent time node 2. In this embodiment, the server 120 may perform the same processing on each historical time node to obtain the accumulated user retention data corresponding to each historical time node at each subsequent time node.
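The accumulation in this example can be sketched in a few lines of Python. This is a minimal illustration rather than the application's implementation; the function name `cumulative_retention` is hypothetical, and the input values are the September 2019 figures quoted above.

```python
from itertools import accumulate


def cumulative_retention(retention):
    """Running sum of per-node retention rates, taken in time order."""
    return list(accumulate(retention))


# Retention (in percent) of the September 2019 cohort at subsequent nodes 0, 1, 2.
sept_2019 = [100.0, 57.26, 51.73]
cumulative = cumulative_retention(sept_2019)
print([round(v, 2) for v in cumulative])  # [100.0, 157.26, 208.99]
```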

Step S203, acquiring a change trend of accumulated user retention data presented between adjacent time nodes based on the accumulated user retention data corresponding to each historical time node in each subsequent time node;

In this step, based on the accumulated user retention data corresponding to each historical time node at each subsequent time node, the server 120 obtains the change trend of the accumulated user retention data presented between each pair of adjacent subsequent time nodes. The change trend may reflect how the accumulated user retention data of the historical time nodes as a whole changes year by year, month by month, or day by day.

Step S204, according to the change trend and the accumulated user retention data of the historical time node in the previous time node of the predicted time node, obtaining the predicted user retention data corresponding to the historical time node in the predicted time node.

In this step, the server 120 may calculate the predicted accumulated user retention data of each historical time node at the predicted time node according to the change trend of the accumulated user retention data presented between adjacent time nodes, combined with the accumulated user retention data of that historical time node at the time node preceding the predicted time node. The server 120 then subtracts the accumulated user retention data of the historical time node at the preceding time node from the predicted accumulated user retention data to obtain the predicted user retention data at the predicted time node.

For example, suppose the predicted time node is January 2021 and the historical time node is December 2020. According to the change trend of the accumulated user retention data presented between subsequent month 0 and subsequent month 1, combined with the accumulated user retention rate of the December 2020 historical time node at the time node preceding January 2021 (i.e., December 2020 itself, where it is 100%), the server 120 may calculate that the predicted accumulated user retention rate of the December 2020 historical time node in January 2021 is, for example, 158.9%. The server 120 may then subtract the 100% accumulated user retention rate from the 158.9% predicted accumulated user retention rate, giving a predicted user retention rate of 58.9% for the December 2020 historical time node in January 2021. In this way, the server 120 can obtain the predicted user retention rate of each historical time node at the predicted time node. The server 120 can also roll the estimate forward to time nodes after the predicted time node; that is, having predicted January 2021, the server 120 can go on to predict the user retention rates for January through December 2021. In specific applications, the server 120 can estimate the user retention rate of each historical day over the next 365 days; the larger the data volume, the more stable and accurate the estimate, and the more accurate and stable the estimated number of active users.
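The arithmetic of this example, and the rolling estimate over later time nodes, can be sketched as follows. The helper names `predict_next` and `roll_forward` are hypothetical, and the ratio 1.589 is an assumed value chosen only so that the result reproduces the 158.9% quoted in the text; the source does not state the ratio itself.

```python
def predict_next(prev_cumulative, ratio):
    """Return (predicted cumulative value, predicted retention) for the next node."""
    predicted_cumulative = ratio * prev_cumulative
    return predicted_cumulative, predicted_cumulative - prev_cumulative


def roll_forward(last_cumulative, ratios):
    """Apply successive ratios to roll the prediction over several future nodes."""
    retention = []
    current = last_cumulative
    for ratio in ratios:
        current, step = predict_next(current, ratio)
        retention.append(step)
    return retention


# December 2020 cohort: cumulative retention is 100% at subsequent month 0.
# The ratio 1.589 is an assumption made to match the 158.9% in the text.
cum_jan, rate_jan = predict_next(100.0, 1.589)
print(round(cum_jan, 1), round(rate_jan, 1))  # 158.9 58.9
```

Passing a list of ratios to `roll_forward` yields one predicted retention value per future node, which corresponds to the rolling estimate for January through December 2021 described above.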

According to the user retention prediction processing method, the server 120 can obtain the accumulated user retention data corresponding to each historical time node at each subsequent time node according to the user retention data of each historical time node at each subsequent time node, obtain the change trend of the accumulated user retention data presented between adjacent time nodes based on the accumulated user retention data, and obtain the predicted user retention data corresponding to the historical time node at the predicted time node according to the change trend and the accumulated user retention data of the corresponding historical time node at the previous time node of the predicted time node. According to the scheme, the server 120 accumulates the user retention data of each historical time node to obtain the accumulated data corresponding to each subsequent time node, the server 120 further learns the change trend or change rule presented between adjacent time nodes through the accumulated data, and then the corresponding user retention data can be reversely deduced after the accumulated data of the predicted time nodes are predicted by utilizing the change trend or change rule, so that the future user retention data can be predicted accurately by fully mining and utilizing the retention data change rule represented by the user retention data for many years, and the prediction accuracy and the prediction stability of the user retention data are improved.

In one embodiment, the server 120 may obtain the trend of change regarding the accumulated user retention data presented between adjacent time nodes in the following manner, as shown in fig. 3 and described in conjunction with fig. 4(a) to 4(c), and the step S203 may include:

Step S301, for each subsequent time node, determining a data statistics range related to the accumulated user retention data, which is suitable for each subsequent time node.

It is assumed that the historical time nodes include T1 to T4, and the subsequent time nodes include t0 to t3. For example, the historical time nodes T1 to T4 may correspond to January to April 2020, and the subsequent time nodes t0 to t3 may correspond to subsequent months 0, 1, 2 and 3. X10 to X40 shown in fig. 4(a) represent the user retention rates of the respective historical time nodes at the respective subsequent time nodes; for example, the user retention rate X10 represents the user retention rate of the historical time node T1 at the subsequent time node t0. By performing accumulation for each of the historical time nodes T1 to T4, the server 120 can calculate the accumulated user retention rate of each historical time node at each subsequent time node, where L10 to L40 shown in fig. 4(b) are the accumulated user retention rates of the historical time nodes T1 to T4 at the subsequent time nodes t0 to t3.

In this step, the server 120 determines, for each of the subsequent time nodes t1 to t3, the corresponding data statistical range of the accumulated user retention data. Specifically, the data statistical range determined for each of the subsequent time nodes t1 to t3 includes a first data statistical range at that subsequent time node and a second data statistical range at the subsequent time node immediately preceding it, and the two ranges are the same size. As shown in fig. 4(b), for the subsequent time node t1, the first data statistical range is the column range occupied by L11, L21 and L31, and the second data statistical range is the column range occupied by L10, L20 and L30; for the subsequent time node t2, the first data statistical range is the column range occupied by L12 and L22, and the second data statistical range is the column range occupied by L11 and L21; for the subsequent time node t3, the first data statistical range is the column range occupied by L13, and the second data statistical range is the column range occupied by L12.
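One way to read the range selection in fig. 4(b) is that, for subsequent time node j, both ranges cover exactly the historical time nodes whose accumulated value at node j is already known. The sketch below encodes that reading; the function name `statistics_ranges` and the ragged-list layout are assumptions made for illustration, not part of the application.

```python
def statistics_ranges(cumulative, j):
    """Cells of the first and second data statistics ranges for subsequent node j (j >= 1).

    `cumulative` is a ragged list: row i holds the known accumulated values of
    historical node T(i+1), so newer historical nodes have shorter rows.
    """
    rows = [i for i, row in enumerate(cumulative) if len(row) > j]
    first = [f"L{i + 1}{j}" for i in rows]       # column j
    second = [f"L{i + 1}{j - 1}" for i in rows]  # column j-1, same rows
    return first, second


# Shape of the known triangle in fig. 4(b); the values are placeholders.
triangle = [[0.0] * 4, [0.0] * 3, [0.0] * 2, [0.0] * 1]
for j in (1, 2, 3):
    print(j, *statistics_ranges(triangle, j))
```

Because both ranges are built from the same set of rows, they always have the same size, as the text requires.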

Step S302, obtaining first accumulated user retention data statistical information and second accumulated user retention data statistical information based on the accumulated user retention data corresponding to the historical time nodes in the first data statistical range and the second data statistical range respectively;

Specifically, as shown in fig. 4(b), for the subsequent time node t1, the server 120 may compute statistics over the accumulated user retention data in the column range occupied by L11, L21 and L31 and in the column range occupied by L10, L20 and L30 to obtain the first and second accumulated user retention data statistical information for the subsequent time node t1. Similarly, the server 120 may obtain the first and second accumulated user retention data statistical information for the subsequent time nodes t2 and t3.

In some embodiments, for the subsequent time node t1, the server 120 may sum L11, L21 and L31 to obtain the first accumulated user retention data statistical information, and sum L10, L20 and L30 to obtain the second accumulated user retention data statistical information; the corresponding sums for the subsequent time nodes t2 and t3 may be calculated in the same way. That is, for each subsequent time node, the first accumulated user retention data statistical information may include the sum of the accumulated user retention data of the historical time nodes in the corresponding first data statistical range, and the second accumulated user retention data statistical information may include the sum of the accumulated user retention data of the historical time nodes in the corresponding second data statistical range.

Step S303, obtaining a variation trend according to the first accumulated user retention data statistical information and the second accumulated user retention data statistical information.

In this step, for each subsequent time node, the server 120 may compare the first and second accumulated user retention data statistical information to obtain the change trend of the accumulated user retention data for that subsequent time node.

Further, the step S303 may specifically include: the server 120 obtains the change trend according to the ratio between the sum of the accumulated user retention data corresponding to each historical time node in the first data statistical range and the sum of the accumulated user retention data corresponding to each historical time node in the second data statistical range.

That is, for each subsequent time node, the first accumulated user retention data statistical information obtained by the server 120 includes the sum of the accumulated user retention data of the historical time nodes in the corresponding first data statistical range (referred to as the first cumulative sum), and the second accumulated user retention data statistical information includes the sum of the accumulated user retention data of the historical time nodes in the corresponding second data statistical range (referred to as the second cumulative sum). The server 120 can then calculate the ratio of the first cumulative sum to the second cumulative sum (referred to as the cumulative sum ratio). As shown in fig. 4(b), the server 120 can obtain the cumulative sum ratios F01, F12 and F23 for the subsequent time nodes t1 to t3, respectively, which reflect the change trend of the accumulated user retention data between the subsequent time nodes t0 and t1, between t1 and t2, and between t2 and t3. Based on these change trends, the server 120 can accurately and reliably calculate the accumulated user retention data of each historical time node at the predicted time node.
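The cumulative sum ratios F01, F12 and F23 can be computed as in the following sketch. The function name `trend_ratios` and the numeric values in `triangle` are hypothetical; only the ratio-of-column-sums rule is taken from the text.

```python
def trend_ratios(cumulative):
    """Cumulative sum ratios [F01, F12, ...] from a ragged triangle of accumulated data."""
    n_nodes = max(len(row) for row in cumulative)
    ratios = []
    for j in range(1, n_nodes):
        # Only historical nodes whose value at subsequent node j is known take part.
        rows = [row for row in cumulative if len(row) > j]
        first_sum = sum(row[j] for row in rows)       # first data statistics range
        second_sum = sum(row[j - 1] for row in rows)  # second data statistics range
        ratios.append(first_sum / second_sum)
    return ratios


# Hypothetical accumulated retention rates (percent) for T1..T4 at t0..t3.
triangle = [
    [100.0, 160.0, 200.0, 230.0],
    [100.0, 150.0, 190.0],
    [100.0, 155.0],
    [100.0],
]
f01, f12, f23 = trend_ratios(triangle)
print(round(f01, 4), round(f12, 4), round(f23, 4))  # 1.55 1.2581 1.15
```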

Further, in some embodiments, the server 120 may calculate the predicted user retention data corresponding to each historical time node at the predicted time node by the following method, where the step S204 specifically includes:

The server 120 obtains the predicted accumulated user retention data of the historical time node at the predicted time node as the product of the ratio representing the change trend and the accumulated user retention data of the historical time node at the time node preceding the predicted time node; the server 120 then subtracts the accumulated user retention data of the historical time node at the preceding time node from the predicted accumulated user retention data to obtain the predicted user retention data.

Referring to fig. 4(b), without loss of generality, suppose the server 120 predicts the accumulated user retention rate of the historical time node T4 at the subsequent time node t1; for the historical time node T4, the subsequent time node t1 is then the predicted time node. Likewise, the server 120 may predict the accumulated user retention rate of the historical time node T3 at the subsequent time node t2; for the historical time node T3, the subsequent time node t2 is the predicted time node. Specifically, referring to fig. 4(b) and 4(c), after the server 120 obtains the change trend of the accumulated user retention data between the subsequent time nodes t0 and t1, represented by the cumulative sum ratio F01, the server 120 may multiply F01 by L40, the accumulated user retention data of the historical time node T4 at the time node preceding the predicted time node (i.e., the subsequent time node t0). The resulting product is the predicted accumulated user retention data L41 of the historical time node T4 at the predicted time node t1.

By analogy, the server 120 may calculate the elements that fill all the blank boxes in fig. 4(b), i.e., may obtain the predicted accumulated user retention data corresponding to each historical time node at the predicted time node (L23-L43). On this basis, the server 120 may restore the predicted accumulated user retention data shown in fig. 4(c) and the originally known accumulated user retention data to the corresponding predicted user retention data (i.e., X23-X43) and the originally known user retention data (i.e., X10-X40) shown in fig. 4(d) by successive subtraction. For example, to calculate the predicted user retention data X41, the server 120 may subtract L40, the accumulated user retention data of the historical time node T4 at the preceding time node t0, from the predicted accumulated user retention data L41, resulting in X41. By adopting the scheme provided by this embodiment, the server 120 can accurately and reliably calculate the predicted user retention data corresponding to each historical time node at one or more predicted time nodes.
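Filling the blank cells and restoring retention rates by subtraction can be sketched end to end as below. This is a hedged illustration under the same ragged-triangle layout assumed earlier; `fill_and_restore` and the sample values are hypothetical, while the two rules it applies (multiply by the cumulative sum ratio, then difference adjacent cumulative values) follow the text.

```python
def fill_and_restore(cumulative):
    """Fill the blank cells of the accumulated triangle, then difference back to retention."""
    n_nodes = max(len(row) for row in cumulative)
    # Cumulative sum ratios, using only rows where both columns are known.
    ratios = []
    for j in range(1, n_nodes):
        rows = [row for row in cumulative if len(row) > j]
        ratios.append(sum(r[j] for r in rows) / sum(r[j - 1] for r in rows))
    filled = []
    for row in cumulative:
        full = list(row)
        for j in range(len(row), n_nodes):
            full.append(ratios[j - 1] * full[j - 1])  # e.g. L41 = F01 * L40
        filled.append(full)
    # X[i][0] = L[i][0]; X[i][j] = L[i][j] - L[i][j-1], e.g. X41 = L41 - L40.
    retention = [
        [full[0]] + [full[j] - full[j - 1] for j in range(1, n_nodes)]
        for full in filled
    ]
    return filled, retention


# Hypothetical accumulated retention rates (percent) for T1..T4 at t0..t3.
triangle = [
    [100.0, 160.0, 200.0, 230.0],
    [100.0, 150.0, 190.0],
    [100.0, 155.0],
    [100.0],
]
filled, retention = fill_and_restore(triangle)
print([round(x, 2) for x in retention[3]])  # [100.0, 55.0, 40.0, 29.25]
```

The last row corresponds to the historical time node T4, whose values at t1 to t3 are all predicted; rows for older historical nodes keep their known values and only have their trailing cells filled in.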

After obtaining the predicted user retention rate of each historical time node at the predicted time node, the server 120 may use it to calculate the number of active users at the predicted time node. In a specific service scenario, the number of active users at a future time, for example on each future day, mainly consists of the retained historical new users and the retained old users. For the first class of users, i.e., new users, the server 120 may use the method described in the above embodiments to estimate the retention rate of the historical new users on each future day, week, month, or year; that is, the server 120 may estimate the predicted user retention rate for the first class of users by the above method. Further, the server 120 may obtain the number of active users at the predicted time node for the second class of users, i.e., old users, and on this basis estimate the number of active users at one or more predicted time nodes. Specifically, in some embodiments, as shown in fig. 5, the method may further include the following steps:

Step S501, the server 120 determines the number of first-class users corresponding to each historical time node;

In this step, the server 120 may obtain the number of new users corresponding to each historical time node, for example, the number of new users added in each historical month.

Step S502, obtaining a first active user number corresponding to the first class user at the prediction time node according to the first class user number of each historical time node and the corresponding prediction user retention rate;

Taking each day as a historical time node as an example, in this step the server 120 may multiply the number of new users of each historical day by the corresponding predicted user retention rate at the predicted time node, obtaining the number of active users that the new users correspond to at the predicted time node, which serves as the first active user number. In a specific application, the server 120 may, for example, estimate the number of new users retained through December 31, 2021 according to the number of new users at each historical time node and the corresponding predicted user retention rate.
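The multiplication in step S502 can be illustrated with a minimal sketch. The function name `first_active_users` is hypothetical, and summing the per-node products into a single count is an assumption about how the contributions of the individual historical time nodes are combined.

```python
from typing import Sequence


def first_active_users(
    new_users: Sequence[float],
    predicted_retention: Sequence[float],
) -> float:
    """Estimate the first active user number at one predicted time node.

    new_users[i]           -- new users added at historical time node i
    predicted_retention[i] -- predicted retention rate of node i at the
                              predicted time node (a fraction, e.g. 0.25)
    """
    if len(new_users) != len(predicted_retention):
        raise ValueError("one retention rate is required per historical node")
    # Assumption: contributions of all historical nodes are simply added up.
    return sum(n * r for n, r in zip(new_users, predicted_retention))
```

For instance, two historical days with 1000 and 2000 new users and predicted retention rates of 0.1 and 0.25 would contribute 100 and 500 retained new users respectively.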

Step S503, the server 120 obtains a second active user number corresponding to the predicted time node for the second type of user;

In addition to the number of active users corresponding to new users, in order to predict the number of active users at the predicted time node more reasonably and accurately, the server 120 further obtains the second active user number that the second type of users, i.e., old users, correspond to at the predicted time node. In the example of the information application, an old user may be a user who registered for or used the information application before the historical time nodes.

In some embodiments, step S503 may include: the server 120 determines a second class user base number corresponding to a selected time node before the predicted time node for the second class user; the server 120 obtains the user number decay history trend for the second type of user, and obtains a second active user number based on the user number decay history trend and the base number of the second type of user.

In this embodiment, the server 120 may calculate, starting from the selected time node and according to the user number decay history trend of the second type of users, the number of second-type users at the predicted time node, and use the calculated number as the second active user number. The user number decay history trend refers to the decay trend that the number of second-type users, i.e., old users, had before the selected time node. For example, the server 120 may predict how many old users from 2020 are retained through December 31, 2021, in which case each day in 2021 may be regarded as a predicted time node. The server 120 may select the number of active users on December 31, 2020 (i.e., the selected time node) as the aforementioned second-type user base number. Once the user number decay history trend is obtained, the server 120 may decay the second-type user base number day by day according to that trend, that is, calculate the number of retained old users on each day of 2021, and thereby obtain the number of old users retained on December 31, 2021.
For the calculation of the user number decay history trend, the server 120 may first determine, from the number of active users on December 31, 2020, the percentage accounted for by old users from before that year, such as 52%, and may also obtain, from the number of active users on December 31, 2020, the percentage accounted for by new users, such as 98%. From these, the server 120 may calculate that the percentage of old users remaining on December 31, 2020 is 46%. This percentage is apportioned over each day of 2020, so that the daily decay rate of old users in 2020, i.e., the user number decay history trend, is about 99.87%. After the user number decay history trend is obtained, the server 120 may calculate, starting from the number of active users on December 31, 2020, the number of old users on each day of 2021 as the second active user number. In this way, by applying the decay pattern that old users exhibited in, for example, the previous year to, for example, the next year, the server 120 can accurately predict the number of active old users on each day.
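The day-by-day decay of the old-user base number can be sketched as follows. The function name `old_user_counts` is hypothetical, and applying the trend as a constant multiplicative factor per day (for example 0.9987, taken from the figure quoted above) is an assumption; the text does not spell out the exact update rule.

```python
from typing import List


def old_user_counts(base_number: float, daily_factor: float, days: int) -> List[float]:
    """Decay the old-user base number once per day.

    base_number  -- active users at the selected time node (e.g. Dec 31, 2020)
    daily_factor -- fraction of old users kept from one day to the next
                    (assumed constant, e.g. 0.9987)
    days         -- number of predicted time nodes (e.g. 365)
    """
    counts: List[float] = []
    current = base_number
    for _ in range(days):
        current *= daily_factor  # one day of decay
        counts.append(current)
    return counts
```

Calling `old_user_counts(base, 0.9987, 365)` would give one second active user number per day of the following year, with the last element being the estimate for the final day.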

Step S504, the server 120 obtains the number of active users corresponding to the predicted time node according to the first active user number and the second active user number.

In this step, the server 120 may add the first active user number and the second active user number to obtain an active user number corresponding to the predicted time node, for example, the server 120 may sum the first active user number and the second active user number each day in 2021 to obtain an active user number each day.

According to the scheme provided by this embodiment of the application, the server 120 can obtain the number of daily active users (DAU) for each day of the coming year based on the predicted numbers of retained new users and old users on each of those days.

In one embodiment, a user retention prediction processing method is provided, which is performed by the server 120, as shown in fig. 6, and may include the following steps:

Step S601, the server 120 obtains the user retention rate of each historical time node at each subsequent time node;

Step S602, the server 120 obtains, according to the user retention rate of each historical time node at each subsequent time node, an accumulated user retention rate corresponding to each historical time node at each subsequent time node;

Step S603, the server 120 accumulates, for each historical time node, the user retention rates of the respective subsequent time nodes in the forward direction according to the time sequence, so as to obtain the accumulated user retention rates of the respective historical time nodes corresponding to the respective subsequent time nodes;

Step S604, the server 120 determines, for each subsequent time node, a data statistical range about the accumulated user retention rate that is adapted to each subsequent time node;

The data statistical range comprises a first data statistical range for each subsequent time node and a second data statistical range for the previous subsequent time node of each subsequent time node, and the first data statistical range and the second data statistical range are the same in size.

Step S605, the server 120 obtains a variation trend about the accumulated user retention rate presented between adjacent time nodes according to a ratio between a sum of the accumulated user retention rates of the historical time nodes in the first data statistical range and a sum of the accumulated user retention rates of the historical time nodes in the second data statistical range;

Step S606, the server 120 obtains the predicted accumulated user retention rate corresponding to each historical time node at the predicted time node according to the product of the ratio and the accumulated user retention rate of each historical time node at the previous time node of the predicted time node;

Step S607, the server 120 takes the difference between each predicted accumulated user retention rate and the accumulated user retention rate that the corresponding historical time node has at the previous time node, to obtain the predicted user retention rate of each historical time node at the predicted time node;

Step S608, the server 120 obtains a first active user number corresponding to the new users at the predicted time node according to the new user number of each historical time node and the corresponding predicted user retention rate, and obtains a second active user number corresponding to the old users at the predicted time node according to the user number decay history trend of the old users and the base number of the old users;

Step S609, the server 120 sums the first active user number and the second active user number to obtain the number of active users corresponding to the predicted time node.
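Steps S601 to S607 can be sketched end to end as follows. This is an illustrative reading, not a definitive implementation: the function names are hypothetical, unknown values are represented by `None`, each row is assumed to have its known values first, and the data statistical range for a column is assumed to be the set of historical time nodes that have an observed value in that column (the same rows are used for the previous column, so both ranges have equal size).

```python
from itertools import accumulate
from typing import List, Optional

Row = List[Optional[float]]


def cumulate(rates: List[Row]) -> List[Row]:
    """S602/S603: forward-accumulate the known retention rates of each row."""
    out: List[Row] = []
    for row in rates:
        known = [x for x in row if x is not None]  # assumed to be a prefix
        out.append(list(accumulate(known)) + [None] * (len(row) - len(known)))
    return out


def fill_predictions(cum: List[Row]) -> List[Row]:
    """S604-S606: fill unknown accumulated rates column by column."""
    filled = [row[:] for row in cum]
    for j in range(1, len(cum[0])):
        # Assumed statistical range: rows observed at column j.
        rows = [i for i, row in enumerate(cum) if row[j] is not None]
        if not rows:
            continue  # no history for this column, leave it unfilled
        ratio = sum(cum[i][j] for i in rows) / sum(cum[i][j - 1] for i in rows)
        for row in filled:
            if row[j] is None and row[j - 1] is not None:
                row[j] = ratio * row[j - 1]  # rolling prediction
    return filled


def decumulate(filled: List[Row]) -> List[Row]:
    """S607: recover per-node rates by differencing adjacent columns."""
    return [
        [None if v is None else (v if j == 0 else v - row[j - 1])
         for j, v in enumerate(row)]
        for row in filled
    ]


rates = [[0.5, 0.3, 0.2], [0.4, 0.2, None], [0.6, None, None]]
predicted = decumulate(fill_predictions(cumulate(rates)))
```

Here each row is one historical time node and each column one subsequent time node; `predicted` keeps the known rates unchanged and replaces every `None` that could be reached by the rolling product with a predicted rate.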

According to the embodiment of the application, the server 120 can perform rolling estimation of the new-user retention rate over a multi-year horizon, continuously incorporating historical retention rate data during estimation, so that the retention rate for the following 365 days can be estimated for each day with high stability and accuracy. For old-user retention, the server 120 can predict the retention of old users in the coming year according to the retention pattern of old users in the previous year. Combining the two, the server 120 can finally estimate the number of daily active users (DAU) for each day of the coming year. Internet service or application service providers can therefore predict the DAU of each day in the coming year according to their actual conditions and make corresponding service resource supply strategies.

It should be understood that, although the steps in the above flowcharts are shown in sequence as indicated by the arrows, the steps are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not strictly limited to the order shown and described, and may be performed in other orders. Moreover, at least a part of the steps in the above flowcharts may include a plurality of sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 7, a user retention prediction processing apparatus is provided, which may be a part of a computer device using a software module or a hardware module, or a combination of the two, wherein the apparatus 700 specifically includes:

a retained data obtaining module 701, configured to obtain user retained data that each historical time node has at each subsequent time node;

an accumulated data obtaining module 702, configured to obtain, according to the user retention data that each historical time node has at each subsequent time node, accumulated user retention data that each historical time node corresponds to at each subsequent time node;

a variation trend obtaining module 703, configured to obtain, based on the accumulated user retention data corresponding to each historical time node at each subsequent time node, a variation trend of the accumulated user retention data presented between adjacent time nodes;

and a retention data prediction module 704, configured to obtain the predicted user retention data corresponding to the historical time node at the prediction time node according to the change trend and the accumulated user retention data that the historical time node has at the previous time node of the prediction time node.

In an embodiment, the accumulated data obtaining module 702 is configured to, for each historical time node, accumulate the user retention data that is possessed by the respective subsequent time node in a forward direction according to a time sequence to obtain the accumulated user retention data that corresponds to the respective subsequent time node for the respective historical time node.

In one embodiment, the variation trend obtaining module 703 is configured to determine, for each subsequent time node, a data statistical range of the accumulated user retention data adapted to that subsequent time node, wherein the data statistical range comprises a first data statistical range for each subsequent time node and a second data statistical range for the previous subsequent time node of each subsequent time node, and the first data statistical range and the second data statistical range are the same in size; obtain first accumulated user retention data statistical information and second accumulated user retention data statistical information based on the accumulated user retention data corresponding to the historical time nodes in the first data statistical range and the second data statistical range, respectively; and acquire the change trend according to the first accumulated user retention data statistical information and the second accumulated user retention data statistical information.

In one embodiment, the first accumulated user retention data statistical information includes a sum of the accumulated user retention data corresponding to the historical time nodes in the first data statistical range; the second accumulated user retention data statistical information includes a sum of the accumulated user retention data corresponding to the historical time nodes in the second data statistical range; and the variation trend obtaining module 703 is configured to obtain the change trend according to a ratio between the sum of the accumulated user retention data corresponding to the historical time nodes in the first data statistical range and the sum of the accumulated user retention data corresponding to the historical time nodes in the second data statistical range.

In one embodiment, the retention data prediction module 704 is configured to obtain the predicted accumulated user retention data corresponding to the historical time node at the prediction time node according to a product of the ratio representing the variation trend and the accumulated user retention data that the historical time node has at a previous time node of the prediction time node; subtracting the predicted accumulated user retention data from the accumulated user retention data of the historical time node at the previous time node to obtain the predicted user retention data.

In one embodiment, the predicted user retention data comprises a predicted user retention rate for a first class of users; the apparatus 700 may further include: the active user number prediction module is used for determining a first type of user number corresponding to each historical time node; obtaining a first active user number corresponding to the first class user at the prediction time node according to the first class user number of each historical time node and the corresponding prediction user retention rate; acquiring a second active user number corresponding to the predicted time node of the second type of users; and obtaining the number of the active users corresponding to the predicted time node according to the first number of the active users and the second number of the active users.

In one embodiment, the active user number prediction module is configured to determine a second type user base number corresponding to a selected time node before the predicted time node for the second type user; acquiring a user number attenuation historical trend aiming at the second type of users; the user number decay history trend is a decay trend that the number of the second type of users has before the selected time node; and obtaining the second active user number based on the user number attenuation historical trend and the second type of user basic number.

For specific limitations of the user retention prediction processing apparatus, reference may be made to the above limitations of the user retention prediction processing method, which are not repeated here. The modules in the user retention prediction processing apparatus described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.

In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 8. The computer device 800 includes a processor 820, a memory, and a network interface 840 connected by a system bus 810. The processor 820 of the computer device 800 is configured to provide computing and control capabilities. The memory of the computer device 800 includes a nonvolatile storage medium 8310 and an internal memory 8320. The nonvolatile storage medium 8310 stores an operating system 8311, a computer program 8312, and a database 8313. The internal memory 8320 provides an environment for the operation of the operating system 8311 and the computer program 8312 in the nonvolatile storage medium 8310. The database 8313 of the computer device 800 can be used to store data such as user retention data, accumulated user retention data, change trends, predicted user retention data, and numbers of active users. The network interface 840 of the computer device 800 is used for communicating with an external terminal through a network connection. The computer program 8312, when executed by the processor 820, implements a user retention prediction processing method.

Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.

In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.

In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.

In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.

The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A user retention prediction processing method, wherein the method comprises:

Obtaining the user retention data that each historical time node has at its respective subsequent time nodes;

Obtaining the accumulated user retention data corresponding to each historical time node at the respective subsequent time nodes according to the user retention data that each historical time node has at the respective subsequent time nodes;

Acquiring, based on the accumulated user retention data corresponding to each historical time node at the respective subsequent time nodes, the change trend of the accumulated user retention data presented between adjacent time nodes;

Obtaining, according to the change trend and the accumulated user retention data of the historical time node at the time node preceding the prediction time node, the predicted user retention data corresponding to the historical time node at the prediction time node.

2. The method according to claim 1, wherein the obtaining, according to the user retention data that each historical time node has at the respective subsequent time nodes, the accumulated user retention data corresponding to each historical time node at the respective subsequent time nodes comprises:

For each historical time node, the user retained data possessed at the respective subsequent time nodes are accumulated in a forward direction according to the time sequence, to obtain the accumulated user retained data corresponding to the respective subsequent time nodes of the respective historical time nodes.

3. The method according to claim 1, wherein the acquiring, based on the accumulated user retention data corresponding to each historical time node at the respective subsequent time nodes, the change trend of the accumulated user retention data presented between adjacent time nodes comprises:

For each subsequent time node, determining a data statistical range of the accumulated user retention data adapted to that subsequent time node; wherein the data statistical range includes a first data statistical range for each subsequent time node and a second data statistical range for the previous subsequent time node of each subsequent time node; the first data statistical range and the second data statistical range are the same in size;

Obtaining first accumulated user retention data statistical information and second accumulated user retention data statistical information based on the accumulated user retention data corresponding to the historical time nodes in the first data statistical range and the second data statistical range, respectively;

The change trend is acquired according to the first accumulated user retention data statistics information and the second accumulated user retention data statistics information.

4. The method according to claim 3, wherein the first accumulated user retention data statistical information comprises the sum of the accumulated user retention data corresponding to each historical time node in the first data statistical range; the second accumulated user retention data statistical information comprises the sum of the accumulated user retention data corresponding to each historical time node in the second data statistical range; and the acquiring the change trend according to the first accumulated user retention data statistical information and the second accumulated user retention data statistical information comprises:

Obtaining the change trend according to the ratio between the sum of the accumulated user retention data corresponding to each historical time node in the first data statistical range and the sum of the accumulated user retention data corresponding to each historical time node in the second data statistical range.

5. The method according to claim 4, wherein the obtaining, according to the change trend and the accumulated user retention data that the historical time node has at the time node preceding the prediction time node, the predicted user retention data corresponding to the historical time node at the prediction time node comprises:

Obtaining, according to the product of the ratio representing the change trend and the accumulated user retention data of the historical time node at the time node preceding the prediction time node, the predicted accumulated user retention data corresponding to the historical time node at the prediction time node;

The predicted accumulated user retention data is subtracted from the accumulated user retention data possessed by the historical time node at the previous time node to obtain the predicted user retention data.

6. The method according to any one of claims 1 to 5, wherein the predicted user retention data comprises a predicted user retention rate for a first type of users; and after the obtaining the predicted user retention data corresponding to the historical time node at the prediction time node, the method further comprises:

determining the number of users of the first type corresponding to each of the historical time nodes;

According to the number of users of the first type at each historical time node and the corresponding predicted user retention rate, obtain the number of first active users corresponding to the first type of users at the predicted time node;

Obtain the number of second active users corresponding to the second type of users at the predicted time node;

According to the first number of active users and the second number of active users, the number of active users corresponding to the predicted time node is obtained.

7. The method according to claim 6, wherein the acquiring the number of second active users corresponding to the second type of users at the predicted time node comprises:

determining the base number of the second type of users corresponding to a selected time node before the predicted time node of the second type of user;

Acquiring a historical trend of user number decline for the second type of users; the user number decline historical trend is the decline trend of the number of users of the second type before the selected time node;

The second number of active users is obtained based on the declining historical trend of the number of users and the base number of the second type of users.

8. A user retention prediction processing device, wherein the device comprises:

The retained data acquisition module is used to acquire the user retained data owned by each historical time node at their respective subsequent time nodes;

A cumulative data acquisition module, configured to obtain the cumulative user retained data corresponding to the respective historical time nodes at the respective subsequent time nodes according to the user retained data possessed by the respective historical time nodes at the respective subsequent time nodes;

a change trend acquisition module, configured to obtain the change trend of the accumulated user retention data presented between adjacent time nodes based on the accumulated user retention data corresponding to the respective historical time nodes at respective subsequent time nodes;

A retention data prediction module, configured to obtain the predicted user retention data corresponding to the historical time node at the prediction time node according to the change trend and the accumulated user retention data of the historical time node at the time node preceding the prediction time node.

9. A computer device, comprising a memory and a processor, wherein the memory stores a computer program, and the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium storing a computer program, wherein when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 7 are implemented.