
CN116166873A - User portrait generation method and device, electronic equipment and storage medium - Google Patents

User portrait generation method and device, electronic equipment and storage medium

Info

Publication number
CN116166873A
Authority
CN
China
Prior art keywords
data
user
target
information
user data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210974291.9A
Other languages
Chinese (zh)
Inventor
方平
吴鹏
吴海英
罗展松
胡伟
宋瑞鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Consumer Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Consumer Finance Co Ltd
Priority to CN202210974291.9A
Publication of CN116166873A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a method and an apparatus for generating user portrait data, an electronic device, and a storage medium. The method includes: acquiring, in a streaming manner, target user data to be processed, where the target user data includes user data generated in real time by a business application; acquiring pre-configured tag information and crowd package information, where the tag information includes information on at least one tag, each tag represents one user feature of a user, the crowd package information includes information on at least one crowd package, and each crowd package includes multiple different tags; profiling the target user data according to the tag information and the crowd package information to obtain user portrait data; and, in the process of profiling the target user data, acquiring the running status of the back-end nodes of a database cluster and performing operation and maintenance management on the back-end nodes according to the running status. According to embodiments of the present disclosure, user data generated in real time by business applications can be profiled promptly, which increases the speed at which user portrait data is generated.

Description

User portrait generation method and device, electronic device, storage medium

Technical Field

The present disclosure relates to the field of computer technology, and in particular to a user portrait generation method and device, an electronic device, and a storage medium.

Background

A user portrait (persona), also called a user role, is an effective tool for outlining target users and for connecting user demands with design direction. User portraits are widely used in many fields, and they make it possible to provide users with a variety of convenient services.

At present, user portraits are generally produced by batch-acquiring, at preset time intervals, the online user data generated by online business applications over a period of time together with offline user data uploaded offline, and then analyzing and processing the batch to obtain user portrait data.

Current user portrait generation methods may therefore suffer from processing delay.

Summary

The present disclosure provides a user portrait generation method and device, an electronic device, and a storage medium.

In a first aspect, the present disclosure provides a user portrait generation method applied to an electronic device on which a database cluster is deployed, the method including:

acquiring, in a streaming manner, target user data to be processed, wherein the target user data includes user data generated by a business application in real time;

acquiring pre-configured tag information and crowd package information, wherein the tag information includes information on at least one tag, each tag represents one user feature of a user, the crowd package information includes information on at least one crowd package, and each crowd package includes multiple different tags;

profiling the target user data according to the tag information and the crowd package information to obtain user portrait data; and

in the process of profiling the target user data, acquiring the running status of the back-end nodes of the database cluster, and performing operation and maintenance management on the back-end nodes according to the running status.

In a second aspect, the present disclosure provides a user portrait generation device applied to an electronic device on which a database cluster is deployed, the device including:

a user data acquisition unit, configured to acquire, in a streaming manner, target user data to be processed, wherein the target user data includes user data generated by a business application in real time;

a configuration information acquisition unit, configured to acquire pre-configured tag information and crowd package information, wherein the tag information includes information on at least one tag, each tag represents one user feature of a user, the crowd package information includes information on at least one crowd package, and each crowd package includes multiple different tags;

a profiling unit, configured to profile the target user data according to the tag information and the crowd package information to obtain user portrait data; and

a guard unit, configured to acquire, in the process of profiling the target user data, the running status of the back-end nodes of the database cluster, and to perform operation and maintenance management on the back-end nodes according to the running status.

In a third aspect, the present disclosure provides an electronic device including: at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores one or more computer programs executable by the at least one processor, and the one or more computer programs, when executed by the at least one processor, enable the at least one processor to perform the above user portrait generation method.

In a fourth aspect, the present disclosure provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the above user portrait generation method.

In the embodiments provided by the present disclosure, with the user's authorization, target user data to be processed is acquired in a streaming manner, and pre-configured tag information and crowd package information are acquired. Because the tag information includes information on at least one tag that represents a user feature, and the crowd package information includes information on at least one crowd package, each composed of multiple different tags, profiling the streamed target user data on the basis of the tag information and the crowd package information produces user portrait data promptly. Meanwhile, in the process of profiling the target user data, the running status of the back-end nodes of the database cluster deployed on the electronic device is acquired, and the back-end nodes are operated and maintained according to that status. This helps keep the back-end nodes running stably and reliably and thereby safeguards the healthy operation of the profiling process. Because stream processing is continuous and has no intervals, acquiring the target user data by stream processing reduces the processing delay of user profiling. Likewise, operating and maintaining the back-end nodes according to their running status helps avoid back-end node downtime, so that the electronic device can promptly provide real-time user portrait data for users to view.

It should be understood that what is described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become easy to understand from the following description.

Brief Description of the Drawings

The accompanying drawings provide a further understanding of the present disclosure and constitute a part of the specification. Together with the embodiments of the present disclosure, they serve to explain the present disclosure and do not limit it. The above and other features and advantages will become more apparent to those skilled in the art from the description of detailed example embodiments with reference to the accompanying drawings, in which:

FIG. 1 is a flowchart of a user portrait generation method provided by an embodiment of the present disclosure;

FIG. 2 is a flowchart of acquiring target user data provided by an embodiment of the present disclosure;

FIG. 3 is a flowchart of acquiring a target profiling rule provided by an embodiment of the present disclosure;

FIG. 4 is a first framework processing diagram for detecting back-end nodes provided by an embodiment of the present disclosure;

FIG. 5 is a second framework processing diagram for detecting back-end nodes provided by an embodiment of the present disclosure;

FIG. 6 is a block diagram of a user portrait generation device provided by an embodiment of the present disclosure;

FIG. 7 is a block diagram of an electronic device provided by an embodiment of the present disclosure.

Detailed Description

To help those skilled in the art better understand the technical solutions of the present disclosure, exemplary embodiments are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.

Where no conflict arises, the embodiments of the present disclosure and the features in the embodiments may be combined with one another.

As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.

The terminology used herein is for describing particular embodiments only and is not intended to limit the present disclosure. As used herein, the singular forms "a" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the terms "comprising" and/or "made of", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Words such as "connected" or "coupled" are not limited to physical or mechanical connections and may include electrical connections, whether direct or indirect.

It should be noted that all acts of acquiring signals, information, or data in this application are carried out in compliance with the applicable data protection laws and policies of the country concerned and with authorization from the owner of the corresponding device.

Unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art and the present disclosure, and should not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

In the related art, when an electronic device performs user profiling, it generally first acquires the tag and crowd package information used for profiling, together with the configuration information of a data source, all of which the user configures through a front-end configuration interface. The data source may be a database that stores online data generated by business applications or offline data uploaded by users. The electronic device then acquires the most recently generated user data to be processed from the data source at a preset time interval, for example at 00:00 every day, and performs batch user portrait analysis on that data to obtain user portrait data. Because this method relies on a batch processing mechanism, the user portrait data it produces is often delayed and does not reflect the latest activity and demands of users in the business application. As a result, other systems that depend on user portrait data, such as membership, marketing, and telemarketing systems, may be unable to respond to user needs in time.

To solve the above problem, an embodiment of the present disclosure provides a user portrait generation method. Refer to FIG. 1, which is a flowchart of the method. The method can be applied to an electronic device on which a database cluster (such as a StarRocks cluster) is deployed. The electronic device may be a server, either physical or virtual. As technology advances, the electronic device may also be a terminal device; that is, the method may be applied on a terminal device alone, for example on an edge terminal device in an edge computing scenario. No special limitation is imposed here.

As shown in FIG. 1, the user portrait generation method provided by the embodiment of the present disclosure includes the following steps S101 to S104, which are described in detail below.

Step S101: acquire, in a streaming manner, target user data to be processed, wherein the target user data includes user data generated by a business application in real time.

The target user data may be user data that is generated in real time by a business application, with the user's authorization, and that is to be profiled. The target user data may also be user data that is generated offline and uploaded to an online database, for example a Hive data warehouse. In addition, in the embodiments of the present disclosure, the user may be an individual user or an enterprise user; that is, the user data may concern an individual or an enterprise, in other words a particular customer. No special limitation is imposed here.

For example, in an e-commerce application, the target user data may be at least one of the following kinds of user data generated in real time by the business application: user attribute data, browsing data, click data, order data, and product follow data.

As another example, in a financial application, the target user data may be at least one of the following kinds of data generated in real time by the business application: user attribute data, user loan data, and user repayment data.

In the embodiments of the present disclosure, the business application that generates the user data may be one or more applications in any application scenario. No special limitation is imposed here.

To address the processing delay that may arise in the related art when user data is acquired in batches for profiling, in the embodiments of the present disclosure the target user data to be processed may be acquired in a streaming manner. That is, the user data generated in real time by the business application is acquired by stream processing and used as the target user data, so that it is profiled promptly and the latest user portrait data is generated. The latest user portrait data can then be displayed or supplied to other business systems without delay, allowing the operator of the business application to respond to user needs in time.

Stream processing is a process in which a data producer writes the data records it generates into an ordered data stream, and a data consumer continuously acquires the records from that stream in the same order for use or processing. Stream processing usually has no predetermined start or end; instead, a series of event nodes respond in real time to the records generated by the producer. In the embodiments of the present disclosure, the data producer may be, for example, the business application that generates the user data, and the data consumer may be, for example, a user portrait application in the electronic device that executes the method.
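The producer and consumer roles described above can be illustrated with a minimal sketch. It is an assumption-laden stand-in rather than the disclosed implementation: an in-process `queue.Queue` plays the ordered data stream, the record fields (`user_id`, `event`) are hypothetical, and a sentinel is added only so that the demonstration terminates, whereas a real stream has no predetermined end.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker, used only so the demo stops


def producer(stream, records):
    """Stand-in for a business application writing records in order."""
    for record in records:
        stream.put(record)
    stream.put(SENTINEL)


def consumer(stream, handled):
    """Stand-in for the user portrait application: handle records as they arrive."""
    while True:
        record = stream.get()
        if record is SENTINEL:
            break
        handled.append(record)  # a real consumer would profile the record here


# Hypothetical user events; the field names are illustrative only.
events = [
    {"user_id": "u1", "event": "click"},
    {"user_id": "u2", "event": "order"},
    {"user_id": "u1", "event": "repay"},
]

stream = queue.Queue()  # FIFO, so records are consumed in the order written
handled = []
worker = threading.Thread(target=consumer, args=(stream, handled))
worker.start()
producer(stream, events)
worker.join()
```

The consumer starts before any record exists and handles each record as soon as it is written, which is the property that distinguishes this from pulling a batch at a fixed time of day.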

Step S102: acquire pre-configured tag information and crowd package information, wherein the tag information includes information on at least one tag, each tag represents one user feature of a user, the crowd package information includes information on at least one crowd package, and each crowd package includes multiple different tags.

Tag information is information on at least one tag used for user profiling, and each tag may correspond to one user feature of a user. For example, the tag information may be information on tags such as "gender", "age", and "hobbies". A tag is an abstract classification and generalization of some feature of a particular group or object, and tag values can usually be divided into categories.

For example, for a "natural person", features such as "male" and "female" can be abstracted into the tag "gender".

A tag value is the specific content contained in a tag. Tag values follow the MECE (Mutually Exclusive, Collectively Exhaustive) principle: they are independent of one another and together cover every case.

For example, for the tag "gender", the corresponding tag values can be divided into "male", "female", and "unknown".
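As a small illustration of the MECE property, the sketch below maps any raw attribute to exactly one value of the "gender" tag. The raw encodings (`"m"`, `"f"`, and so on) are assumptions made for the example; the source does not specify how the underlying attribute is stored.

```python
GENDER_VALUES = ("male", "female", "unknown")

# Hypothetical raw encodings; real source data would define its own.
_RAW_TO_VALUE = {"m": "male", "male": "male", "f": "female", "female": "female"}


def gender_tag_value(raw):
    """Return exactly one value of the "gender" tag for any input.

    Mutually exclusive: a function yields a single value per input.
    Collectively exhaustive: anything unrecognised falls back to "unknown".
    """
    if isinstance(raw, str):
        return _RAW_TO_VALUE.get(raw.strip().lower(), "unknown")
    return "unknown"
```

The "unknown" value is what makes the set exhaustive: without it, records with a missing or unexpected attribute would have no tag value at all.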

Crowd package information is information on at least one crowd package used for user profiling. Crowd packages are generally used to classify users. For example, the crowd packages for individual users may include a "student crowd package", a "beauty crowd package", an "e-commerce crowd package", and so on.

In the embodiments of the present disclosure, the tag information and crowd package information may be obtained in advance through configuration by the user in a front-end interface. Alternatively, the electronic device may generate them automatically. That is, in some embodiments, the tag information and the crowd package information may be obtained through the following steps: performing cluster analysis on the acquired target user data to obtain at least one cluster; performing feature extraction on the target user data in each cluster to obtain at least one user feature corresponding to each cluster; obtaining at least one tag from the user features corresponding to each cluster, and obtaining at least one crowd package from the at least one tag; and obtaining the tag information and the crowd package information from the at least one tag and the at least one crowd package.

In other words, in an actual implementation, after the target user data is obtained, the electronic device may perform real-time cluster analysis on it to obtain at least one cluster and extract one or more user features from the user data in each cluster. Semantic analysis of the extracted user features then yields the corresponding tags, and classifying those tags yields at least one crowd package. Finally, the tag information and crowd package information are generated automatically from the resulting tags and crowd packages.

Any cluster analysis algorithm may be used to cluster the target user data, for example the k-means, CLARA, or CLARANS algorithm; no special limitation is imposed here. The user features of each cluster may be extracted by a pre-trained feature extraction model, which may be, for example, a convolutional neural network (CNN) or another network model; the model structure and training method are not described further here. In addition, after the electronic device automatically generates the tag information and crowd package information through the above steps, it may first display them to the user for review, and then profile the target user data on the basis of the tag information and crowd package information as supplemented and confirmed by the user. No special limitation is imposed here.
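The clustering step can be sketched with k-means, one of the algorithms named above. This is a minimal pure-Python version of Lloyd's algorithm with fixed initial centroids so that the result is reproducible. The two numeric features per user (orders per month and average order value) are hypothetical, and the later steps of feature extraction and semantic analysis are not shown.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical features per user: (orders per month, average order value).
points = [(1, 20), (2, 25), (1, 22), (9, 200), (10, 210), (8, 190)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
```

Each resulting cluster would then be passed to the feature extraction step; in practice the features would usually be scaled first, since unscaled k-means lets the larger-valued feature dominate the distance.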

Step S103: profile the target user data according to the tag information and the crowd package information to obtain user portrait data.

After the target user data to be processed has been acquired in a streaming manner in step S101, and the pre-configured tag information and crowd package information have been acquired in step S102, the electronic device can profile the target user data on the basis of the tag information and the crowd package information to obtain the latest user portrait data. For example, the latest user portrait data may be generated by filtering and matching the target user data against the tag information and crowd package information. The embodiments of the present disclosure impose no special limitation on the specific profiling processing.
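One way to read "filtering and matching" is sketched below: derive a value for every configured tag from a record, then list the crowd packages whose required tag values all match. The rule table, package definitions, and record fields are all hypothetical, since the source leaves the concrete profiling logic open.

```python
# Hypothetical tag rules: tag name -> function deriving the tag value from a record.
TAG_RULES = {
    "gender": lambda r: r.get("gender", "unknown"),
    "age_band": lambda r: "18-25" if 18 <= r.get("age", -1) <= 25 else "other",
    "has_order": lambda r: "yes" if r.get("orders", 0) > 0 else "no",
}

# Hypothetical crowd packages: package name -> required tag values.
# Each package combines multiple different tags, as the text describes.
CROWD_PACKAGES = {
    "young_shoppers": {"age_band": "18-25", "has_order": "yes"},
    "female_shoppers": {"gender": "female", "has_order": "yes"},
}


def profile(record):
    """Return the portrait of one user record: its tags and matching packages."""
    tags = {name: rule(record) for name, rule in TAG_RULES.items()}
    packages = sorted(
        name
        for name, required in CROWD_PACKAGES.items()
        if all(tags.get(tag) == value for tag, value in required.items())
    )
    return {"user_id": record["user_id"], "tags": tags, "crowd_packages": packages}


portrait = profile({"user_id": "u1", "gender": "female", "age": 22, "orders": 3})
```

Because `profile` works on a single record, it can be called on each record as it arrives from the stream rather than on a nightly batch.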

步骤S104,在对目标用户数据进行画像的过程中,获取数据库集群的后端节点的运行状态,并根据该运行状态对该后端节点进行运维管理。Step S104, in the process of profiling target user data, acquire the running status of the back-end node of the database cluster, and perform operation and maintenance management on the back-end node according to the running status.

在本公开实施例中,该数据库集群,可以为Starrocks集群,其中,Starrocks是一种全场景大规模并行处理(MPP,Massively Parallel Processing)数据库,其致力于构建极速统一分析体验,支持多种数据模型,例如,明细模型、聚合模型和更新模型等,并支持多种数据导入方式,基于Starrocks构建集群并用于数据分析,可以大幅提升处理速度;当然,该数据库集群也可以为Starrocks集群以外的其他数据库集群,例如,也可以为doris集群;在本公开实施例中,如无特殊说明,以该数据库集群为Starrocks集群为例进行说明。In the embodiment of the present disclosure, the database cluster may be a Starrocks cluster, wherein Starrocks is a full-scenario massively parallel processing (MPP, Massively Parallel Processing) database, which is dedicated to building an extremely fast unified analysis experience and supports a variety of data Models, such as detailed models, aggregation models, and update models, etc., and support multiple data import methods. Building a cluster based on Starrocks and using it for data analysis can greatly improve the processing speed; of course, the database cluster can also be other than the Starrocks cluster. The database cluster, for example, may also be a doris cluster; in the embodiments of the present disclosure, unless otherwise specified, the database cluster is taken as an example for illustration.

A back-end node (BE, BackEnd) in the Starrocks cluster may crash unexpectedly while performing user profiling, or may perform poorly because its resource parameters are configured inappropriately. Without automation, a user has to monitor the state of the back-end nodes continuously and manually restart a node or tune its performance whenever a crash or poor performance is found. This requires manual operation, which is inconvenient. Moreover, if a crashed back-end node is not restarted or tuned in time, the electronic device may be unable to generate user portrait data in time. Therefore, in the embodiments of the present disclosure, while the electronic device profiles the acquired target user data on the deployed Starrocks cluster, it can automatically detect the running state of the back-end nodes and perform operation and maintenance management on them according to that state, so as to keep the back-end nodes running healthily.

As the above description shows, the user portrait generation method provided by the embodiments of the present disclosure acquires the target user data in a streaming manner. Because stream processing is continuous and has no interval between batches, acquiring and profiling the target user data in this way reduces the processing delay of user profiling. In addition, performing operation and maintenance management on the back-end nodes based on their running state keeps them running stably and reliably and avoids downtime, so that the electronic device can provide real-time user portrait data for users to view in a timely manner.

Refer to FIG. 2, which is a flowchart of acquiring target user data provided by an embodiment of the present disclosure. As shown in FIG. 2, acquiring the target user data to be processed in a streaming manner in step S101 may include the following steps S201 to S204.

Step S201: detect data change information of user data in a target database, where the target database is used to store user data generated by a business application.

The target database may be, for example, a MySQL database or another relational database, and is not specifically limited here.

In this implementation, the target database may be monitored by a component such as the Canal component or the flink-cdc component to obtain the data change information. The Canal component, developed in Java, parses the incremental logs of a database and provides incremental data subscription and consumption. The flink-cdc component detects and captures change information in a database, such as the insertion, update, and deletion of data or data tables, and records these changes in the order in which they occur.

Detecting the data change information of the user data in the target database may be: obtaining the data change information by detecting changes in a preset log file corresponding to the target database.

The preset log file corresponding to the target database is a log file that records the data change information of the target database. For example, when the target database is a MySQL database, the preset log file may be the binlog file of the MySQL database.

Step S202: obtain, according to the data change information, the data records that have changed in the target database.

Step S203: write the changed data records to a target message queue.

The target message queue buffers the changed data records obtained from the data change information, so that the electronic device can consume the changed data records in a streaming manner and process them one by one. The target message queue may be, for example, a Kafka message queue.

Step S204: obtain the target user data in a streaming manner from the changed data records in the target message queue.
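Steps S201 to S204 can be illustrated with a minimal sketch. It is not the disclosed implementation: an in-memory list stands in for the change log that a component such as Canal or flink-cdc would read, Python's `queue.Queue` stands in for the target message queue, and the function names, record layout, and the `heartbeat` entry are hypothetical.

```python
import queue
from typing import Dict, Iterable, Iterator, List

# Operation types treated as data changes (assumed record layout).
CHANGE_OPS = {"insert", "update", "delete"}


def extract_changed_records(change_log: Iterable[Dict]) -> Iterator[Dict]:
    """S201-S202: keep only the log entries that describe a data change."""
    for entry in change_log:
        if entry["op"] in CHANGE_OPS:
            yield {"op": entry["op"], "row": entry["row"]}


def publish(records: Iterable[Dict], mq: "queue.Queue") -> None:
    """S203: write each changed data record to the target message queue."""
    for record in records:
        mq.put(record)


def consume(mq: "queue.Queue") -> Iterator[Dict]:
    """S204: consume the queue one record at a time until it is empty."""
    while True:
        try:
            yield mq.get_nowait()
        except queue.Empty:
            return


# Hypothetical change log; the "heartbeat" entry is not a data change.
change_log: List[Dict] = [
    {"op": "insert", "row": {"user_id": 1, "age": 30}},
    {"op": "heartbeat", "row": None},
    {"op": "update", "row": {"user_id": 1, "age": 31}},
]

mq: "queue.Queue" = queue.Queue()
publish(extract_changed_records(change_log), mq)
target_user_data = list(consume(mq))
```

In a real deployment the consumer would keep running and block on the queue rather than stop when it is empty; the sketch drains the queue once so that it terminates.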

That is, the related art acquires the user data to be processed in batches, which can delay user profiling. To address this, the embodiments of the present disclosure use a component such as the Canal component or the flink-cdc component to detect incremental data changes in the target database. After the business application generates user data in real time and writes it to the target database, the incremental changes are detected to obtain the changed data records, that is, the changed user data, and these records are written to the target message queue. During profiling, the target user data can then be obtained by consuming the changed data records from the target message queue in real time and in a streaming manner.

In addition, to further increase the speed of generating user portraits, in this implementation, obtaining the target user data in a streaming manner from the changed data records in the target message queue may be: importing the changed data records in the target message queue into the corresponding data table of the database cluster by means of streaming data import (Routine Load), so as to obtain the target user data.

In the embodiments of the present disclosure, profiling the target user data according to the tag information and the crowd package information to obtain the user portrait data, as described in step S103, may be: obtaining a target portrait rule according to the tag information and the crowd package information, where the target portrait rule is a Structured Query Language (SQL) statement that the back-end nodes of the database cluster can execute; and having the back-end nodes of the database cluster profile, according to the target portrait rule, the target user data imported into the data table of the Starrocks cluster, so as to obtain the user portrait data.

The user portrait data may include personal portrait data and customer group portrait data. The personal portrait data may be, for example, portrait data for individual users, and the customer group portrait data may be, for example, portrait data for enterprise customers.

Refer to FIG. 3, which is a flowchart of obtaining the target portrait rule provided by an embodiment of the present disclosure. As shown in FIG. 3, in the embodiments of the present disclosure, the target portrait rule may be obtained through the following steps. Step S301: generate a rule condition corresponding to the tag information and the crowd package information, where the rule condition is used to filter the user data. Step S302: convert the rule condition based on a preset rule of the database cluster to obtain the target portrait rule.

That is, in the embodiments of the present disclosure, after the tag information and the crowd package information are provided to the electronic device, the electronic device may first organize them into a rule condition in JSON (JavaScript Object Notation) form. Then, so that a database cluster such as the Starrocks cluster can execute it directly, the JSON rule condition is converted into an SQL statement in advance, which speeds up profiling.
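The conversion in steps S301 and S302 can be sketched as follows. The disclosure does not specify the JSON layout or the conversion rules, so the `logic`/`conditions` structure, the field names, the table name `user_data`, and the function `rule_to_sql` are all assumptions made for illustration.

```python
import json

# Comparison operators the sketch accepts (assumed; not from the disclosure).
ALLOWED_OPS = {"=", "!=", ">", ">=", "<", "<="}


def quote(value) -> str:
    """Render a Python value as an SQL literal (numbers bare, strings quoted)."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def rule_to_sql(rule_json: str, table: str) -> str:
    """Convert a JSON rule condition into a SELECT statement."""
    rule = json.loads(rule_json)
    clauses = []
    for cond in rule["conditions"]:
        if cond["op"] not in ALLOWED_OPS:
            raise ValueError(f"unsupported operator: {cond['op']}")
        if not cond["field"].isidentifier():
            raise ValueError(f"invalid field name: {cond['field']}")
        clauses.append(f"{cond['field']} {cond['op']} {quote(cond['value'])}")
    joiner = " AND " if rule.get("logic", "and") == "and" else " OR "
    return f"SELECT user_id FROM {table} WHERE {joiner.join(clauses)}"


# Hypothetical rule condition built from two tags.
rule_json = json.dumps({
    "logic": "and",
    "conditions": [
        {"field": "age", "op": ">=", "value": 18},
        {"field": "city", "op": "=", "value": "Chongqing"},
    ],
})

sql = rule_to_sql(rule_json, "user_data")
```

Validating operators and field names before building the statement keeps a malformed rule from producing arbitrary SQL; production code would more likely use parameterized queries.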

In the embodiments of the present disclosure, acquiring the running state of the back-end nodes of the database cluster and performing operation and maintenance management on them according to the running state, as described in step S104, may include: detecting the running state of the back-end nodes of the database cluster at a preset time interval; and restarting a back-end node when it is determined to be running abnormally.

In this implementation, a daemon tool, for example the supervisor daemon implemented in Python, may be used to monitor each back-end node in the Starrocks cluster and to restart any back-end node whose running state is detected to be abnormal. This keeps the back-end nodes running stably and reliably and avoids delays in generating user portrait data caused by back-end node downtime.
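The detect-and-restart behavior can be sketched in a few lines. This is not how supervisor is implemented or configured; the health check and restart actions are passed in as hypothetical callables, and a dictionary simulates node state so the example runs on its own.

```python
from typing import Callable, Dict, List


def check_and_restart(
    nodes: List[str],
    is_healthy: Callable[[str], bool],
    restart: Callable[[str], None],
) -> List[str]:
    """Run one detection pass and restart every node found unhealthy.

    Returns the names of the nodes that were restarted.
    """
    restarted = []
    for node in nodes:
        if not is_healthy(node):
            restart(node)
            restarted.append(node)
    return restarted


# Simulated node state: BE2 is down (hypothetical data).
status: Dict[str, bool] = {"BE1": True, "BE2": False, "BE3": True}


def is_healthy(node: str) -> bool:
    return status[node]


def restart(node: str) -> None:
    status[node] = True  # a real restart would launch the BE process


restarted = check_and_restart(["BE1", "BE2", "BE3"], is_healthy, restart)
```

To match the preset time interval in the text, a caller would invoke `check_and_restart` in a loop with a sleep between passes.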

For ease of understanding, refer to FIG. 4, which is a first framework diagram for monitoring back-end nodes provided by an embodiment of the present disclosure. As shown in FIG. 4, for the back-end nodes BE1, BE2, and BE3 in the Starrocks cluster, state detection of each back-end node can be implemented by configuring the supervisor.conf configuration files of the daemons supervisor1, supervisor2, and supervisor3, respectively. This ensures that each back-end node is started again promptly when an anomaly such as a crash occurs. In FIG. 4, FE denotes the front-end node of the Starrocks cluster, and Broker denotes the process in the Starrocks cluster that performs Broker import. The method of setting up supervisor is described in detail in the related art and is not repeated here.

It should be noted that the embodiment shown in FIG. 4 does not use supervisor to monitor the front-end node or the Broker process of the Starrocks cluster. In an actual implementation, these two processes may also be monitored to further improve system stability; this is not specifically limited here.

It should also be noted that, in the embodiments of the present disclosure, acquiring the running state of the back-end nodes of the database cluster and performing operation and maintenance management on them according to the running state, as described in step S104, may alternatively be: detecting the running state of the back-end nodes of the database cluster at a preset time interval; when a back-end node is determined to be running abnormally, acquiring the operating metrics of the database cluster; generating an anomaly analysis report according to the operating metrics; and sending the anomaly analysis report to a target terminal device. The anomaly analysis report is used by a business user to optimize the performance of the database cluster, the business user corresponds to the database cluster, and the target terminal device includes a terminal device used by the business user.

In the embodiments of the present disclosure, the operating metrics of a database cluster such as the Starrocks cluster may include: central processing unit (CPU) usage, memory usage information, the runtime environment (for example, the JDK environment), the run logs of each node of the database cluster, network environment information, storage read/write information, the number of calls per second, and whether slow SQL statements exist.

That is, when a back-end node becomes abnormal, for example when it crashes, the cause is usually an anomaly or improper setting in its task resources, runtime environment, or internal running state. Therefore, when a back-end node of the Starrocks cluster becomes abnormal, the method can also, during user profiling, acquire the operating metrics of the Starrocks cluster, generate an anomaly analysis report from them, and push the report to the terminal devices, such as mobile phones and tablet computers, used by the business users corresponding to the Starrocks cluster, for example DBAs, operation and maintenance personnel, developers, or the cluster vendor. This helps those business users troubleshoot the problem and optimize the performance of the Starrocks cluster promptly, improving cluster stability.

For ease of understanding, refer to FIG. 5, which is a second framework diagram for monitoring back-end nodes provided by an embodiment of the present disclosure. As shown in FIG. 5, for any back-end node in the Starrocks cluster, for example BE1 in FIG. 5, when the node is detected to have crashed, the supervisor daemon restarts it and, at the same time, a script for acquiring the operating metrics of the Starrocks cluster is run, for example auto_check.sh in FIG. 5. An anomaly analysis report is generated from the operating metrics and sent by email to the business users shown in FIG. 5. These business users can analyze the cause based on the report and optimize the performance of the Starrocks cluster, improving its stability and maintaining its processing speed during user profiling.
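As a rough illustration of turning operating metrics into an anomaly analysis report, the sketch below formats a metrics dictionary as plain text. The contents of auto_check.sh and the report format are not given in the disclosure, so the function `build_report`, the metric names, and the layout are hypothetical; sending the report by email is omitted.

```python
from typing import Dict


def build_report(node: str, metrics: Dict[str, object]) -> str:
    """Format the collected operating metrics as a plain-text report."""
    lines = [f"Anomaly analysis report for {node}"]
    for name in sorted(metrics):  # sorted for a stable, readable order
        lines.append(f"- {name}: {metrics[name]}")
    return "\n".join(lines)


# Hypothetical metrics collected after BE1 was found down.
report = build_report(
    "BE1",
    {"cpu_usage": 0.97, "memory_usage": 0.88, "slow_sql_found": True},
)
```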

In addition, in some embodiments, acquiring the running state of the back-end nodes of the database cluster and performing operation and maintenance management on them according to the running state, as described in step S104, may alternatively be: detecting the running state of the back-end nodes at a preset time interval; when a back-end node is determined to be running abnormally, acquiring the operating metrics of the database cluster and the operating resource parameter information of the database cluster, where the operating resource parameter information includes information on the memory resources and CPU resources preconfigured for the back-end nodes of the database cluster; and, when it is determined from the operating metrics that the operating resource parameter information satisfies a preset condition, adjusting the operating resource parameters of the database cluster so as to optimize its performance.

The operating metrics may include, for example, CPU usage and memory usage information. The preset condition may be that the CPU usage is greater than or equal to a first preset threshold within a preset duration, and/or that the memory usage information indicates a memory usage within the preset duration greater than or equal to a second preset threshold, where the first and second preset thresholds can be set as needed. That is, in this implementation, when an anomaly is detected in a database cluster such as the Starrocks cluster, it can be determined whether the CPU usage, memory usage, and so on have remained high for a long time. If so, the CPU resources (for example, the number of CPU cores) and memory resources currently configured for the database cluster may be insufficient, and allocating more CPU and memory resources can be considered. This is only an example; in an actual implementation, the operating resource parameters of the database cluster may also be adjusted based on a preconfigured resource adjustment policy, which is not specifically limited here.
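The preset condition can be sketched as a check over usage samples collected during the preset duration. Interpreting "within the preset duration" as "every sample in the window meets the threshold" is an assumption, as are the function names and the sample values.

```python
from typing import Sequence


def sustained_above(samples: Sequence[float], threshold: float) -> bool:
    """True if the window is non-empty and every sample meets the threshold."""
    return bool(samples) and all(s >= threshold for s in samples)


def needs_more_resources(
    cpu_samples: Sequence[float],
    mem_samples: Sequence[float],
    cpu_threshold: float,
    mem_threshold: float,
) -> bool:
    """Apply the and/or condition: either metric staying high is enough."""
    return sustained_above(cpu_samples, cpu_threshold) or sustained_above(
        mem_samples, mem_threshold
    )


# Hypothetical usage ratios sampled over the preset duration.
cpu_high = [0.95, 0.97, 0.92]
cpu_spiky = [0.95, 0.50, 0.93]
mem_normal = [0.40, 0.50, 0.45]

adjust = needs_more_resources(cpu_high, mem_normal, 0.9, 0.9)
no_adjust = needs_more_resources(cpu_spiky, mem_normal, 0.9, 0.9)
```

Requiring every sample to meet the threshold keeps a single short spike from triggering a resource adjustment; an average over the window would be a reasonable alternative.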

It should be noted that the one or more embodiments described above for performing operation and maintenance management on the back-end nodes of the database cluster according to their running state may be used alone or in combination, or together with other operation and maintenance management approaches; this is not specifically limited here.

After the user portrait data is obtained by the above method, to make it easier for users to view, the method in the embodiments of the present disclosure further includes: generating target display data corresponding to the user portrait data, where the target display data includes image data and/or table data; and displaying the target display data.

That is, the obtained user portrait data may be no more than a set of numbers and therefore hard to read. To make it easier for users to view, after the user portrait data is obtained, a visual analysis component such as FineBI may be used to analyze it visually and produce more readable target display data, composed of image data, table data, and the like, thereby improving the user experience.

In summary, the method provided by the embodiments of the present disclosure acquires the target user data to be processed in a streaming manner and profiles it based on the acquired tag information and crowd package information. User data generated by business applications in real time can thus be profiled promptly, which improves the timeliness of the user portrait data, so that users or business systems that rely on it to make business decisions or perform business processing can respond to user needs in time. In addition, profiling on a new-generation database cluster such as the Starrocks cluster further increases the speed at which user portrait data is generated. Detecting the running state of the back-end nodes in the database cluster while the user portrait data is being obtained, and performing operation and maintenance management on those nodes, helps ensure that the user portrait data is generated quickly. Finally, after the user portrait data is obtained, generating target display data that includes image data and/or table data for users to view improves the user experience.

It can be understood that the method embodiments mentioned in the present disclosure may be combined with one another to form combined embodiments, provided that doing so does not violate their principles and logic; for brevity, the combinations are not described further. Those skilled in the art will understand that, in the above methods of the specific implementations, the specific order in which the steps are executed should be determined by their functions and possible internal logic.

In addition, the present disclosure further provides a user portrait generation apparatus, an electronic device, and a computer-readable storage medium, each of which can be used to implement any of the user portrait generation methods provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding passages in the method section; they are not repeated here.

FIG. 6 is a block diagram of a user portrait generation apparatus provided by an embodiment of the present disclosure.

Referring to FIG. 6, an embodiment of the present disclosure provides a user portrait generation apparatus, which includes a user data acquisition unit 601, a configuration information acquisition unit 602, a portrait unit 603, and a guard unit 604.

The user data acquisition unit 601 is configured to acquire the target user data to be processed in a streaming manner, where the target user data includes user data generated by a business application in real time.

In some embodiments, when acquiring the target user data to be processed in a streaming manner, the user data acquisition unit 601 may be configured to: detect data change information of user data in a target database, where the target database is used to store user data generated by the business application; obtain, according to the data change information, the data records that have changed in the target database; write the changed data records to a target message queue; and obtain the target user data in a streaming manner from the changed data records in the target message queue.

In some embodiments, when detecting the data change information of the user data in the target database, the user data acquisition unit 601 may be configured to: obtain the data change information by detecting changes in a preset log file corresponding to the target database.

In some embodiments, when obtaining the target user data in a streaming manner from the changed data records in the target message queue, the user data acquisition unit 601 may be configured to: import the changed data records in the target message queue into the corresponding data table of the database cluster by means of streaming data import, so as to obtain the target user data.

The configuration information acquisition unit 602 is configured to acquire preconfigured tag information and crowd package information, where the tag information includes information on at least one tag, each tag represents one user characteristic of a user, the crowd package information includes information on at least one crowd package, and each crowd package includes multiple different tags.

The portrait unit 603 is configured to profile the target user data according to the tag information and the crowd package information to obtain user portrait data.

The guard unit 604 is configured to acquire, while the target user data is being profiled, the running state of the back-end nodes of the database cluster, and to perform operation and maintenance management on the back-end nodes according to the running state.

In some embodiments, when profiling the target user data according to the tag information and the crowd package information to obtain the user portrait data, the portrait unit 603 may be configured to: obtain a target portrait rule according to the tag information and the crowd package information, where the target portrait rule is a Structured Query Language statement that the back-end nodes of the database cluster can execute; and have the back-end nodes of the database cluster profile, according to the target portrait rule, the target user data imported into the data table, so as to obtain the user portrait data, where the user portrait data includes personal portrait data and/or customer group portrait data.

In some embodiments, when obtaining the target portrait rule according to the tag information and the crowd package information, the portrait unit 603 may be configured to: generate a rule condition corresponding to the tag information and the crowd package information, where the rule condition is used to filter the user data; and convert the rule condition based on a preset rule of the database cluster to obtain the target portrait rule.

In some embodiments, when acquiring the running state of the back-end nodes of the database cluster and performing operation and maintenance management on them according to the running state, the guard unit 604 may be configured to: detect the running state of the back-end nodes of the database cluster at a preset time interval; and restart a back-end node when it is determined to be running abnormally.

In some embodiments, when acquiring the running state of the back-end nodes of the database cluster and performing operation and maintenance management on them according to the running state, the guard unit 604 may be configured to: detect the running state of the back-end nodes of the database cluster at a preset time interval; when a back-end node is determined from the running state to be running abnormally, acquire the operating metrics of the database cluster; generate an anomaly analysis report according to the operating metrics; and send the anomaly analysis report to a target terminal device. The anomaly analysis report is used by a business user to optimize the performance of the database cluster, the business user corresponds to the database cluster, and the target terminal device includes a terminal device used by the business user.

In some embodiments, the user portrait generation apparatus 600 further includes a display unit configured to: after the user portrait data is obtained, generate target display data corresponding to the user portrait data, where the target display data includes image data and/or table data; and display the target display data.

FIG. 7 is a block diagram of an electronic device provided by an embodiment of the present disclosure.

Referring to FIG. 7, an embodiment of the present disclosure provides an electronic device, which includes: at least one processor 701; at least one memory 702; and one or more I/O interfaces 703 connected between the processor 701 and the memory 702. The memory 702 stores one or more computer programs executable by the at least one processor 701, and the one or more computer programs are executed by the at least one processor 701 so that the at least one processor 701 can perform the user portrait generation method described above.

An embodiment of the present disclosure further provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements the user portrait generation method described above. The computer-readable storage medium may be a volatile or non-volatile computer-readable storage medium.

An embodiment of the present disclosure further provides a computer program product including computer-readable code, or a non-volatile computer-readable storage medium carrying computer-readable code; when the computer-readable code runs in a processor of an electronic device, the processor in the electronic device executes the user portrait generation method described above.

Those of ordinary skill in the art will understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and apparatuses, may be implemented as software, firmware, hardware, or suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit, such as an application-specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).

As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable program instructions, data structures, program modules, or other data). Computer storage media include, but are not limited to, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM), static random access memory (SRAM), flash memory or other memory technologies, portable compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically contain computer-readable program instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to respective computing/processing devices, or downloaded to an external computer or external storage device over a network such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. A network adapter card or network interface in each computing/processing device receives the computer-readable program instructions from the network and forwards them for storage in a computer-readable storage medium within the respective computing/processing device.

The computer program instructions for performing the operations of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field programmable gate array (FPGA), or a programmable logic array (PLA), is personalized using state information of the computer-readable program instructions, and the electronic circuit can execute the computer-readable program instructions to implement aspects of the present disclosure.

The computer program product described herein may be implemented in hardware, software, or a combination thereof. In one optional embodiment, the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (SDK).

Aspects of the present disclosure are described herein with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.

These computer-readable program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable user portrait generation apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable user portrait generation apparatus, create means for implementing the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions may also be stored in a computer-readable storage medium; the instructions cause a computer, a programmable user portrait generation apparatus, and/or other devices to operate in a particular manner, so that the computer-readable medium storing the instructions comprises an article of manufacture that includes instructions implementing aspects of the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The computer-readable program instructions may also be loaded onto a computer, another programmable user portrait generation apparatus, or another device, causing a series of operational steps to be performed on the computer, other programmable user portrait generation apparatus, or other device to produce a computer-implemented process, such that the instructions executed on the computer, other programmable user portrait generation apparatus, or other device implement the functions/acts specified in one or more blocks of the flowcharts and/or block diagrams.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of instructions, which contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two consecutive blocks may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or acts, or by a combination of dedicated hardware and computer instructions.

Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, those skilled in the art will understand that various changes in form and detail may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims (10)

1. A user portrait generation method, applied to an electronic device on which a database cluster is deployed, the method comprising:
acquiring, in a streaming manner, target user data to be processed, wherein the target user data comprises user data generated in real time by a business application;
acquiring preconfigured tag information and crowd pack information, wherein the tag information comprises information of at least one tag, each tag represents a user characteristic of a user, the crowd pack information comprises information of at least one crowd pack, and each crowd pack comprises a plurality of different tags;
profiling the target user data according to the tag information and the crowd pack information to obtain user portrait data; and
during the profiling of the target user data, acquiring a running state of a back-end node of the database cluster, and performing operation and maintenance management on the back-end node according to the running state.
2. The method of claim 1, wherein the acquiring, in a streaming manner, target user data to be processed comprises:
detecting data change information of user data in a target database, wherein the target database is used for storing the user data generated by the business application;
obtaining a changed data record in the target database according to the data change information;
writing the changed data record into a target message queue;
and acquiring the target user data in a streaming manner according to the changed data record in the target message queue.
3. The method of claim 2, wherein the acquiring the target user data in a streaming manner according to the changed data record in the target message queue comprises:
importing the changed data record in the target message queue into a corresponding data table of the database cluster by means of streaming data import, to obtain the target user data.
4. The method of claim 3, wherein the profiling the target user data according to the tag information and the crowd pack information to obtain user portrait data comprises:
obtaining a target portrait rule according to the tag information and the crowd pack information, wherein the target portrait rule is a Structured Query Language (SQL) statement that the back-end node of the database cluster supports executing;
and profiling, by the back-end node of the database cluster and according to the target portrait rule, the target user data imported into the data table to obtain the user portrait data, wherein the user portrait data comprises individual portrait data and/or customer group portrait data.
5. The method of claim 4, wherein the obtaining a target portrait rule according to the tag information and the crowd pack information comprises:
generating rule conditions corresponding to the tag information and the crowd pack information, wherein the rule conditions are used for filtering user data;
converting the rule conditions based on preset rules of the database cluster to obtain the target portrait rule.
6. The method of claim 1, wherein the acquiring a running state of a back-end node of the database cluster and performing operation and maintenance management on the back-end node according to the running state comprises:
detecting the running state of the back-end node of the database cluster at a preset time interval;
and restarting the back-end node when it is determined that the back-end node is running abnormally.
7. The method of claim 1, wherein the acquiring a running state of a back-end node of the database cluster and performing operation and maintenance management on the back-end node according to the running state comprises:
detecting the running state of the back-end node of the database cluster at a preset time interval;
acquiring running metrics corresponding to the database cluster when the back-end node is running abnormally;
generating an anomaly analysis report according to the running metrics;
and sending the anomaly analysis report to a target terminal device, wherein the anomaly analysis report is for a business user to use in optimizing the performance of the database cluster, the business user corresponds to the database cluster, and the target terminal device comprises a terminal device used by the business user.
8. A user portrait generation apparatus, applied to an electronic device on which a database cluster is deployed, the apparatus comprising:
a user data acquisition unit, configured to acquire, in a streaming manner, target user data to be processed, wherein the target user data comprises user data generated in real time by a business application;
a configuration information acquisition unit, configured to acquire preconfigured tag information and crowd pack information, wherein the tag information comprises information of at least one tag, each tag represents a user characteristic of a user, the crowd pack information comprises information of at least one crowd pack, and each crowd pack comprises a plurality of different tags;
a portrait unit, configured to profile the target user data according to the tag information and the crowd pack information to obtain user portrait data; and
a daemon unit, configured to acquire a running state of a back-end node of the database cluster during the profiling of the target user data, and to perform operation and maintenance management on the back-end node according to the running state.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores one or more computer programs executable by the at least one processor, to enable the at least one processor to perform the user portrait generation method of any one of claims 1-7.
10. A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the user portrait generation method of any one of claims 1-7.
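
Claims 4 and 5 describe deriving the target portrait rule, an SQL statement, from the tag information and the crowd pack information. The following is a hedged sketch of one way such a conversion could look. The tag schema (`column`, `op`, `value`), the rule that a crowd pack combines its tags with AND, and the table and column names are all assumptions for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

# Hypothetical whitelist of comparison operators a tag may use.
ALLOWED_OPS = {"=", "!=", ">", ">=", "<", "<="}


@dataclass(frozen=True)
class Tag:
    """Hypothetical tag definition: one user characteristic as a column comparison."""
    name: str
    column: str
    op: str
    value: Union[int, float, str]


def sql_literal(value: Union[int, float, str]) -> str:
    """Render a value as an SQL literal, doubling single quotes in strings."""
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + value.replace("'", "''") + "'"


def tag_condition(tag: Tag) -> str:
    """Turn one tag into a rule condition that filters user data."""
    if tag.op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {tag.op}")
    if not tag.column.isidentifier():
        raise ValueError(f"invalid column name: {tag.column}")
    return f"{tag.column} {tag.op} {sql_literal(tag.value)}"


def crowd_pack_sql(table: str, pack_tags: List[str], tags: Dict[str, Tag]) -> str:
    """Convert a crowd pack (a list of tag names) into a single SELECT statement."""
    if not pack_tags:
        raise ValueError("a crowd pack must contain at least one tag")
    if not table.isidentifier():
        raise ValueError(f"invalid table name: {table}")
    # Assumption: a user belongs to the crowd pack only if every tag condition holds.
    conditions = [tag_condition(tags[name]) for name in pack_tags]
    return f"SELECT user_id FROM {table} WHERE " + " AND ".join(conditions)


if __name__ == "__main__":
    tags = {
        "adult": Tag("adult", "age", ">=", 18),
        "city_cq": Tag("city_cq", "city", "=", "Chongqing"),
    }
    print(crowd_pack_sql("user_data", ["adult", "city_cq"], tags))
```

A production system would more likely use parameterized queries supported by the database cluster than string literals; the string form is used here only to make the generated rule visible.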
CN202210974291.9A 2022-08-15 2022-08-15 User portrait generation method and device, electronic equipment and storage medium Pending CN116166873A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210974291.9A CN116166873A (en) 2022-08-15 2022-08-15 User portrait generation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210974291.9A CN116166873A (en) 2022-08-15 2022-08-15 User portrait generation method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116166873A true CN116166873A (en) 2023-05-26

Family

ID=86413783

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210974291.9A Pending CN116166873A (en) 2022-08-15 2022-08-15 User portrait generation method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116166873A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596679A (en) * 2018-04-27 2018-09-28 中国联合网络通信集团有限公司 Construction method, device, terminal and the computer readable storage medium of user's portrait
CN109284195A (en) * 2018-08-27 2019-01-29 广东电网有限责任公司信息中心 A kind of real-time representation data calculation method and system
CN114004584A (en) * 2021-10-22 2022-02-01 国网重庆市电力公司电力科学研究院 Power information management method for building user portrait based on data middleboxes
CN114064438A (en) * 2021-11-24 2022-02-18 建信金融科技有限责任公司 Database fault processing method and device
CN114902212A (en) * 2020-01-16 2022-08-12 深圳市欢太科技有限公司 Image generation method, image generation device, server and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
姜红玉;汪朋;封雷;: "基于流式计算的实时用户画像系统研究", 计算机技术与发展, no. 07, 10 July 2020 (2020-07-10) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination