CN113094360B - Cross-industry data processing method

Info

Publication number: CN113094360B
Application number: CN202110296258.0A
Authority: CN
Inventors: 孟艳冬; 郭泽谦; 梁亚东
Original assignee: Beijing Sinobase Technology Development Co ltd
Current assignee: Beijing Sinobase Technology Development Co ltd
Priority date: 2021-03-19
Filing date: 2021-03-19
Publication date: 2023-11-10
Anticipated expiration: 2041-03-19
Also published as: CN113094360A

Abstract

A cross-industry data processing method abstracts business data into entities for storage and divides the entities into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes. The main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity. The sub-entity is logically attached to the main entity and comprises subsidiary data attached to the main entity. The behavior sub-entity is also logically attached to the main entity; it inherits from the sub-entity and extends the sub-entity with behavior characteristic information. The business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity. The invention helps reduce enterprise costs, is convenient and fast, and improves staff efficiency; isolated data storage meets business requirements while improving data security; and data lineage allows the full path of the data to be traced.

Description

Cross-industry data processing method

Technical Field

The invention relates to the technical field of business data processing, in particular to a cross-industry data processing method.

Background

A data management platform integrates scattered multi-party data into a unified technical platform and standardizes and segments the data, so that users can push the segmentation results into their existing interactive marketing environments. Current data management platforms support and define only one kind of business data management, namely contact-related business; they support customizing only the data fields and data structure of contacts; and they support only two types of data source, database connections and form data.

In the prior art, a single data management platform cannot manage several kinds of main business data, so enterprises must spend more money and resources; different main business data cannot be stored in isolation, which causes data redundancy and seriously hampers data use; data cleaning is not supported, so dirty data leads to problems such as low accuracy and poor timeliness, and the value of the data cannot be fully exploited; and when a data problem occurs in the system, the upstream and downstream of the data cannot be inspected, so the problem cannot be located quickly and the scope and severity of its impact cannot be assessed.

Disclosure of Invention

Therefore, the invention provides a cross-industry data processing method that can be instantiated for different application scenarios, solving the problem that data models differ so much between industries that data management and data analysis cannot be carried out in a unified way.

To achieve the above object, the invention provides the following technical solution: a cross-industry data processing method abstracts business data into entities for storage, the entities being divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and comprises subsidiary data attached to the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity.

In a preferred scheme of the cross-industry data processing method, the sub-entities and behavior sub-entities exist only in a logical attachment relationship with a main entity, and each sub-entity or behavior sub-entity is attached to exactly one main entity.

In a preferred scheme of the cross-industry data processing method, the main entities of different business data are associated in a one-to-many or many-to-one relationship.

In a preferred scheme of the cross-industry data processing method, the data structure and fields of each kind of business data are customized, and each customized kind of business data is stored independently, so that the business data are isolated from one another.

In a preferred scheme of the cross-industry data processing method, multiple kinds of mutually isolated business data are aggregated on demand by means of pushing and association configuration.

In a preferred scheme of the cross-industry data processing method, the business data is managed along two dimensions, function modularization and data personalization;

under function modularization, the tag management, group management, metric management or user portrait functions are freely configured for each kind of business data;

under data personalization, each kind of business data has its own tag system, groups and statistical metrics, the collected business data is deduplicated, and a dedicated user portrait is generated.

In a preferred scheme of the cross-industry data processing method, the relationship between a source of business data and its destination entity is defined as a lineage relationship, and the lineage relationships include business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, or main entity to main entity.

In a preferred scheme of the cross-industry data processing method, the flow of the business data is displayed in a data lineage analysis chart, data problems are located by means of the data lineage analysis chart, and the problematic data so located is re-extracted or re-pushed using the upstream and downstream business data.

In a preferred scheme of the cross-industry data processing method, the collected business data is cleaned, the data cleaning comprising value replacement, length truncation, UTM value extraction and MD5 aggregation.

The invention has the following advantages: the business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and comprises subsidiary data attached to the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity. The invention helps reduce enterprise costs, is convenient and fast, and improves staff efficiency; isolated data storage meets business requirements while improving data security; a wide variety of data source types can be supported; multiple lines of business can each be personalized while still being managed in a unified way; data lineage allows the full path of the data to be traced; and data cleaning improves data quality and therefore accuracy.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. It will be apparent to those of ordinary skill in the art that the drawings in the following description are exemplary only, and that other drawings can be derived from the drawings provided without inventive effort.

FIG. 1 is a schematic diagram of entity relationships in a cross-industry data processing method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of data flow between entities in a cross-industry data processing method according to an embodiment of the present invention.

Detailed Description

Other advantages and effects of the present invention will become apparent to those skilled in the art from the following detailed description. The embodiments described are some, but not all, of the embodiments of the invention. All other embodiments that those skilled in the art can obtain from these embodiments without inventive effort fall within the scope of the invention.

Referring to FIG. 1 and FIG. 2, a cross-industry data processing method is provided in which business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and comprises subsidiary data attached to the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity.

Specifically, the main entity is the main carrier for storing data and the main object of data analysis, and most applications operate on the main entity. Examples are contacts and enterprise information. The sub-entity is subsidiary data stored under the main entity, that is, data logically attached to the main entity. Examples are a contact's education history and work history. The behavior sub-entity is behavior information generated by the main entity; it is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristics such as time and behavior type. An example is a contact's purchase information. When business data enters the data management system, business entities are generated from it with a uniform structure, which ensures the security and usability of the data; the business entities are the source entities for the data of all other entities.
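As a non-limiting illustration, the four entity types described above could be modeled as follows. This is a minimal Python sketch under stated assumptions: the class names, field names and sample values are hypothetical and are chosen only to show the attachment and inheritance relationships.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any


@dataclass
class BusinessEntity:
    """Raw record as it enters the system; the source for all other entities."""
    source: str                       # hypothetical source name, e.g. "crm_export"
    payload: dict[str, Any]


@dataclass
class MainEntity:
    """Main carrier of business data and object of analysis (e.g. a contact)."""
    entity_id: str
    attributes: dict[str, Any] = field(default_factory=dict)


@dataclass
class SubEntity:
    """Subsidiary data logically attached to one main entity (e.g. education)."""
    owner_id: str                     # id of the main entity it is attached to
    attributes: dict[str, Any] = field(default_factory=dict)


@dataclass
class BehaviorSubEntity(SubEntity):
    """A sub-entity extended with behavior characteristics (time and type)."""
    occurred_at: datetime = field(default_factory=datetime.now)
    behavior_type: str = "unknown"


raw = BusinessEntity("crm_export", {"name": "Alice", "degree": "BSc"})
contact = MainEntity("c-1", {"name": raw.payload["name"]})
education = SubEntity(contact.entity_id, {"degree": raw.payload["degree"]})
purchase = BehaviorSubEntity(
    owner_id=contact.entity_id,
    attributes={"sku": "A100"},
    occurred_at=datetime(2021, 3, 19, 10, 0),
    behavior_type="purchase",
)
```

Because `BehaviorSubEntity` subclasses `SubEntity`, any code that accepts a sub-entity also accepts a behavior sub-entity, which mirrors the inheritance relationship stated in the description.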

Specifically, the sub-entities and behavior sub-entities exist only in a logical attachment relationship with a main entity, and each sub-entity or behavior sub-entity is attached to exactly one main entity. That is, a sub-entity or behavior sub-entity cannot exist on its own and cannot be attached to more than one main entity. The main entities of different business data are associated in a one-to-many or many-to-one relationship.
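The attachment rule and the association between main entities can be sketched as a small registry. This is an illustrative Python sketch only; `EntityRegistry`, its methods and the sample identifiers are hypothetical names introduced for the example.

```python
from collections import defaultdict


class EntityRegistry:
    """Tracks the single main entity that each sub-entity is attached to."""

    def __init__(self) -> None:
        self.main_ids: set[str] = set()
        self.owner_of: dict[str, str] = {}     # sub-entity id -> main entity id
        self.associations: dict[str, set[str]] = defaultdict(set)

    def add_main(self, main_id: str) -> None:
        self.main_ids.add(main_id)

    def attach(self, sub_id: str, main_id: str) -> None:
        """Attach a sub-entity or behavior sub-entity to exactly one main entity."""
        if main_id not in self.main_ids:
            raise KeyError(f"unknown main entity: {main_id}")
        current = self.owner_of.get(sub_id)
        if current is not None and current != main_id:
            raise ValueError(f"{sub_id} is already attached to {current}")
        self.owner_of[sub_id] = main_id

    def associate(self, one_id: str, many_ids: list[str]) -> None:
        """Record a one-to-many association between main entities."""
        for other in many_ids:
            if one_id not in self.main_ids or other not in self.main_ids:
                raise KeyError("both sides must be main entities")
            self.associations[one_id].add(other)


registry = EntityRegistry()
for main_id in ("company-1", "contact-1", "contact-2"):
    registry.add_main(main_id)
registry.attach("education-1", "contact-1")
registry.associate("company-1", ["contact-1", "contact-2"])
```

Attempting `registry.attach("education-1", "contact-2")` afterwards raises `ValueError`, since the sub-entity already belongs to `contact-1`.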

Specifically, the data structure and fields of each kind of business data are customized, and each customized kind of business data is stored independently, so that the business data are isolated from one another. Because each kind of business data defines its own structure and fields and has its own storage, it is equivalent to a scaled-down business system of its own, which achieves genuine data isolation.
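One way to picture isolated storage with customized fields is to give each kind of business data its own database. The following Python sketch uses the standard `sqlite3` module with in-memory databases purely for illustration; the function name, table name and field definitions are assumptions, and the description does not specify any particular storage engine.

```python
import sqlite3


def create_isolated_store(fields: dict[str, str]) -> sqlite3.Connection:
    """Create a separate in-memory database for one kind of business data.

    `fields` maps custom field names to SQL types. The names are interpolated
    into the DDL, so they must come from trusted configuration.
    """
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}" {sql_type}' for name, sql_type in fields.items())
    conn.execute(f"CREATE TABLE records (id INTEGER PRIMARY KEY, {columns})")
    return conn


# Two kinds of business data, each with its own structure and its own store.
contacts = create_isolated_store({"name": "TEXT", "email": "TEXT"})
vehicles = create_isolated_store({"plate": "TEXT", "mileage_km": "INTEGER"})

contacts.execute(
    "INSERT INTO records (name, email) VALUES (?, ?)",
    ("Alice", "alice@example.com"),
)
vehicles.execute(
    "INSERT INTO records (plate, mileage_km) VALUES (?, ?)",
    ("A12345", 42000),
)
```

Since the two connections share nothing, a query against `contacts` can never see vehicle rows, which is the isolation property the paragraph describes.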

Specifically, multiple kinds of mutually isolated business data are aggregated on demand by means of pushing and association configuration. In other words, although the business data are stored in isolation, data from several kinds of business data can still be aggregated and associated as required through pushing, association configuration and similar mechanisms.
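Aggregation through association configuration could, for example, be a keyed join between two isolated record sets. The Python sketch below is an assumption about what such a configuration might look like; the function `aggregate`, the `linked_` prefix and the key names are hypothetical.

```python
def aggregate(left, right, left_key, right_key):
    """Join two isolated record sets on a configured pair of keys."""
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    merged = []
    for row in left:
        for match in index.get(row[left_key], []):
            # Prefix the associated fields so they cannot overwrite the left row.
            linked = {f"linked_{name}": value for name, value in match.items()}
            merged.append({**row, **linked})
    return merged


contacts = [{"contact_id": 1, "name": "Alice"}, {"contact_id": 2, "name": "Bob"}]
orders = [{"order_id": 10, "buyer_id": 1}, {"order_id": 11, "buyer_id": 1}]

# Hypothetical association configuration linking the two kinds of business data.
association = {"left_key": "contact_id", "right_key": "buyer_id"}
result = aggregate(contacts, orders, **association)
```

Here only Alice has orders, so `result` holds two merged rows and Bob does not appear; the underlying record sets remain separate and unchanged.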

In the cross-industry data processing method, the business data is managed along two dimensions, function modularization and data personalization.

For each kind of business data, the tag management, group management, metric management and user portrait functions can each be enabled or disabled freely, which avoids redundant function modules. Each kind of business data has its own tag system, groups and statistical metrics; the collected business data is deduplicated and a dedicated user portrait is generated automatically, supporting and safeguarding the enterprise's precision marketing.
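The two dimensions can be illustrated with a per-business-data module switch and a simple deduplication step. This Python sketch is illustrative only: `ModuleConfig`, `deduplicate`, the choice of `email` as the deduplication key and the keep-first policy are all assumptions, since the description does not state how duplicates are resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleConfig:
    """Which optional function modules are enabled for one kind of business data."""
    tags: bool = False
    groups: bool = False
    metrics: bool = False
    portrait: bool = False


def deduplicate(records, key):
    """Keep the first record seen for each value of `key` (assumed policy)."""
    seen = {}
    for record in records:
        seen.setdefault(record[key], record)
    return list(seen.values())


# Function modularization: each kind of business data enables only what it needs.
configs = {
    "contact": ModuleConfig(tags=True, groups=True, portrait=True),
    "vehicle": ModuleConfig(metrics=True),
}

# Data personalization: deduplicate collected records before building a portrait.
collected = [
    {"email": "a@example.com", "name": "Alice"},
    {"email": "a@example.com", "name": "Alice A."},
    {"email": "b@example.com", "name": "Bob"},
]
unique = deduplicate(collected, key="email")
```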

In one embodiment of the cross-industry data processing method, the relationship between a source of business data and its destination entity is defined as a lineage relationship, and the lineage relationships include business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, or main entity to main entity.

Referring to FIG. 2, the original business data forms business entities in the system; according to the lineage relationships, each business entity is pushed to the designated main entity or entities, sub-entities and behavior sub-entities, and the attachment relationships between them are established. When several main entities are stored at the same time, the association relationships between those main entities can also be established. After the last behavior sub-entity has been stored, the data flow is pushed, according to the lineage relationships of the main entity, to the next main entity to begin the next round, and this continues until the data flow is complete.
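The round-by-round flow described above can be sketched as a queue-driven push over a list of lineage rules. This is a simplified Python illustration under stated assumptions: the rule tuples, the names in `LINEAGE` and the function `run_flow` are hypothetical, and the same record is pushed unchanged to every target, whereas a real system would map fields per target.

```python
from collections import defaultdict, deque

# Lineage rules as (source, target kind, target name); all names are hypothetical.
LINEAGE = [
    ("crm_export", "main", "contact"),        # business entity -> main entity
    ("crm_export", "sub", "education"),       # business entity -> sub-entity
    ("crm_export", "behavior", "purchase"),   # business entity -> behavior sub-entity
    ("contact", "main", "company"),           # main entity -> main entity
]


def run_flow(source, record, lineage, store):
    """Push `record` round by round until no lineage rule applies."""
    trace = []
    pending = deque([source])
    visited = set()
    while pending:
        current = pending.popleft()
        if current in visited:                # guard against cyclic configurations
            continue
        visited.add(current)
        for src, kind, target in lineage:
            if src != current:
                continue
            store[target].append(record)
            trace.append((src, kind, target))
            if kind == "main":
                pending.append(target)        # this main entity starts a later round
    return trace


store = defaultdict(list)
trace = run_flow("crm_export", {"name": "Alice"}, LINEAGE, store)
```

All targets of the first round are stored before the push from `contact` to `company` is processed, matching the order in which the description says the next round begins.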

Specifically, the flow of the business data is displayed in a data lineage analysis chart, data problems are located by means of the data lineage analysis chart, and the problematic data so located is re-extracted or re-pushed using the upstream and downstream business data.

The visual data lineage analysis chart shows clearly which table the data originates from, which fields and what data volume are received, and how the data flows. The whole picture is visible at a glance, the root cause of a problem can be located quickly, and the affected upstream and downstream data can be re-extracted or re-pushed, so that the data problem is corrected thoroughly.
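Locating the upstream and downstream of a problematic node amounts to a traversal of the lineage graph. The following Python sketch shows one possible approach using breadth-first search; the graph `EDGES`, its node names and the two helper functions are hypothetical and are not taken from the description.

```python
from collections import deque

# Directed lineage edges: upstream node -> downstream nodes (hypothetical names).
EDGES = {
    "orders_table": ["order_business_entity"],
    "order_business_entity": ["contact", "purchase"],
    "contact": ["company"],
}


def downstream(node, edges):
    """Return every node reachable from `node`, in breadth-first order."""
    found, queue = [], deque([node])
    while queue:
        for nxt in edges.get(queue.popleft(), []):
            if nxt not in found:
                found.append(nxt)
                queue.append(nxt)
    return found


def upstream(node, edges):
    """Return every node from which `node` can be reached."""
    reverse = {}
    for src, targets in edges.items():
        for target in targets:
            reverse.setdefault(target, []).append(src)
    return downstream(node, reverse)


affected = downstream("order_business_entity", EDGES)
origins = upstream("company", EDGES)
```

`affected` lists the entities that would need to be re-pushed if the business entity held bad data, and `origins` lists where to look for the root cause of a problem seen in `company`.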

In one embodiment of the cross-industry data processing method, the collected business data is cleaned, the data cleaning comprising value replacement, length truncation, UTM value extraction and MD5 aggregation. After the business data has been collected, it can be given special treatment, either normalizing it or deriving new fields; the data conversion module supports several cleaning tools for this purpose, namely value replacement, length truncation, UTM value extraction and MD5 aggregation.
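The four cleaning tools can be sketched with the Python standard library. The function names and sample values below are hypothetical. In particular, the description does not define "MD5 aggregation"; the sketch assumes it means hashing several field values into a single derived key.

```python
import hashlib
from urllib.parse import parse_qs, urlparse


def replace_value(value, mapping):
    """Value replacement: map known raw values to normalized ones."""
    return mapping.get(value, value)


def truncate(value, max_length):
    """Length truncation: keep at most `max_length` characters."""
    return value[:max_length]


def extract_utm(url):
    """UTM value extraction: collect the utm_* query parameters of a URL."""
    query = parse_qs(urlparse(url).query)
    return {name: values[0] for name, values in query.items() if name.startswith("utm_")}


def md5_aggregate(*values):
    """MD5 aggregation (assumed meaning): hash several fields into one key."""
    joined = "|".join(str(value) for value in values)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


gender = replace_value("M", {"M": "male", "F": "female"})
short_name = truncate("Beijing Sinobase", 7)
utm = extract_utm("https://example.com/?utm_source=mail&utm_medium=email&id=7")
key = md5_aggregate("alice@example.com", "13800000000")
```

Each function either normalizes an existing field (`replace_value`, `truncate`) or derives a new one (`extract_utm`, `md5_aggregate`), corresponding to the two purposes of cleaning named in the paragraph.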

In the cross-industry data processing method described above, business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and comprises subsidiary data attached to the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity. The system supports user-defined management of multiple kinds of business data: each kind of business data can define its own data types (fields or data relationships) and function modules (for example, whether tags, data ratings and the like are needed). After data has been brought in through data sources of various kinds, all main business data is managed at the same time in a single data management platform or system, while the different main business data are stored in complete isolation from one another, which improves data security. The upstream and downstream of all data loaded into the data management platform or system can be queried through the data lineage, so that a data problem can be located quickly when it occurs; if the problem is serious, the dirty data can be cleared with one click and the data then re-extracted or re-pushed.
The invention helps reduce enterprise costs, is convenient and fast, and improves staff efficiency; isolated data storage meets business requirements while improving data security; a wide variety of data source types can be supported; multiple lines of business can each be personalized while still being managed in a unified way; data lineage allows the full path of the data to be traced; and data cleaning improves data quality and therefore accuracy.

While the invention has been described in detail in the foregoing general description and specific examples, it will be apparent to those skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.

Claims

1. A cross-industry data processing method, characterized in that business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and comprises subsidiary data attached to the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity.

2. The cross-industry data processing method according to claim 1, wherein the sub-entities and behavior sub-entities exist in a logical attachment relationship with a main entity, and each sub-entity or behavior sub-entity is attached to only one main entity.

3. The cross-industry data processing method according to claim 1, wherein a one-to-many or many-to-one association exists between the main entities of different business data.

4. The cross-industry data processing method according to claim 1, wherein the data structure and fields of each kind of business data are customized, and the customized business data is stored independently to isolate the business data from one another.

5. The cross-industry data processing method according to claim 4, wherein multiple kinds of mutually isolated business data are aggregated on demand by means of pushing and association configuration.

6. The cross-industry data processing method according to claim 1, wherein the business data is subjected to management of two dimensions, namely functional modularization and data personalization;

7. The cross-industry data processing method according to claim 1, wherein the relationship between a source of business data and its destination entity is defined as a lineage relationship, and the lineage relationships include business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, or main entity to main entity.

8. The cross-industry data processing method according to claim 7, wherein the flow of the business data is displayed in a data lineage analysis chart, data problems are located by means of the data lineage analysis chart, and the problematic data so located is re-extracted or re-pushed using the upstream and downstream business data.

9. The cross-industry data processing method according to claim 1, wherein the collected business data is cleaned, the data cleaning comprising value replacement, length truncation, UTM value extraction and MD5 aggregation.