CN113094360B - Cross-industry data processing method - Google Patents
Cross-industry data processing method
- Publication number: CN113094360B
- Application number: CN202110296258.0A
- Authority: CN (China)
- Prior art keywords: entity, data, sub, business, main
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2264—Multidimensional index structures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2291—User-Defined Types; Storage management thereof
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A cross-industry data processing method abstracts business data into entities for storage and divides the entities into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes. The main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity. The sub-entity is logically attached to the main entity and contains auxiliary data stored under the main entity. The behavior sub-entity is also logically attached to the main entity; it inherits from the sub-entity and extends the sub-entity with behavior characteristic information. The business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity. The invention greatly helps reduce enterprise costs, is convenient and fast, and improves staff productivity; isolated data storage meets business requirements while improving data security; and data lineage allows the full path of the data to be traced.
Description
Technical Field
The invention relates to the technical field of business data processing, and in particular to a cross-industry data processing method.
Background
A data management platform integrates scattered multi-party data into a unified technical platform and standardizes and segments the data, so that users can push the segmentation results into their existing interactive marketing environment. Current data management platforms support defining only one kind of business data management, namely contact-related business; they support customizing only the data fields and data structure of contacts; and they support connecting only two types of data source, database sources and form data sources.
In the prior art, a single data management platform cannot accommodate the management of multiple kinds of main business data, so enterprises must spend more money and resources; different main business data cannot be stored in isolation, which causes data redundancy and seriously hampers data use; data cleaning is not supported, so dirty data leads to problems such as low accuracy and poor timeliness, and the value of the data cannot be fully exploited; and when data in the system has problems, its upstream and downstream cannot be inspected, so the problem cannot be located quickly and the scope and degree of its impact cannot be assessed.
Disclosure of Invention
Therefore, the invention provides a cross-industry data processing method that can be instantiated for different application scenarios, solving the problem that data models differ so much between industries that data management and data analysis cannot be carried out in a unified way.
To achieve the above object, the invention provides the following technical solution: a cross-industry data processing method in which business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and contains auxiliary data stored under the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; and the business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity.
In a preferred scheme of the cross-industry data processing method, the sub-entity and the behavior sub-entity each exist only in a logical attachment relationship to a main entity, and any one sub-entity or behavior sub-entity is attached to only one main entity.
In a preferred scheme of the cross-industry data processing method, a one-to-many or many-to-one association exists between the main entities of different business data.
In a preferred scheme of the cross-industry data processing method, the data structure and fields of each kind of business data are user-defined, and the user-defined business data is stored independently so that the different kinds of business data are isolated from one another.
In a preferred scheme of the cross-industry data processing method, multiple mutually isolated sets of business data are aggregated on demand through push and association configuration.
In a preferred scheme of the cross-industry data processing method, the business data is managed along two dimensions, function modularization and data personalization;
function modularization means that the tag management, grouping management, indicator management or user portrait functions are freely configured for each kind of business data;
data personalization means that each kind of business data has its own tag system, groupings and statistical indicators, that the collected business data is deduplicated, and that a dedicated user portrait is generated.
In a preferred scheme of the cross-industry data processing method, the relationship between the source of the business data and its destination entity is defined as a lineage relationship (data lineage), and the objects of a lineage relationship are business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, or main entity to main entity.
In a preferred scheme of the cross-industry data processing method, the flow of the business data is displayed in a data lineage analysis graph, data problems are located through that graph, and the problematic data so located is re-extracted or re-pushed using the upstream and downstream business data.
In a preferred scheme of the cross-industry data processing method, the collected business data is cleaned, where data cleaning includes value replacement, length truncation, UTM value extraction and MD5 aggregation.
The invention has the following advantages. Business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and contains auxiliary data stored under the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; and the business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity. The invention greatly helps reduce enterprise costs, is convenient and fast, and improves staff productivity; isolated data storage meets business requirements while improving data security; a wide variety of data source types can be supported; multiple lines of business can each be personalized while still being managed in a unified way; data lineage allows the full path of the data to be traced; and data cleaning improves data quality and therefore accuracy.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. It will be apparent to those of ordinary skill in the art that the drawings described below are exemplary only, and that other implementations can be derived from the provided drawings without inventive effort.
FIG. 1 is a schematic diagram of entity relationships in a cross-industry data processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of data flow between entities in a cross-industry data processing method according to an embodiment of the present invention.
Detailed Description
The following detailed description illustrates the invention through specific embodiments, from which other advantages and effects of the invention will be apparent to those skilled in the art. The embodiments described are some, but not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art from these embodiments without inventive effort fall within the scope of protection of the invention.
Referring to FIG. 1 and FIG. 2, a cross-industry data processing method is provided in which business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and contains auxiliary data stored under the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; and the business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity.
Specifically, the main entity is the primary carrier for storing data and the primary object of data analysis; applications operate mainly on the main entity. Examples: contacts, enterprise information. The sub-entity is auxiliary data stored under the main entity, that is, data logically attached to the main entity. Examples: a contact's education history and work experience. The behavior sub-entity is behavior information generated by the main entity; it is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristics such as time and behavior type. Example: a contact's purchase records. When business data enters the data management system, it is all first turned into business entities with a uniform structure, which safeguards the security and availability of the data; business entities are the source entities for the data of all other entities.
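The four entity types described above can be sketched as a small class hierarchy. This is only an illustrative sketch: the class names, field names and the sample record are assumptions made for the example and do not come from the patent text.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict


@dataclass
class BusinessEntity:
    """Raw business data as it enters the system; the source of all other entities."""
    source: str                      # hypothetical name of the originating data source
    payload: Dict[str, Any]


@dataclass
class MainEntity:
    """Primary carrier of business data and primary object of analysis (e.g. a contact)."""
    entity_id: str
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class SubEntity:
    """Auxiliary data logically attached to one main entity (e.g. work experience)."""
    main_id: str                     # the main entity this record is attached to
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class BehaviorSubEntity(SubEntity):
    """A sub-entity extended with behavior characteristics such as time and type."""
    behavior_type: str = ""
    occurred_at: datetime = field(default_factory=datetime.now)


# A business entity acts as the data source for the other three entity types.
raw = BusinessEntity(source="crm_export", payload={"id": "c1", "name": "Li", "bought": "book"})
contact = MainEntity(entity_id=raw.payload["id"], attributes={"name": raw.payload["name"]})
purchase = BehaviorSubEntity(
    main_id=contact.entity_id,
    attributes={"item": raw.payload["bought"]},
    behavior_type="purchase",
    occurred_at=datetime(2021, 3, 19),
)
```

Modelling the behavior sub-entity as a subclass mirrors the statement that it inherits from the sub-entity and only adds behavior characteristics on top.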
Specifically, the sub-entity and the behavior sub-entity each exist only in a logical attachment relationship to a main entity, and any one sub-entity or behavior sub-entity is attached to only one main entity. In other words, they cannot exist on their own and cannot be attached to more than one main entity. Between the main entities of different business data there is a one-to-many or many-to-one association.
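One way to picture the single-owner rule and the associations between main entities is a small registry that rejects a second owner. The `EntityRegistry` class and all identifiers below are hypothetical and serve only to illustrate the constraint.

```python
from collections import defaultdict


class EntityRegistry:
    """Tracks which main entity each sub-entity belongs to (hypothetical helper)."""

    def __init__(self):
        self.owner_of = {}                      # sub-entity id -> main entity id
        self.associations = defaultdict(set)    # main entity id -> associated main entity ids

    def attach(self, sub_id, main_id):
        """Attach a sub-entity or behavior sub-entity to only one main entity."""
        current = self.owner_of.get(sub_id)
        if current is not None and current != main_id:
            raise ValueError(f"{sub_id} is already attached to {current}")
        self.owner_of[sub_id] = main_id

    def associate(self, one_id, many_ids):
        """Record a one-to-many association between main entities."""
        self.associations[one_id].update(many_ids)


registry = EntityRegistry()
registry.attach("work_experience_1", "contact_1")
registry.associate("company_1", ["contact_1", "contact_2"])

# Attaching the same sub-entity to a second main entity is rejected.
try:
    registry.attach("work_experience_1", "contact_2")
    rejected = False
except ValueError:
    rejected = True
```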
Specifically, the data structure and fields of each kind of business data are user-defined, and the user-defined business data is stored independently so that the different kinds of business data are isolated from one another. Because each kind of business data has its own structure, fields and storage, it behaves like a scaled-down business system of its own, which achieves genuine data isolation.
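As a rough sketch of user-defined structure plus isolated storage, the example below gives each kind of business data its own field definition and its own separate in-memory SQLite database. The patent does not name a storage engine; SQLite, the schema dictionary and the sample row are assumptions chosen to keep the example self-contained.

```python
import sqlite3

# Hypothetical field definitions; each kind of business data declares its own structure.
SCHEMAS = {
    "contact": {"contact_id": "TEXT", "name": "TEXT", "email": "TEXT"},
    "enterprise": {"enterprise_id": "TEXT", "title": "TEXT", "industry": "TEXT"},
}


def create_isolated_store(name, fields):
    """Create a separate in-memory database holding one user-defined table."""
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{col}" {sql_type}' for col, sql_type in fields.items())
    conn.execute(f'CREATE TABLE "{name}" ({columns})')
    return conn


def table_names(conn):
    """List the tables visible through one connection."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


stores = {name: create_isolated_store(name, fields) for name, fields in SCHEMAS.items()}
stores["contact"].execute(
    'INSERT INTO "contact" VALUES (?, ?, ?)', ("c1", "Li", "li@example.com")
)
```

Because every connection to `:memory:` opens a distinct database, the contact store cannot see the enterprise table and vice versa, which is the isolation property the paragraph describes.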
Specifically, multiple mutually isolated sets of business data are aggregated on demand through push and association configuration. That is, although the business data is stored in isolation, data from several sets can still be associated and aggregated when required, by means such as push and association configuration.
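On-demand aggregation through an association configuration can be pictured as a keyed join between two otherwise separate datasets. The configuration format, field names and records below are hypothetical; the patent does not specify how the association is expressed.

```python
# Two isolated datasets (hypothetical sample records).
contacts = [
    {"contact_id": "c1", "name": "Li", "company_id": "e1"},
    {"contact_id": "c2", "name": "Wang", "company_id": "e2"},
]
enterprises = [
    {"enterprise_id": "e1", "industry": "retail"},
    {"enterprise_id": "e2", "industry": "finance"},
]

# Association configuration: which field on the left matches which field on the right.
association = {"left_key": "company_id", "right_key": "enterprise_id"}


def aggregate(left_rows, right_rows, config):
    """Join two isolated datasets on demand, following the configured keys."""
    index = {row[config["right_key"]]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = index.get(row[config["left_key"]])
        if match is not None:
            merged.append({**row, **match})
    return merged


combined = aggregate(contacts, enterprises, association)
```

The source datasets stay untouched; the aggregated view exists only when it is requested.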
In the cross-industry data processing method, the business data is managed along two dimensions, function modularization and data personalization;
function modularization means that the tag management, grouping management, indicator management or user portrait functions are freely configured for each kind of business data;
data personalization means that each kind of business data has its own tag system, groupings and statistical indicators, that the collected business data is deduplicated, and that a dedicated user portrait is generated.
For each kind of business data, whether the tag management, grouping management, indicator management and user portrait functions are needed can be configured freely, which avoids redundant function modules. Each kind of business data has its own tag system, groupings and statistical indicators; the collected business data is deduplicated and a dedicated user portrait is generated automatically, which supports and safeguards the enterprise's precision marketing.
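The two dimensions can be sketched as a per-dataset module switch table and a simple deduplication step that merges records into one profile per key. The module names, the merge rule (later non-empty values win) and the sample records are assumptions for illustration only.

```python
# Hypothetical per-dataset module switches: only enabled functions are loaded.
MODULES = {
    "contact": {"tags": True, "groups": True, "indicators": False, "portrait": True},
    "enterprise": {"tags": True, "groups": False, "indicators": True, "portrait": False},
}


def enabled_modules(dataset):
    """Return the function modules switched on for one kind of business data."""
    return sorted(name for name, on in MODULES[dataset].items() if on)


def deduplicate(records, key):
    """Merge records sharing the same key; later non-empty values overwrite earlier ones."""
    merged = {}
    for record in records:
        target = merged.setdefault(record[key], {})
        target.update({k: v for k, v in record.items() if v not in (None, "")})
    return list(merged.values())


collected = [
    {"email": "li@example.com", "name": "Li", "city": ""},
    {"email": "li@example.com", "name": "Li", "city": "Beijing"},
    {"email": "wang@example.com", "name": "Wang", "city": "Shanghai"},
]
profiles = deduplicate(collected, key="email")
```

Each merged record stands in for the starting point of a user portrait; a real portrait would also draw on tags, groupings and indicators.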
In one embodiment of the cross-industry data processing method, the relationship between the source of the business data and its destination entity is defined as a lineage relationship (data lineage), and the objects of a lineage relationship are business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, or main entity to main entity.
Referring to FIG. 2, the original business data forms business entities in the system; according to the lineage relationships, the business entities are pushed to the designated main entity (or entities), sub-entities and behavior sub-entities, and the attachment relationships among them are established. When several main entities are stored at the same time, the associations between those main entities can also be established. After the last behavior sub-entity has been stored, the data flow is pushed to the next main entity according to the main entity's lineage relationships and enters the next round, and so on until the data flow is complete.
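The round-by-round push described above can be sketched as follows. The rule table, entity names and field lists are hypothetical, and the sketch assumes the lineage rules contain no cycles.

```python
# Hypothetical lineage rules: (source entity, target entity, fields to copy).
LINEAGE = [
    ("order_feed", "contact", ["contact_id", "name"]),            # business -> main
    ("order_feed", "contact_address", ["contact_id", "city"]),    # business -> sub
    ("order_feed", "contact_purchase", ["contact_id", "item"]),   # business -> behavior sub
    ("contact", "household", ["contact_id"]),                     # main -> main
]


def push(source, record, storage):
    """Store a record in every target of `source`, then start the next round."""
    stored = []
    for rule_source, target, fields in LINEAGE:
        if rule_source != source:
            continue
        projected = {name: record[name] for name in fields}
        storage.setdefault(target, []).append(projected)
        stored.append((target, projected))
    # Only after this round is fully stored does the flow move on to the next entities.
    for target, projected in stored:
        push(target, projected, storage)


storage = {}
push("order_feed", {"contact_id": "c1", "name": "Li", "city": "Beijing", "item": "book"}, storage)
```

Storing the whole round before recursing reflects the statement that the flow moves to the next main entity only after the last behavior sub-entity has been stored.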
Specifically, the flow of the business data is displayed in a data lineage analysis graph, data problems are located through that graph, and the problematic data so located is re-extracted or re-pushed using the upstream and downstream business data.
The visual data lineage analysis graph clearly shows which table the data comes from, which fields and what data volume were received, and how the data flows. The whole picture is clear at a glance, the root cause of a problem can be located quickly, and the affected upstream and downstream data can be re-extracted or re-pushed, thoroughly correcting the data problem.
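Locating the upstream sources and downstream dependents of a problematic node amounts to a graph traversal over the lineage edges. The sketch below uses a breadth-first search; the edge list and node names are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (upstream node, downstream node).
EDGES = [
    ("orders_table", "order_feed"),
    ("order_feed", "contact"),
    ("order_feed", "contact_purchase"),
    ("contact", "household"),
]

downstream = defaultdict(list)
upstream = defaultdict(list)
for src, dst in EDGES:
    downstream[src].append(dst)
    upstream[dst].append(src)


def reachable(start, graph):
    """Breadth-first search returning every node reachable from `start`."""
    seen, queue = [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


affected = reachable("order_feed", downstream)   # entities to re-extract or re-push
origins = reachable("contact", upstream)         # where the data came from
```

If `order_feed` turns out to hold dirty data, `affected` lists every entity whose data would need to be re-extracted or re-pushed.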
In one embodiment of the cross-industry data processing method, the collected business data is cleaned, where data cleaning includes value replacement, length truncation, UTM value extraction and MD5 aggregation. After the business data has been acquired, it can be specially processed, either to normalize it or to derive new fields; the data conversion module provides a variety of cleaning tools for this, namely value replacement, length truncation, UTM value extraction and MD5 aggregation.
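The four cleaning tools can be sketched with the Python standard library. The function names and sample values are hypothetical, and "MD5 aggregation" is interpreted here, as an assumption, as hashing several fields into a single derived key; the patent does not define it further.

```python
import hashlib
from urllib.parse import parse_qs, urlparse


def replace_value(value, mapping):
    """Value replacement: map known variants to a canonical value."""
    return mapping.get(value, value)


def truncate(value, max_length):
    """Length truncation: keep at most `max_length` characters."""
    return value[:max_length]


def extract_utm(url):
    """UTM extraction: pull utm_* query parameters out of a URL."""
    query = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


def md5_key(*parts):
    """MD5 aggregation, read here as hashing several fields into one derived key."""
    joined = "|".join(parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


gender = replace_value("M", {"M": "male", "F": "female"})
short_name = truncate("Alexander", 4)
utm = extract_utm("https://example.com/?utm_source=mail&utm_medium=email&id=7")
key = md5_key("li@example.com", "13800000000")
```

Value replacement and truncation normalize existing fields, while UTM extraction and the MD5 key derive new ones, matching the two purposes named in the paragraph.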
With the cross-industry data processing method, business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and contains auxiliary data stored under the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; and the business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity. The system supports user-defined management of multiple kinds of business data: each kind of business data can define its own data types (fields or data relationships) and function modules (for example, whether tags or data rating are needed). After data is brought in through data sources of various kinds, all main business data is managed at the same time in one data management platform or system, while the different main business data is stored in complete isolation, which improves data security. The upstream and downstream of all data loaded into the data management platform or system can be queried through data lineage, so a data problem can be located quickly when it arises; if the problem is serious, the dirty data can be cleared with one click and the data then re-extracted or re-pushed.
The invention greatly helps reduce enterprise costs, is convenient and fast, and improves staff productivity; isolated data storage meets business requirements while improving data security; a wide variety of data source types can be supported; multiple lines of business can each be personalized while still being managed in a unified way; data lineage allows the full path of the data to be traced; and data cleaning improves data quality and therefore accuracy.
While the invention has been described in detail in the foregoing general description and specific embodiments, it will be apparent to those skilled in the art that modifications and improvements can be made on this basis. Accordingly, modifications or improvements made without departing from the spirit of the invention fall within the scope of the invention as claimed.
Claims (9)
1. A cross-industry data processing method, characterized in that business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is performed on the data objects in the main entity; the sub-entity is logically attached to the main entity and contains auxiliary data stored under the main entity; the behavior sub-entity is logically attached to the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; and the business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity.
2. The cross-industry data processing method according to claim 1, wherein the sub-entity and the behavior sub-entity each exist only in a logical attachment relationship to a main entity, and any one sub-entity or behavior sub-entity is attached to only one main entity.
3. The cross-industry data processing method according to claim 1, wherein a one-to-many or many-to-one association exists between the main entities of different business data.
4. The cross-industry data processing method according to claim 1, wherein the data structure and fields of each kind of business data are user-defined, and the user-defined business data is stored independently so that the different kinds of business data are isolated from one another.
5. The cross-industry data processing method according to claim 4, wherein multiple mutually isolated sets of business data are aggregated on demand through push and association configuration.
6. The cross-industry data processing method according to claim 1, wherein the business data is managed along two dimensions, function modularization and data personalization;
function modularization means that the tag management, grouping management, indicator management or user portrait functions are freely configured for each kind of business data;
data personalization means that each kind of business data has its own tag system, groupings and statistical indicators, that the collected business data is deduplicated, and that a dedicated user portrait is generated.
7. The cross-industry data processing method according to claim 1, wherein the relationship between the source of the business data and its destination entity is defined as a lineage relationship (data lineage), and the objects of a lineage relationship are business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, or main entity to main entity.
8. The cross-industry data processing method according to claim 7, wherein the flow of the business data is displayed in a data lineage analysis graph, data problems are located through that graph, and the problematic data so located is re-extracted or re-pushed using the upstream and downstream business data.
9. The cross-industry data processing method according to claim 1, wherein the collected business data is cleaned, and the data cleaning includes value replacement, length truncation, UTM value extraction and MD5 aggregation.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110296258.0A CN113094360B (en) | 2021-03-19 | 2021-03-19 | Cross-industry data processing method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110296258.0A CN113094360B (en) | 2021-03-19 | 2021-03-19 | Cross-industry data processing method |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN113094360A (en) | 2021-07-09 |
| CN113094360B (en) | 2023-11-10 |
Family
ID=76668480
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202110296258.0A Active CN113094360B (en) | 2021-03-19 | 2021-03-19 | Cross-industry data processing method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN113094360B (en) |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102929771A (en) * | 2012-09-28 | 2013-02-13 | 用友软件股份有限公司 | Log recording device and log recording method |
| CN108038222A (en) * | 2017-12-22 | 2018-05-15 | 冶金自动化研究设计院 | System for Information System Modeling and entity-property frame of data access |
| CN109739486A (en) * | 2019-01-03 | 2019-05-10 | 深圳英飞拓科技股份有限公司 | Multi-data source database manipulation implementation method and device based on JdbcTemplate |
| CN110196889A (en) * | 2019-05-30 | 2019-09-03 | 北京字节跳动网络技术有限公司 | Data processing method, device, electronic equipment and storage medium |
| CN111858615A (en) * | 2020-08-04 | 2020-10-30 | 中国工商银行股份有限公司 | Database table generation method, system, computer system and readable storage medium |
| CN111897883A (en) * | 2020-07-15 | 2020-11-06 | 中国工商银行股份有限公司 | Entity model construction method and device, electronic equipment and medium |
| CN111897890A (en) * | 2020-08-21 | 2020-11-06 | 中国工商银行股份有限公司 | Financial business processing method and device |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8209259B2 (en) * | 2003-01-09 | 2012-06-26 | Adp Dealer Services, Inc. | Software business platform with networked, association-based business entity access management |
| US7213037B2 (en) * | 2003-01-13 | 2007-05-01 | I2 Technologies Us, Inc. | Master data management system for centrally managing cached data representing core enterprise reference data maintained as locked in true state read only access until completion of manipulation process |
| US8682936B2 (en) * | 2010-12-15 | 2014-03-25 | Microsoft Corporation | Inherited entity storage model |
Non-Patent Citations (4)
| Title |
|---|
| Safe interaction management of state institutions and business entities based on the concepts of evolutionary economics: modeling and scenario forecasting of processes;Rudnichenko Y 等;Tem journal;第9卷(第1期);233‐241 * |
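| Design and implementation of a grid development model based on a multi-layer architecture;胥寿春;China Master's Theses Full-text Database, Information Science and Technology (No. 12);I138-330 * |
| Maximum-entropy-based extraction of sentence-level entity subordination relations in Thai;王红斌;李金绘;沈强;线岩团;毛存礼;;Journal of Nanjing University (Natural Science) (04);124-132 * |
| Information service platform based on unified search;朴岩;陈远平;及俊川;;Computer Systems & Applications (11);134-140 * |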
Also Published As
| Publication number | Publication date |
|---|---|
| CN113094360A (en) | 2021-07-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110232085B (en) | A method and system for orchestrating big data ETL tasks | |
| US20150193423A1 (en) | Automatic relationship detection for spreadsheet data items | |
| CN110674228A (en) | Data warehouse model construction and data query method, device and equipment | |
| CN110795509A (en) | Method and device for constructing index blood relationship graph of data warehouse and electronic equipment | |
| CN111126019B (en) | Report generation method and device based on mode customization and electronic equipment | |
| CN104361140A (en) | Dynamically generated data model configuration device and method | |
| CN107864192B (en) | Information push method, device, server and readable storage medium | |
| CN106326006A (en) | Task management system aiming at task flow of data platform | |
| US9454592B2 (en) | Managing, importing, and exporting teamspace templates and teamspaces in content repositories | |
| CN106682096A (en) | Method and device for log data management | |
| CN107220757A (en) | A kind of system and method for rule configuration and parsing | |
| CN108647235A (en) | A kind of data analysing method, equipment and medium based on data warehouse | |
| CN106341542A (en) | Application program management recommendation method and system in mobile phone | |
| CN115516441A (en) | Multi-valued primary keys for multiple unique identifiers of entities | |
| EP4548228A1 (en) | Unified graph generation | |
| US8495018B2 (en) | Transitioning application replication configurations in a networked computing environment | |
| US20140108625A1 (en) | System and method for configuration policy extraction | |
| CN113094360B (en) | Cross-industry data processing method | |
| CN107451222A (en) | Model data management system | |
| CN112699107B (en) | Data management platform supporting high definition | |
| CN107967336A (en) | Big data comprehensive management platform construction method based on functional componentization | |
| CN108845857A (en) | A kind of icon management method and device based on cloud platform | |
| CN116610667A (en) | Service data processing method, device, computer equipment and storage medium | |
| US12481680B2 (en) | Team data clustering as a service | |
| US20250348474A1 (en) | Inferring graph model from semantic model |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||
| GR01 | Patent grant | ||