CN113094360B - Cross-industry data processing method - Google Patents

Info

Publication number
CN113094360B
CN113094360B CN202110296258.0A CN202110296258A CN113094360B CN 113094360 B CN113094360 B CN 113094360B CN 202110296258 A CN202110296258 A CN 202110296258A CN 113094360 B CN113094360 B CN 113094360B
Authority
CN
China
Prior art keywords
entity
data
sub
business
main
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110296258.0A
Other languages
Chinese (zh)
Other versions
CN113094360A (en)
Inventor
孟艳冬
郭泽谦
梁亚东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sinobase Technology Development Co ltd
Original Assignee
Beijing Sinobase Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sinobase Technology Development Co ltd
Priority to CN202110296258.0A
Publication of CN113094360A
Application granted
Publication of CN113094360B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2264Multidimensional index structures
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21Design, administration or maintenance of databases
    • G06F16/215Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2291User-Defined Types; Storage management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A cross-industry data processing method abstracts business data into entities for storage and divides the entities into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes. The main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity. The sub-entity has a logical attachment relationship with the main entity and contains auxiliary data attached to the main entity. The behavior sub-entity also has a logical attachment relationship with the main entity; it inherits from the sub-entity and extends it with behavior characteristic information. The business entity serves as the data source for the main entity, the sub-entity and the behavior sub-entity. The invention reduces enterprise costs, is convenient and fast to use, and improves staff productivity; isolated data storage meets business requirements while improving data security; and data lineage allows the full path of the data to be traced.

Description

Cross-industry data processing method
Technical Field
The invention relates to the technical field of business data processing, in particular to a cross-industry data processing method.
Background
A data management platform integrates scattered multi-party data into a unified technical platform and standardizes and segments the data, so that users can push the segmentation results into an existing interactive marketing environment. Current data management platforms support and define only one kind of business data management, namely contact-related business; they support custom data fields and data structures only for contacts; and they support only two types of data source, database connections and form data.
In the prior art, a single data management platform cannot manage several kinds of main business data at once, so enterprises must spend more money and resources. Different main business data cannot be stored in isolation, which causes data redundancy and seriously affects data use. Data cleaning is not supported, so dirty data causes problems such as low accuracy and poor timeliness, and the value of the data cannot be fully mined. When data in the system has a problem, its upstream and downstream cannot be inspected, so the problem cannot be located quickly and the scope and degree of its impact cannot be evaluated.
Disclosure of Invention
Therefore, the invention provides a cross-industry data processing method that can be instantiated for different application scenarios, solving the problem that data models differ so greatly between industries that data management and data analysis cannot be carried out in a unified way.
To achieve the above object, the invention provides the following technical solution: a cross-industry data processing method abstracts business data into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity; the sub-entity has a logical attachment relationship with the main entity and contains auxiliary data attached to the main entity; the behavior sub-entity has a logical attachment relationship with the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity.
In a preferred scheme of the cross-industry data processing method, the sub-entities and behavior sub-entities exist through a logical attachment relationship to a main entity, and each sub-entity or behavior sub-entity is attached to only one main entity.
In a preferred scheme of the cross-industry data processing method, a one-to-many or many-to-one association exists between the main entities of different business data.
In a preferred scheme of the cross-industry data processing method, the data structure and fields of each business data are customized, and the customized business data is stored independently so that the business data is isolated from one another.
In a preferred scheme of the cross-industry data processing method, multiple mutually isolated business data are aggregated on demand by means of pushing and association configuration.
In a preferred scheme of the cross-industry data processing method, the business data is managed along two dimensions, function modularization and data personalization;
function modularization: for each business data, the functions of tag management, group management, metric management or user portraits are freely configured;
data personalization: each business data has its own tag system, groups and statistical metrics, data deduplication is performed on the collected business data, and a dedicated user portrait is generated.
In a preferred scheme of the cross-industry data processing method, the relationship between a business data source and its destination entity is defined as a lineage relationship, and lineage relationships run from business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, or main entity to main entity.
In a preferred scheme of the cross-industry data processing method, the flow of the business data is displayed in a data lineage analysis chart, data problems are located through the chart, and the problem data so located is re-extracted or pushed using the upstream and downstream business data.
In a preferred scheme of the cross-industry data processing method, data cleaning is performed on the collected business data, the data cleaning comprising value replacement, length truncation, UTM value extraction and MD5 aggregation.
The invention has the following advantages. Business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity; the sub-entity has a logical attachment relationship with the main entity and contains auxiliary data attached to the main entity; the behavior sub-entity has a logical attachment relationship with the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity. The invention reduces enterprise costs, is convenient and fast to use, and improves staff productivity; isolated data storage meets business requirements while improving data security; a wide variety of data source types is supported; multiple businesses can be managed in a unified way while each keeps its own customizations; data lineage allows the full path of the data to be traced; and data cleaning improves data quality and therefore accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It will be apparent to those of ordinary skill in the art that the drawings in the following description are exemplary only and that other implementations can be obtained from the extensions of the drawings provided without inventive effort.
FIG. 1 is a schematic diagram of entity relationships in a cross-industry data processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of data flow between entities in a cross-industry data processing method according to an embodiment of the present invention.
Detailed Description
Other advantages and effects of the present invention will become apparent to those skilled in the art from the following detailed description of specific embodiments. The embodiments described are some, but not all, embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
Referring to FIG. 1 and FIG. 2, a cross-industry data processing method is provided. Business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity; the sub-entity has a logical attachment relationship with the main entity and contains auxiliary data attached to the main entity; the behavior sub-entity has a logical attachment relationship with the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity.
Specifically, the main entity is the main carrier for storing data and the main object of data analysis; data applications mainly operate on the main entity. Examples: contacts, enterprise information. The sub-entity is auxiliary data stored under the main entity, that is, data logically attached to it. Examples: a contact's education history and work experience. The behavior sub-entity is behavior information generated by the main entity; it has a logical attachment relationship with the main entity, inherits from the sub-entity, and extends it with behavior characteristics such as time and behavior type. Example: a contact's purchase information. When business data enters the data management system, business entities are all generated with the same structure, which ensures the security and usability of the data; the business entities are the source entities for the data of all other entities.
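The four entity types described above can be sketched as a small class hierarchy. This is only an illustration under assumptions: the class names, the field names (`entity_id`, `main_id`, `kind`, `occurred_at`, `behavior_type`) and the `attach` helper are hypothetical and do not come from the patent; the only points taken from the text are that a behavior sub-entity inherits from a sub-entity and adds time and behavior type, and that a sub-entity is attached to a main entity.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List


@dataclass
class BusinessEntity:
    """Raw business record as ingested; the source for all other entities."""
    source: str                       # hypothetical source name, e.g. "crm_export"
    payload: Dict[str, Any]


@dataclass
class SubEntity:
    """Auxiliary data that is logically attached to one main entity."""
    main_id: str                      # id of the owning main entity
    kind: str                         # e.g. "education", "work_experience"
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class BehaviorSubEntity(SubEntity):
    """A sub-entity extended with behavior characteristics (time, type)."""
    occurred_at: datetime = field(default_factory=datetime.now)
    behavior_type: str = "unknown"


@dataclass
class MainEntity:
    """Main carrier of business data and the main object of analysis."""
    entity_id: str
    kind: str                         # e.g. "contact", "enterprise"
    attributes: Dict[str, Any] = field(default_factory=dict)
    sub_entities: List[SubEntity] = field(default_factory=list)

    def attach(self, sub: SubEntity) -> None:
        """Attach a sub-entity; reject one that names another main entity."""
        if sub.main_id != self.entity_id:
            raise ValueError("sub-entity belongs to a different main entity")
        self.sub_entities.append(sub)


# Usage: a contact with one education record and one purchase behavior.
contact = MainEntity("c-1", "contact", {"name": "Alice"})
contact.attach(SubEntity("c-1", "education", {"school": "Example University"}))
contact.attach(
    BehaviorSubEntity("c-1", "purchase", {"sku": "A-100"}, behavior_type="purchase")
)
```

Storing the owner id on the sub-entity is one simple way to keep each sub-entity tied to a single main entity; a real system could equally enforce this with a foreign key.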
Specifically, sub-entities and behavior sub-entities exist only through a logical attachment relationship to a main entity, and each can be attached to only one main entity. The main entities of different business data have one-to-many or many-to-one associations with one another.
Specifically, the data structure and fields of each business data are customized and the data is stored independently. Each business data is thus equivalent to a scaled-down business system of its own, which achieves true data isolation.
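One way to picture custom fields with isolated storage is to give every business data type its own database. The sketch below uses a separate in-memory SQLite database per type; the `IsolatedStore` class, its method names and the single `records` table are assumptions made for illustration, since the patent does not specify a storage engine.

```python
import sqlite3
from typing import Dict, List, Tuple


class IsolatedStore:
    """One independent SQLite database per business data type (illustrative)."""

    def __init__(self) -> None:
        self._dbs: Dict[str, sqlite3.Connection] = {}
        self._fields: Dict[str, List[str]] = {}

    def define(self, business: str, fields: List[str]) -> None:
        """Register a business data type with its own custom fields."""
        if not all(f.isidentifier() for f in fields):
            raise ValueError("field names must be plain identifiers")
        conn = sqlite3.connect(":memory:")      # separate database => isolation
        columns = ", ".join(f'"{f}" TEXT' for f in fields)
        conn.execute(f"CREATE TABLE records ({columns})")
        self._dbs[business] = conn
        self._fields[business] = list(fields)

    def insert(self, business: str, row: Dict[str, str]) -> None:
        """Insert one row; fields missing from the row are stored as NULL."""
        fields = self._fields[business]
        placeholders = ", ".join("?" for _ in fields)
        self._dbs[business].execute(
            f"INSERT INTO records VALUES ({placeholders})",
            [row.get(f) for f in fields],
        )

    def rows(self, business: str) -> List[Tuple]:
        """Return all rows of one business data type, and only that type."""
        return self._dbs[business].execute("SELECT * FROM records").fetchall()


# Usage: two business data types with different schemas, stored separately.
store = IsolatedStore()
store.define("contact", ["name", "email"])
store.define("enterprise", ["company", "industry", "city"])
store.insert("contact", {"name": "Alice", "email": "alice@example.com"})
```

Because each type has its own connection, a query against one business data type cannot reach another type's rows.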
Specifically, multiple mutually isolated business data are aggregated on demand by means of pushing and association configuration. That is, isolated business data can be aggregated and associated with one another as required through pushing, association configuration and similar means.
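On-demand aggregation through an association configuration can be thought of as a join that is computed only when requested, leaving the isolated stores untouched. The following sketch assumes a hypothetical configuration format with the keys `left`, `right`, `left_key` and `right_key`; none of these names appear in the patent.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def aggregate(stores: Dict[str, List[Record]], config: Dict[str, str]) -> List[Record]:
    """Join two isolated business data sets according to an association config.

    The config keys (left, right, left_key, right_key) are illustrative.
    The source stores are only read, never modified.
    """
    left_name, right_name = config["left"], config["right"]
    # Index the right-hand records by their join key for quick lookup.
    index: Dict[Any, List[Record]] = {}
    for rec in stores[right_name]:
        index.setdefault(rec[config["right_key"]], []).append(rec)

    merged: List[Record] = []
    for rec in stores[left_name]:
        for match in index.get(rec[config["left_key"]], []):
            # Prefix field names so the two schemas cannot collide.
            row = {f"{left_name}.{k}": v for k, v in rec.items()}
            row.update({f"{right_name}.{k}": v for k, v in match.items()})
            merged.append(row)
    return merged


# Usage: associate contacts with enterprises through a configured key pair.
stores = {
    "contact": [{"name": "Alice", "company_id": 1}, {"name": "Bob", "company_id": 2}],
    "enterprise": [{"id": 1, "company": "Acme"}],
}
association = {"left": "contact", "right": "enterprise",
               "left_key": "company_id", "right_key": "id"}
combined = aggregate(stores, association)
```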
In one embodiment of the cross-industry data processing method, the business data is managed along two dimensions, function modularization and data personalization;
function modularization: for each business data, the functions of tag management, group management, metric management or user portraits are freely configured;
data personalization: each business data has its own tag system, groups and statistical metrics, data deduplication is performed on the collected business data, and a dedicated user portrait is generated.
For each business data, whether tag management, group management, metric management and user portraits are needed can be freely configured, which avoids redundant functional modules. Each business data has its own tag system, groups and statistical metrics; the collected business data is deduplicated and a dedicated user portrait is generated automatically, supporting the enterprise's precision marketing.
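The two management dimensions can be illustrated with a per-business-data switch for each functional module plus a simple deduplication step that feeds the user portrait. The `ModuleConfig` flags, the merge-by-key deduplication rule and the function names are all assumptions for this sketch; the patent does not say how deduplication or portrait generation is performed.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Record = Dict[str, Any]


@dataclass(frozen=True)
class ModuleConfig:
    """Which functional modules are enabled for one business data type."""
    tags: bool = False
    groups: bool = False
    metrics: bool = False
    portrait: bool = False


def deduplicate(records: List[Record], key: str) -> List[Record]:
    """Merge records sharing the same key value; later fields overwrite earlier."""
    merged: Dict[Any, Record] = {}
    for rec in records:
        merged.setdefault(rec[key], {}).update(rec)
    return list(merged.values())


def build_portraits(records: List[Record], key: str, config: ModuleConfig) -> List[Record]:
    """Return one merged portrait per key value, or nothing if the module is off."""
    if not config.portrait:
        return []
    return deduplicate(records, key)


# Usage: the portrait module is enabled for contacts only.
collected = [
    {"email": "a@example.com", "name": "Alice"},
    {"email": "a@example.com", "city": "Beijing"},
    {"email": "b@example.com", "name": "Bob"},
]
portraits = build_portraits(collected, "email", ModuleConfig(portrait=True))
```

Keeping the switches in a small immutable object makes it easy to store one configuration per business data type without loading modules that are not needed.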
In one embodiment of the cross-industry data processing method, the relationship between a business data source and its destination entity is defined as a lineage relationship, and lineage relationships run from business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, or main entity to main entity.
Referring to FIG. 2, the original business data forms business entities in the system. According to the lineage relationships, the business entities are pushed to the designated main entity or entities, sub-entities and behavior sub-entities, and the attachment relationships between them are established. When several main entities are stored at the same time, the associations between those main entities can also be established. After the last behavior sub-entity has been stored, the data flow is pushed on to the next main entity according to the main entity's lineage relationships and the next round begins, until the data flow is complete.
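The push of a business entity along its lineage relationships can be sketched as applying a list of rules, main entities first, then sub-entities, then behavior sub-entities. The `LineageRule` structure, the field mapping and the in-memory `stores` dictionary are hypothetical; only the ordering and the source-to-destination idea follow the description.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Record = Dict[str, Any]

# Storage order within one round: main, then sub, then behavior sub-entities.
ORDER = {"main": 0, "sub": 1, "behavior": 2}


@dataclass
class LineageRule:
    source: str                  # business entity name
    target: str                  # destination entity name
    target_type: str             # "main", "sub" or "behavior"
    field_map: Dict[str, str]    # source field -> destination field


def push(record: Record, source: str, rules: List[LineageRule],
         stores: Dict[str, List[Record]]) -> None:
    """Push one business-entity record to every destination its rules name."""
    applicable = sorted(
        (r for r in rules if r.source == source),
        key=lambda r: ORDER[r.target_type],
    )
    for rule in applicable:
        row = {dst: record[src] for src, dst in rule.field_map.items()}
        stores.setdefault(rule.target, []).append(row)


# Usage: one order export feeds a contact (main) and a purchase (behavior).
rules = [
    LineageRule("order_export", "purchase", "behavior",
                {"buyer_email": "contact_email", "sku": "sku"}),
    LineageRule("order_export", "contact", "main",
                {"buyer_email": "email", "buyer": "name"}),
]
entity_stores: Dict[str, List[Record]] = {}
push({"buyer": "Alice", "buyer_email": "a@example.com", "sku": "A-100"},
     "order_export", rules, entity_stores)
```

Sorting by target type means the main entity exists before anything that attaches to it is written, regardless of the order in which the rules were declared.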
Specifically, the flow of the business data is displayed in a data lineage analysis chart, data problems are located through the chart, and the problem data so located is re-extracted or pushed using the upstream and downstream business data.
The visual data lineage analysis chart shows clearly which table the data originates from, which fields and what data volume are received, and how the data flows. The whole picture is clear at a glance, the root cause of a problem can be located quickly, and the affected upstream and downstream data can be re-extracted or pushed, thereby thoroughly correcting the data problem.
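Locating the upstream and downstream of a problem node amounts to a reachability query on a directed lineage graph. The sketch below is a minimal illustration using breadth-first search; the `LineageGraph` class and the node names are assumptions, and the patent's chart also carries field and volume information that is not modeled here.

```python
from collections import defaultdict, deque
from typing import Dict, Set


class LineageGraph:
    """Directed graph of data flow: an edge runs from a source to a destination."""

    def __init__(self) -> None:
        self._down: Dict[str, Set[str]] = defaultdict(set)
        self._up: Dict[str, Set[str]] = defaultdict(set)

    def add_edge(self, source: str, target: str) -> None:
        self._down[source].add(target)
        self._up[target].add(source)

    @staticmethod
    def _reach(start: str, edges: Dict[str, Set[str]]) -> Set[str]:
        """Breadth-first search returning every node reachable from start."""
        seen: Set[str] = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen and nxt != start:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def upstream(self, node: str) -> Set[str]:
        """All sources that feed the node, directly or indirectly."""
        return self._reach(node, self._up)

    def downstream(self, node: str) -> Set[str]:
        """All destinations affected if the node's data is wrong."""
        return self._reach(node, self._down)


# Usage: find what must be re-extracted or pushed after a bad order export.
graph = LineageGraph()
graph.add_edge("crm_export", "contact")
graph.add_edge("order_export", "contact")
graph.add_edge("order_export", "purchase")
graph.add_edge("contact", "enterprise")
affected = graph.downstream("order_export")
```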
In one embodiment of the cross-industry data processing method, data cleaning is performed on the collected business data, the data cleaning comprising value replacement, length truncation, UTM value extraction and MD5 aggregation. After business data is collected, it can be specially processed to normalize it or to derive new fields; the data conversion module supports a variety of cleaning tools, namely value replacement, length truncation, UTM value extraction and MD5 aggregation.
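The four cleaning tools named above might look like the following small functions. The function names are hypothetical, and "MD5 aggregation" is interpreted here, as an assumption, as combining several field values into one MD5 digest that can serve as a derived key; the patent does not define the operation further.

```python
import hashlib
from typing import Dict, Iterable
from urllib.parse import parse_qs, urlparse


def replace_value(value: str, mapping: Dict[str, str]) -> str:
    """Value replacement: map a raw value to its normalized form if known."""
    return mapping.get(value, value)


def truncate(value: str, max_len: int) -> str:
    """Length truncation: keep at most max_len characters."""
    return value[:max_len]


def extract_utm(url: str) -> Dict[str, str]:
    """UTM value extraction: pull utm_* query parameters out of a URL."""
    query = parse_qs(urlparse(url).query)
    return {k: v[0] for k, v in query.items() if k.startswith("utm_")}


def md5_key(parts: Iterable[str]) -> str:
    """MD5 aggregation (assumed meaning): hash several fields into one key."""
    joined = "|".join(parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Usage: normalize one collected record and derive new fields from it.
raw = {
    "gender": "M",
    "note": "long free-text remark",
    "landing": "https://example.com/?utm_source=news&utm_medium=email&id=7",
}
cleaned = {
    "gender": replace_value(raw["gender"], {"M": "male", "F": "female"}),
    "note": truncate(raw["note"], 9),
    **extract_utm(raw["landing"]),
}
cleaned["record_key"] = md5_key([cleaned["gender"], cleaned["note"]])
```

MD5 is used here only to derive a compact identifier, not for any security purpose.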
With the cross-industry data processing method, business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity; the sub-entity has a logical attachment relationship with the main entity and contains auxiliary data attached to the main entity; the behavior sub-entity has a logical attachment relationship with the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity. The system supports custom management of multiple business data: each business data can define its own data types (fields or data relationships) and functional modules (for example, whether tags, data rating and the like are needed). After data has been accessed through data sources of various kinds, all main business data is managed at the same time in a single data management platform or system, while the different main business data is stored in complete isolation, which improves data security. The upstream and downstream of all data loaded into the data management platform or system can be queried through data lineage, so that a data problem can be located quickly when it occurs; for a serious data problem, the dirty data can be cleared with one click and then re-extracted or re-pushed.
The invention reduces enterprise costs, is convenient and fast to use, and improves staff productivity; isolated data storage meets business requirements while improving data security; a wide variety of data source types is supported; multiple businesses can be managed in a unified way while each keeps its own customizations; data lineage allows the full path of the data to be traced; and data cleaning improves data quality and therefore accuracy.
While the invention has been described in detail in the foregoing general description and specific examples, it will be apparent to those skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.

Claims (9)

1. A cross-industry data processing method, characterized in that business data is abstracted into entities for storage, and the entities are divided into main entities, sub-entities, behavior sub-entities and business entities according to their logical application modes and storage schemes; the main entity is the carrier of the business data, and data analysis is carried out on the data objects in the main entity; the sub-entity has a logical attachment relationship with the main entity and contains auxiliary data attached to the main entity; the behavior sub-entity has a logical attachment relationship with the main entity, inherits from the sub-entity, and extends the sub-entity with behavior characteristic information; and the business entity serves as the data source of the main entity, the sub-entity and the behavior sub-entity.
2. The cross-industry data processing method according to claim 1, wherein the sub-entities and behavior sub-entities exist through a logical attachment relationship to a main entity, and each sub-entity or behavior sub-entity is attached to only one main entity.
3. The cross-industry data processing method according to claim 1, wherein a one-to-many or many-to-one association exists between the main entities of different business data.
4. The cross-industry data processing method according to claim 1, wherein the data structure and fields of each business data are customized, and the customized business data is stored independently so that the business data is isolated from one another.
5. The cross-industry data processing method according to claim 4, wherein multiple mutually isolated business data are aggregated on demand by means of pushing and association configuration.
6. The cross-industry data processing method according to claim 1, wherein the business data is managed along two dimensions, function modularization and data personalization;
function modularization: for each business data, the functions of tag management, group management, metric management or user portraits are freely configured;
data personalization: each business data has its own tag system, groups and statistical metrics, data deduplication is performed on the collected business data, and a dedicated user portrait is generated.
7. The cross-industry data processing method according to claim 1, wherein the relationship between a business data source and its destination entity is defined as a lineage relationship, and lineage relationships run from business entity to main entity, business entity to sub-entity, business entity to behavior sub-entity, or main entity to main entity.
8. The cross-industry data processing method according to claim 7, wherein the flow of the business data is displayed in a data lineage analysis chart, data problems are located through the chart, and the problem data so located is re-extracted or pushed using the upstream and downstream business data.
9. The cross-industry data processing method according to claim 1, wherein data cleaning is performed on the collected business data, the data cleaning comprising value replacement, length truncation, UTM value extraction and MD5 aggregation.
CN202110296258.0A 2021-03-19 2021-03-19 Cross-industry data processing method Active CN113094360B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110296258.0A CN113094360B (en) 2021-03-19 2021-03-19 Cross-industry data processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110296258.0A CN113094360B (en) 2021-03-19 2021-03-19 Cross-industry data processing method

Publications (2)

Publication Number Publication Date
CN113094360A CN113094360A (en) 2021-07-09
CN113094360B CN113094360B (en) 2023-11-10

Family

ID=76668480

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110296258.0A Active CN113094360B (en) 2021-03-19 2021-03-19 Cross-industry data processing method

Country Status (1)

Country Link
CN (1) CN113094360B (en)

Also Published As

Publication number Publication date
CN113094360A (en) 2021-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant