
CN116522102A - Method and device for dynamically adjusting accident risk model database based on privacy calculation - Google Patents

Publication number
CN116522102A
Authority
CN
China
Prior art keywords
variable information
accident risk
privacy
dynamically adjusting
tag library
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310560865.2A
Other languages
Chinese (zh)
Inventor
沈健刚
高媛萍
程涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lianyang Guorong Beijing Technology Co ltd
Original Assignee
Lianyang Guorong Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lianyang Guorong Beijing Technology Co ltd
Priority to CN202310560865.2A
Publication of CN116522102A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G06F18/2113 Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0206 Price or cost determination based on market factors
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Game Theory and Decision Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Medical Informatics (AREA)
  • Technology Law (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The application discloses a method and a device for dynamically adjusting an accident insurance model database based on privacy computing, relating to the technical field of databases. The method comprises: acquiring accident insurance modeling samples from an insurance company client; performing private set intersection on the accident insurance modeling samples to obtain intersection samples; associating the intersection samples with the tag library on the data source side to obtain independent variable information and dependent variable information; calculating the average IV value of the tag library according to the independent variable information and the dependent variable information; and dynamically adjusting the ranking and pricing mechanism of the tag library according to the average IV value. While remaining compliant with regulation, the method and device dynamically optimize the tag library through the average IV value, so that the value of the data source is fully exploited; the approach is secure and reliable, low in cost, and high in gain.

Description

Method and device for dynamically adjusting an accident insurance model database based on privacy computing

Technical field

The present application relates to the field of database technology, and in particular to a method and device for dynamically adjusting an accident insurance model database based on privacy computing.

Background

Accidental injury insurance (accident insurance for short) is a line of personal insurance in which the insurance benefit is payable when the insured dies or is disabled as a result of accidental injury. Its basic content is as follows: the policyholder pays a certain premium to the insurer; if the insured suffers an accidental injury during the insurance period, and that injury is the direct or proximate cause of death, disability, medical expenses, or temporary loss of the ability to work within a certain period from the date of the injury, the insurer pays a certain amount of insurance money to the insured or the beneficiary. It covers two items: the death benefit and the disability benefit.

As society develops, insurers increasingly build pre-underwriting risk control models on external data sources. Traditionally, an insurer seeking external data to strengthen its risk control model provides the data source with a set of labeled samples in which the three identity elements are encrypted, and the data source's modeling engineers provide a modeling service until a model that meets expectations is obtained. The model is then deployed on the data source side as a score and served to the insurer through an API.

However, with the promulgation of the Data Security Law and related regulations on the protection of personal private information, directly obtaining encrypted personal information does not meet future regulatory requirements. In terms of product architecture, merely maintaining fixed data assets and providing a sample modeling service cannot fully reflect the value of the data source's own data, and the data source lacks insight into dimensions such as the target customer's financial history, shopping preferences, identity traits, and personal connections.

Summary of the invention

To this end, the present application provides a method and device for dynamically adjusting an accident insurance model database based on privacy computing, to solve the problems in the prior art that, when a risk control model is developed, the value of the data source's own data cannot be fully reflected, multi-dimensional insight is lacking, and security is poor.

To achieve the above object, the present application provides the following technical solutions.

In a first aspect, a method for dynamically adjusting an accident insurance model database based on privacy computing includes:

obtaining accident insurance modeling samples from an insurance company client;

performing private set intersection on the accident insurance modeling samples to obtain intersection samples;

associating the intersection samples with the tag library on the data source side to obtain independent variable information and dependent variable information;

calculating the average IV value of the tag library according to the independent variable information and the dependent variable information;

dynamically adjusting the ranking and pricing mechanism of the tag library according to the average IV value.

Preferably, the accident insurance modeling samples include at least name, ID number, mobile phone number, and lookback time.

Preferably, a left join query is used when associating the intersection samples with the tag library on the data source side.

Preferably, the tag library includes an insurance agent library, a shopping preference library, an adverse selection population library, and an overdue library.

Preferably, when the ranking and pricing mechanism of the tag library is dynamically adjusted according to the average IV value, the larger the average IV value of a tag library, the larger its weight or exposure value.

In a second aspect, a method for dynamically adjusting an accident insurance model database based on privacy computing includes:

obtaining accident insurance modeling samples from an insurance company client;

performing private set intersection on the accident insurance modeling samples to obtain intersection samples;

associating the intersection samples with the tag library on the data source side to obtain independent variable information and dependent variable information;

calculating the missing rate of the tag library according to the independent variable information and the dependent variable information;

dynamically adjusting the ranking and pricing mechanism of the tag library according to the missing rate.

Preferably, when the ranking and pricing mechanism of the tag library is dynamically adjusted according to the missing rate, the higher the missing rate of a tag library, the smaller its weight or exposure value.

In a third aspect, a device for dynamically adjusting an accident insurance model database based on privacy computing includes:

a data upload module, configured to obtain accident insurance modeling samples from an insurance company client;

a private set intersection module, configured to perform private set intersection on the accident insurance modeling samples to obtain intersection samples;

a tag library processing module, configured to associate the intersection samples with the tag library on the data source side to obtain independent variable information and dependent variable information;

a privacy statistics module, configured to calculate the average IV value of the tag library according to the independent variable information and the dependent variable information, and to dynamically adjust the ranking and pricing mechanism of the tag library according to the average IV value.

In a fourth aspect, a computer device includes a memory and a processor, the memory storing a computer program, and the processor implementing the steps of the method for dynamically adjusting an accident insurance model database based on privacy computing when executing the computer program.

In a fifth aspect, a computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the method for dynamically adjusting an accident insurance model database based on privacy computing.

Compared with the prior art, the present application has at least the following beneficial effects:

The present application provides a method and device for dynamically adjusting an accident insurance model database based on privacy computing. The method obtains accident insurance modeling samples from an insurance company client; performs private set intersection on the samples to obtain intersection samples; associates the intersection samples with the tag library on the data source side to obtain independent variable information and dependent variable information; calculates the average IV value of the tag library from that information; and dynamically adjusts the ranking and pricing mechanism of the tag library according to the average IV value. While remaining compliant with regulation, the method and device dynamically optimize the tag library through the average IV value and fully exploit the value of the data source, so the approach is secure and reliable, low in cost, and high in gain.

Description of the drawings

To illustrate the prior art and the present application more intuitively, several exemplary drawings are provided below. It should be understood that the specific shapes and structures shown in the drawings should generally not be regarded as limitations on implementing the present application; for example, based on the technical concept disclosed in the present application and the exemplary drawings, those skilled in the art can readily make conventional adjustments or further optimizations to the addition, removal, or grouping of certain units (components), and to their specific shapes, positional relationships, connection modes, and dimensional proportions.

FIG. 1 is a flow chart of a method for dynamically adjusting an accident insurance model database based on privacy computing, provided in Embodiment 1 of the present application;

FIG. 2 is a structural block diagram of a device for dynamically adjusting an accident insurance model database based on privacy computing, provided in Embodiment 3 of the present application.

Detailed description

The present application is described in further detail below through specific embodiments with reference to the accompanying drawings.

In the description of the present application, unless otherwise specified, "a plurality of" means two or more. The terms "first", "second", "third", and the like are intended to distinguish the objects referred to and carry no special technical meaning (for example, they should not be understood as emphasizing degree of importance or order). Expressions such as "comprising", "including", and "having" also mean "not limited to" (certain units, components, materials, steps, etc.).

Terms such as "upper", "lower", "left", "right", and "middle" used in the present application are generally intended to aid intuitive understanding with reference to the drawings and are not absolute limitations on positional relationships in an actual product. Changes to these relative positional relationships that do not depart from the technical concept disclosed in the present application shall also be regarded as within the scope described by the present application.

Embodiment 1

Referring to FIG. 1, this embodiment provides a method for dynamically adjusting an accident insurance model database based on privacy computing, including:

S1: obtaining accident insurance modeling samples from an insurance company client;

Specifically, the insurance company provides a batch of accident insurance modeling samples M. Each sample in M must contain the three identity elements in encrypted form (name, ID number, and mobile phone number), together with the insurance application time, which serves as the starting point for the lookback variables.
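The sample layout described in S1 can be sketched as follows. This is a minimal illustration only: the source says the three elements are encrypted but does not name a scheme, so the use of SHA-256 digests, the class name `ModelingSample`, and the helpers `digest` and `make_sample` are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


def digest(value: str) -> str:
    """Return a SHA-256 hex digest of a normalized identifier (assumed scheme)."""
    return hashlib.sha256(value.strip().encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ModelingSample:
    """One record of the sample batch M (hypothetical layout)."""
    name_digest: str
    id_digest: str
    phone_digest: str
    application_date: date  # starting point for lookback variables

    @property
    def match_key(self) -> str:
        # Single key derived from the three elements, used later for matching.
        return digest(self.name_digest + self.id_digest + self.phone_digest)


def make_sample(name: str, id_number: str, phone: str,
                application_date: date) -> ModelingSample:
    """Build a sample from raw values; only digests are stored."""
    return ModelingSample(digest(name), digest(id_number), digest(phone),
                          application_date)
```

Storing only digests keeps the raw identifiers on the insurer's side; a production deployment would follow whatever encryption convention the two parties have agreed on.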

S2: performing private set intersection on the accident insurance modeling samples to obtain intersection samples;

Specifically, privacy computing refers to a collection of technologies that perform data analysis and computation while protecting the data itself from disclosure, so that the data is "usable but not visible". The precondition for performing private set intersection on the accident insurance modeling samples M provided by the insurance company is that both parties have deployed privacy computing nodes locally; the intersection samples M' are obtained after the private set intersection.
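The source does not say which privacy intersection (private set intersection, PSI) protocol the two nodes run. As one possible illustration, the sketch below uses a toy Diffie-Hellman style PSI based on commutative exponentiation: each party blinds its hashed identifiers with a secret exponent, the other party blinds them again, and only doubly blinded values are compared. The modulus, the function names, and the single-process simulation of both parties are assumptions; the parameters are far too small for real use.

```python
import hashlib
import math
import secrets

# Toy modulus (a Mersenne prime). Illustrative only, not a secure group choice.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an identifier to an integer in [1, P - 1]."""
    h = int.from_bytes(hashlib.sha256(identifier.encode("utf-8")).digest(), "big")
    return h % (P - 1) + 1


def random_exponent() -> int:
    """Pick a secret exponent coprime to P - 1 so that blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(values, secret):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]


def private_set_intersection(insurer_ids, source_ids):
    """Simulate both parties in one process and return the insurer's matches."""
    a = random_exponent()  # insurer's secret
    b = random_exponent()  # data source's secret
    insurer_once = blind([hash_to_group(i) for i in insurer_ids], a)
    source_once = blind([hash_to_group(i) for i in source_ids], b)
    # Each side blinds the other's values: (h^a)^b == (h^b)^a mod P.
    insurer_twice = blind(insurer_once, b)
    source_twice = set(blind(source_once, a))
    return [i for i, t in zip(insurer_ids, insurer_twice) if t in source_twice]
```

In this simulation the insurer learns which of its own identifiers also exist on the data source side, while identifiers outside the intersection are only ever exchanged in blinded form.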

S3: associating the intersection samples with the tag library on the data source side to obtain independent variable information and dependent variable information;

Specifically, the tag library is the initial set of libraries prepared for the accident insurance modeling samples provided by the client. The variables in each library table can be assumed to have some correlation with the accident insurance modeling samples; in the accident insurance scenario in particular, whether a customer in the samples is an insurance agent, has overdue records, or belongs to an adverse selection population may all be correlated with the dependent variable (y value) of the samples.

In this embodiment, the data source side initially configures tag libraries for the obtained accident insurance modeling samples. The tag libraries include the A1 insurance agent library, the A2 shopping preference library, the A3 adverse selection population library, ... Ai ... An, and each tag library contains a corresponding list of variables. This step uses a left join to associate the y values (for example, indicators such as claims and loss occurrences) and the independent variables (x values) on the data source side, in preparation for the subsequent calculation of the corresponding statistics.
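The left join in S3 can be illustrated with an in-memory SQLite database. The table names, the column names (`match_key`, `y`, `policies_sold_3m`), and the choice to hold the y value in the intersection table are assumptions for this sketch; the source only states that a left join is used against the tag library on the data source side.

```python
import sqlite3


def left_join_tags(intersection_rows, tag_rows):
    """Left-join intersection samples with one tag library table.

    intersection_rows: iterable of (match_key, y)
    tag_rows: iterable of (match_key, policies_sold_3m)
    Samples with no tag record keep their row, with None for the tag column.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE intersection (match_key TEXT PRIMARY KEY, y INTEGER)")
    conn.execute(
        "CREATE TABLE agent_tags (match_key TEXT PRIMARY KEY, "
        "policies_sold_3m INTEGER)")
    conn.executemany("INSERT INTO intersection VALUES (?, ?)", intersection_rows)
    conn.executemany("INSERT INTO agent_tags VALUES (?, ?)", tag_rows)
    rows = conn.execute(
        "SELECT i.match_key, i.y, t.policies_sold_3m "
        "FROM intersection AS i "
        "LEFT JOIN agent_tags AS t ON t.match_key = i.match_key "
        "ORDER BY i.match_key"
    ).fetchall()
    conn.close()
    return rows
```

Because the join is a left join, every intersection sample is retained, and a sample without a record in the tag library shows up with a missing value, which is what a per-library missing rate would later count.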

S4: calculating the average IV value of the tag library according to the independent variable information and the dependent variable information;

Specifically, using the library variables (independent variables x) associated in step S3 and the loss occurrence and claim payment information (dependent variable y), this step calculates the average IV value of each library.

IV (information value) is a measure of a feature's predictive power for the model and is commonly used as a reference for feature screening before model training.
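The source does not write out the IV formula. The commonly used definition, which is assumed here, bins a variable into k groups and compares the share of events (y = 1) and non-events (y = 0) in each bin. The symbols below are introduced for this illustration: n_i^(1) and n_i^(0) are the event and non-event counts in bin i, N^(1) and N^(0) are the totals, and a tag library A contains m variables x_1, ..., x_m.

```latex
\[
p_i = \frac{n_i^{(1)}}{N^{(1)}}, \qquad
q_i = \frac{n_i^{(0)}}{N^{(0)}}, \qquad
\mathrm{WOE}_i = \ln\frac{p_i}{q_i}
\]
\[
\mathrm{IV} = \sum_{i=1}^{k} \left(p_i - q_i\right)\mathrm{WOE}_i, \qquad
\overline{\mathrm{IV}}(A) = \frac{1}{m}\sum_{j=1}^{m} \mathrm{IV}(x_j)
\]
```

Each term of the sum is non-negative, so IV is zero when a variable is distributed identically among events and non-events and grows as the two distributions separate.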

For example, suppose the A1 insurance agent library contains 5 variables (the number of policies the user sold as an insurance agent in the last 3/6/9/12/24 months) with an average IV value of 0.2; the A4 overdue library contains 5 variables (the number of overdue orders the user had in the last 3/6/9/12/24 months) with an average IV value of 0.3; and the An## library contains 10 variables with an average IV value of 0.01. Preferably, the weight or exposure value of the A4 overdue library can then be set to the maximum, so that it is the most likely to be selected by the client for modeling, while the weight of the An## library can be reduced, which effectively saves storage resources.
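A per-library average IV such as the 0.2, 0.3, and 0.01 figures in the example could be computed along the following lines. This is a sketch under assumptions: variables are already binned, y is binary, and a small additive smoothing term (not mentioned in the source) avoids division by zero for empty cells. The function names are hypothetical.

```python
import math
from collections import defaultdict


def information_value(bins, labels, smoothing=0.5):
    """IV of one pre-binned variable against binary labels (1 = event)."""
    events = defaultdict(float)
    non_events = defaultdict(float)
    for b, y in zip(bins, labels):
        if y == 1:
            events[b] += 1
        else:
            non_events[b] += 1
    all_bins = set(events) | set(non_events)
    # Smoothed totals so that every bin has a non-zero share in both classes.
    total_events = sum(events.values()) + smoothing * len(all_bins)
    total_non_events = sum(non_events.values()) + smoothing * len(all_bins)
    iv = 0.0
    for b in all_bins:
        p = (events[b] + smoothing) / total_events
        q = (non_events[b] + smoothing) / total_non_events
        iv += (p - q) * math.log(p / q)
    return iv


def average_iv(library, labels):
    """Mean IV over all variables of one tag library.

    library: dict mapping variable name -> list of bin labels per sample.
    """
    values = [information_value(column, labels) for column in library.values()]
    return sum(values) / len(values)
```

A variable whose bins are spread identically across events and non-events contributes an IV of zero, so a library full of such variables ends up with a low average and, under the scheme described above, a low weight.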

S5: dynamically adjusting the ranking and pricing mechanism of the tag library according to the average IV value.

Through the above steps, the ranking can be corrected continuously as different insurance companies provide successive batches of samples, so that the database ranking built up with the IV value as the ranking index is continually optimized. When modeling, the client can select the database variables with the highest IV values as a preferred tier, while poorly performing variables are dynamically displaced by better performing ones (moved down the ranking or removed outright), which saves resources overall. The number of preferred-tier variables used can also serve as a reference in later commercial negotiation: the more preferred-tier variables a client includes in the model, the higher the fee may be.
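The ranking and pricing adjustment in S5 is described only qualitatively. One way to make it concrete is sketched below: libraries are ranked by average IV, exposure weights are taken proportional to average IV, and a quote grows with the number of preferred-tier variables used. The proportional weighting, the fee parameters, and all function names are assumptions for illustration, not the pricing rule of the source.

```python
def rank_libraries(avg_iv):
    """Return library names ordered from highest to lowest average IV."""
    return sorted(avg_iv, key=avg_iv.get, reverse=True)


def exposure_weights(avg_iv):
    """Weights proportional to average IV (assumed rule); they sum to 1."""
    total = sum(avg_iv.values())
    if total == 0:
        # No library carries information: fall back to equal weights.
        return {name: 1.0 / len(avg_iv) for name in avg_iv}
    return {name: value / total for name, value in avg_iv.items()}


def quote(selected_variables, preferred_tier,
          base_fee=100.0, fee_per_preferred=10.0):
    """Hypothetical fee: a base fee plus a surcharge per preferred-tier variable."""
    used = sum(1 for v in selected_variables if v in preferred_tier)
    return base_fee + fee_per_preferred * used
```

With the figures from the example above (0.2 for the insurance agent library, 0.3 for the overdue library, 0.01 for the last library), the overdue library ranks first and receives the largest weight, and rerunning the functions after each new batch of samples gives the continuous correction the text describes.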

In the method provided by this embodiment, privacy computing makes the data more secure and compliant, and the samples provided by insurance company clients become the lever for optimizing and iterating the data system. This enables efficient, dynamic operation of data assets, fully exploits the value of the data, balances benefit against cost, and turns a disadvantage into an opportunity for iterative upgrades, with high security, low cost, and high gain.

Embodiment 2

This embodiment provides another method for dynamically adjusting an accident insurance model database based on privacy computing, including:

S1: obtaining accident insurance modeling samples from an insurance company client;

S2: performing private set intersection on the accident insurance modeling samples to obtain intersection samples;

S3: associating the intersection samples with the tag library on the data source side to obtain independent variable information and dependent variable information;

S4: calculating the missing rate of the tag library according to the independent variable information and the dependent variable information;

S5: dynamically adjusting the ranking and pricing mechanism of the tag library according to the missing rate.

This embodiment uses the missing rate to dynamically adjust the ranking and pricing mechanism of the tag library: the higher the missing rate of a tag library, the smaller its weight or exposure value, that is, the less likely it is to be selected by the client when modeling.
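The missing-rate variant can be sketched in the same style. The source does not define how the missing rate is aggregated over a library or how it maps to a weight; the code below assumes the share of missing cells (represented as `None`) across all variables of a library, and a weight proportional to coverage (one minus the missing rate). Both choices and the function names are illustrative.

```python
def missing_rate(library):
    """Share of missing cells across all variables of one tag library.

    library: dict mapping variable name -> list of values, None = missing.
    """
    cells = [value for column in library.values() for value in column]
    return sum(value is None for value in cells) / len(cells)


def weights_from_missing_rate(rates):
    """Weights proportional to coverage (assumed rule); they sum to 1."""
    coverage = {name: 1.0 - rate for name, rate in rates.items()}
    total = sum(coverage.values())
    if total == 0:
        # Every library is fully missing: fall back to equal weights.
        return {name: 1.0 / len(rates) for name in rates}
    return {name: value / total for name, value in coverage.items()}
```

Under this rule a library with no missing values always outweighs one with half of its cells missing, which matches the stated direction: the higher the missing rate, the smaller the weight or exposure value.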

Embodiment 3

Referring to FIG. 2, this embodiment provides a device for dynamically adjusting an accident insurance model database based on privacy computing, including:

a data upload module, configured to obtain accident insurance modeling samples from an insurance company client;

a private set intersection module, configured to perform private set intersection on the accident insurance modeling samples to obtain intersection samples;

a tag library processing module, configured to associate the intersection samples with the tag library on the data source side to obtain independent variable information and dependent variable information;

a privacy statistics module, configured to calculate the average IV value of the tag library according to the independent variable information and the dependent variable information, and to dynamically adjust the ranking and pricing mechanism of the tag library according to the average IV value.

For the specific limitations of the device for dynamically adjusting an accident insurance model database based on privacy computing, refer to the limitations of the corresponding method above; they are not repeated here.

Embodiment 4

This embodiment provides a computer device including a memory and a processor. The memory stores a computer program, and the processor implements the steps of the method for dynamically adjusting an accident insurance model database based on privacy computing when executing the computer program.

Embodiment 5

This embodiment provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method for dynamically adjusting an accident insurance model database based on privacy computing.

The technical features of the above embodiments may be combined arbitrarily, provided the combination involves no contradiction. For brevity, not all possible combinations of the technical features in the above embodiments are described; embodiments that are not explicitly written out should nevertheless be regarded as within the scope of this specification.

The present application has been described above in some detail through general descriptions and specific embodiments. It should be understood that, based on the technical concept of the present application, conventional adjustments or further innovations may be made to these specific embodiments; as long as they do not depart from the technical concept of the present application, the technical solutions obtained through such adjustments or innovations also fall within the scope of protection of the claims of the present application.

Claims (10)

1. A method for dynamically adjusting an accident insurance model database based on privacy computing, comprising:
acquiring accident insurance modeling samples from an insurance company client;
performing private set intersection on the accident insurance modeling samples to obtain intersection samples;
associating the intersection samples with a tag library on a data source side to obtain independent variable information and dependent variable information;
calculating an average IV value of the tag library according to the independent variable information and the dependent variable information;
and dynamically adjusting a ranking and pricing mechanism of the tag library according to the average IV value.
2. The method for dynamically adjusting an accident insurance model database based on privacy computing of claim 1, wherein the accident insurance modeling samples include at least name, ID number, mobile phone number, and lookback time.
3. The method for dynamically adjusting an accident insurance model database based on privacy computing of claim 1, wherein a left join query is used when associating the intersection samples with the tag library on the data source side.
4. The method for dynamically adjusting an accident insurance model database based on privacy computing of claim 1, wherein the tag library comprises an insurance agent library, a shopping preference library, an adverse selection population library, and an overdue library.
5. The method for dynamically adjusting an accident insurance model database based on privacy computing of claim 1, wherein, when the ranking and pricing mechanism of the tag library is dynamically adjusted according to the average IV value, the larger the average IV value of a tag library, the larger its weight or exposure value.
6. A method for dynamically adjusting an accident risk model database based on privacy calculation, comprising:
acquiring an accident risk modeling sample of an insurance company client;
performing privacy intersection on the accident risk modeling sample to obtain an intersection sample;
associating the intersection sample with a tag library on a data source side to obtain independent variable information and dependent variable information;
calculating a missing rate of the tag library according to the independent variable information and the dependent variable information; and
dynamically adjusting an ordering and pricing mechanism of the tag library according to the missing rate.
7. The method for dynamically adjusting an accident risk model database based on privacy calculation of claim 6, wherein, when the ordering and pricing mechanism of the tag library is dynamically adjusted according to the missing rate, the higher the missing rate of a tag library, the smaller the weight or exposure value of that tag library.
8. An apparatus for dynamically adjusting an accident risk model database based on privacy calculation, comprising:
a data uploading module configured to acquire an accident risk modeling sample of an insurance company client;
a privacy intersection module configured to perform privacy intersection on the accident risk modeling sample to obtain an intersection sample;
a tag library processing module configured to associate the intersection sample with a tag library on a data source side to obtain independent variable information and dependent variable information;
a privacy statistics module configured to calculate an average IV value of the tag library according to the independent variable information and the dependent variable information;
wherein the ordering and pricing mechanism of the tag library is dynamically adjusted according to the average IV value.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 5.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 5.
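The statistics named in the claims can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes that "IV" denotes the information value commonly used in scorecard modeling, IV = sum over bins of (p_i - q_i) * ln(p_i / q_i), that the rate in claims 6 and 7 (rendered "deletion rate" in the machine translation) is the share of missing tag values after the left join, and that weights are obtained by simple normalization. The function names, the `eps` smoothing, the weight formulas, and the toy data are all hypothetical, and the privacy-preserving intersection step is omitted: the sketch starts from an already intersected sample in plaintext.

```python
import math
from collections import Counter

def left_join(samples, library, key="id"):
    """Left join: every intersection sample is kept; unmatched tags become None."""
    index = {row[key]: row for row in library["rows"]}
    joined = []
    for s in samples:
        match = index.get(s[key], {})
        joined.append({**s, **{v: match.get(v) for v in library["variables"]}})
    return joined

def missing_rate(rows, variables):
    """Share of (sample, variable) cells that have no value after the join."""
    cells = [r[v] for r in rows for v in variables]
    return sum(c is None for c in cells) / len(cells)

def information_value(rows, variable, target="y", eps=0.5):
    """IV of one variable; eps is an assumed additive smoothing for empty bins."""
    good = Counter(r[variable] for r in rows if r[target] == 0)
    bad = Counter(r[variable] for r in rows if r[target] == 1)
    bins = set(good) | set(bad)  # a missing value (None) forms its own bin
    n_good = sum(good.values()) + eps * len(bins)
    n_bad = sum(bad.values()) + eps * len(bins)
    iv = 0.0
    for b in bins:
        p = (good[b] + eps) / n_good  # share of target=0 samples in bin b
        q = (bad[b] + eps) / n_bad    # share of target=1 samples in bin b
        iv += (p - q) * math.log(p / q)
    return iv

def rank_libraries(samples, libraries):
    """Order tag libraries by average IV; both weight formulas are assumptions."""
    stats = []
    for lib in libraries:
        rows = left_join(samples, lib)
        ivs = [information_value(rows, v) for v in lib["variables"]]
        stats.append({
            "name": lib["name"],
            "avg_iv": sum(ivs) / len(ivs),
            "missing_rate": missing_rate(rows, lib["variables"]),
        })
    total_iv = sum(s["avg_iv"] for s in stats) or 1.0
    for s in stats:
        s["iv_weight"] = s["avg_iv"] / total_iv         # larger IV, larger weight
        s["coverage_weight"] = 1.0 - s["missing_rate"]  # more missing, smaller weight
    return sorted(stats, key=lambda s: s["iv_weight"], reverse=True)

# Toy intersection sample: y is the dependent variable (1 = claim occurred).
samples = [{"id": i, "y": y} for i, y in enumerate([0, 0, 0, 1, 1, 1], start=1)]
overdue = {
    "name": "overdue",
    "variables": ["overdue_flag"],
    "rows": [{"id": i, "overdue_flag": "yes" if i > 3 else "no"} for i in range(1, 7)],
}
shopping = {
    "name": "shopping",
    "variables": ["pref"],
    "rows": [{"id": 1, "pref": "a"}, {"id": 2, "pref": "b"},
             {"id": 4, "pref": "a"}, {"id": 5, "pref": "b"}],  # ids 3 and 6 absent
}
ranked = rank_libraries(samples, [shopping, overdue])
for s in ranked:
    print(s)
```

In this toy data the overdue tags separate the two outcome classes completely while the shopping tags do not, so the overdue library ranks first; the shopping library also has no values for two of the six samples, which lowers its coverage-based weight. A real deployment would compute these statistics under a privacy-preserving protocol rather than on plaintext rows.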
CN202310560865.2A 2023-05-17 2023-05-17 Method and device for dynamically adjusting accident risk model database based on privacy calculation Pending CN116522102A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310560865.2A CN116522102A (en) 2023-05-17 2023-05-17 Method and device for dynamically adjusting accident risk model database based on privacy calculation

Publications (1)

Publication Number Publication Date
CN116522102A true CN116522102A (en) 2023-08-01

Family

ID=87406348

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310560865.2A Pending CN116522102A (en) 2023-05-17 2023-05-17 Method and device for dynamically adjusting accident risk model database based on privacy calculation

Country Status (1)

Country Link
CN (1) CN116522102A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104834983A (en) * 2014-12-25 2015-08-12 平安科技(深圳)有限公司 Business data processing method and device
CN111144738A (en) * 2019-12-24 2020-05-12 太平金融科技服务(上海)有限公司 Information processing method, information processing device, computer equipment and storage medium
US20220255764A1 (en) * 2021-02-06 2022-08-11 SoterOne, Inc. Federated learning platform and machine learning framework
CN115292750A (en) * 2022-08-24 2022-11-04 上海阵方科技有限公司 Privacy logistic regression method and system applied to financial scene
CN116049909A (en) * 2023-01-28 2023-05-02 腾讯科技(深圳)有限公司 Feature screening method, device, equipment and storage medium in federal feature engineering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230801
