JP7713428B2

JP7713428B2 - Examination device, method and program

Info

Publication number: JP7713428B2
Application number: JP2022100485A
Authority: JP
Inventors: 卓志梅田; アシュリージェーン; 麻里子河崎; 智彦山下; 垠呉
Original assignee: Rakuten Group Inc
Current assignee: Rakuten Group Inc
Priority date: 2022-06-22
Filing date: 2022-06-22
Publication date: 2025-07-25
Anticipated expiration: 2042-06-22
Also published as: JP2024001669A; TW202401337A

Description

The present disclosure relates to technology for assisting in the screening of users.

Conventionally, a loan information providing device has been proposed that receives notification of a store's transaction information and the like from a terminal device, inputs that information into a trained model to obtain a classification result, calculates a loan score based on the classification result, generates loan information based on the loan score and the loan condition information of each financial institution, and notifies the terminal device of each financial institution's loan information together with recommendation information to be conveyed to the business operator (see Patent Document 1).

Patent Document 1: JP 2020-160949 A

Conventionally, in screening users for loans, credit transactions, and the like, determining a user's credit score with a machine learning model has been regarded as important for making screening appropriate. A technology has also been proposed that selects, as recommendation information, a combination of model input data that yields a high loan score (see Patent Document 1). However, although the conventionally proposed technologies are effective to a certain extent in making screening appropriate, there is room for improvement in properly screening users for whom insufficient attribute data, such as credit information, has been accumulated.

In view of the above problem, an object of the present disclosure is to provide proper screening even for users for whom insufficient attribute data has been accumulated.

An example of the present disclosure is a screening device including: a first score acquisition means for acquiring a first score for a user based on an output obtained by inputting input data related to the user into a first machine learning model; a second score acquisition means for acquiring a second score for the user based on an output obtained by inputting input data related to the user into a second machine learning model; a user segment identification means for identifying a user segment to which a target user belongs by performing segmentation of a user group including the target user; and a screening result determination means for determining a screening result for the target user based on the first score and/or the second score of the target user, according to the user segment to which the target user belongs.

The present disclosure can be understood as a screening device, a system, a method executed by a computer, or a program to be executed by a computer. The present disclosure can also be understood as such a program recorded on a recording medium readable by a computer or by another device, machine, or the like. Here, a recording medium readable by a computer or the like refers to a recording medium that stores information such as data and programs through electrical, magnetic, optical, mechanical, or chemical action and from which that information can be read by a computer or the like.

According to the present disclosure, proper screening can be provided even for users for whom insufficient attribute data has been accumulated.

FIG. 1 is a schematic diagram showing the configuration of an information processing system according to an embodiment.
FIG. 2 is a diagram showing an outline of the functional configuration of a screening device according to the embodiment.
FIG. 3 is a diagram showing an overview of the screening process in the embodiment.
FIG. 4 is a simplified diagram illustrating the concept of a decision tree of a machine learning model employed in the embodiment.
FIG. 5 is a flowchart showing the flow of a machine learning process according to the embodiment.
FIG. 6 is a flowchart showing the flow of a screening process according to the embodiment.
FIG. 7 is a diagram showing an outline of the functional configuration of a screening device according to a variation.

Embodiments of the screening device, method, and program according to the present disclosure are described below with reference to the drawings. However, the embodiments described below are merely examples, and the screening device, method, and program according to the present disclosure are not limited to the specific configurations described below. In implementation, a specific configuration suited to the mode of implementation may be adopted as appropriate, and various improvements and modifications may be made. In the technology according to the present disclosure, at least part of the configurations of the embodiment and the variations described later may be combined with one another as appropriate.

This embodiment describes a case in which the technology according to the present disclosure is implemented in a system that provides loan screening for users. However, the technology according to the present disclosure can be used widely as technology for supporting any kind of screening of users, and the scope of application of the present disclosure is not limited to the examples shown in the embodiment.

<System Configuration>
FIG. 1 is a schematic diagram showing the configuration of an information processing system according to this embodiment. In the information processing system according to this embodiment, a screening device 1 and one or more service providing systems 5 are communicably connected to each other. A user is a person who uses a service provided by a service providing system 5, and receives the service by accessing the service providing system 5 from a user terminal.

The screening device 1 is a computer including a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a storage device 14 such as an EEPROM (Electrically Erasable and Programmable Read Only Memory) or an HDD (Hard Disk Drive), a communication unit 15 such as a NIC (Network Interface Card), and the like. However, elements of the specific hardware configuration of the screening device 1 may be omitted, replaced, or added as appropriate depending on the mode of implementation. Furthermore, the screening device 1 is not limited to a device housed in a single enclosure. The screening device 1 may be realized by a plurality of devices using so-called cloud or distributed computing technology or the like.

For each user, the screening device 1 screens transactions related to that user and provides the screening result to the service providing system 5. Based on the result of screening a given user (hereinafter, the "target user") provided by the screening device 1, the service providing system 5 can decide whether or not to provide a service such as a loan to that user (that is, whether or not to conduct the transaction).

The service providing system 5 is a computer including a CPU, a ROM, a RAM, a storage device, a communication unit, an input device, an output device, and the like (not shown). None of these systems and terminals is limited to a device housed in a single enclosure. These systems and terminals may be realized by a plurality of devices using so-called cloud or distributed computing technology or the like.

The services provided by the service providing system 5 are, for example, a loan screening service, a credit card/deferred payment service, an online shopping service, an online reservation service, an electronic money payment service, an operation center service, or a map information service. Note that "deferred payment" is not limited to services referred to as Buy Now Pay Later (BNPL) and may include any purchase of goods or services by deferred payment.

The services provided by the service providing system 5 are not limited to the examples in this embodiment. When providing a service, the service providing system 5 notifies the screening device 1 of the user's attribute data. Here, the user's attribute data includes data on the user's service usage history. The content of the usage history data varies with the service and may include, for example, history data of the user's location information, payment history data for credit card or deferred payment usage amounts, electronic money usage history data, transaction history data (including purchase history data for products and the like), reservation history data, and history data of operations directed at the user from an operation center.

Here, the attribute data includes data expressed in a format such as a score (for example, a continuous value from 0 to 1 inclusive) or a label (for example, a binary value indicating presence/absence or yes/no). However, the format of the attribute data is not limited to the examples in the present disclosure. The attribute data may also include, for example, the usage status of online services, the usage status of electronic value including points, and the cancellation rate in online services, and may further include data indicating default (non-performance of debt) in deferred payment.
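The two data formats mentioned above can be sketched as a small record type. This is only an illustrative sketch: the class name `AttributeDatum`, its fields, and the sample attribute names are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class AttributeDatum:
    """One user attribute, held either as a score or as a binary label.

    Hypothetical structure for illustration only.
    """
    name: str
    value: Union[float, bool]

    def __post_init__(self) -> None:
        # Scores are assumed to lie in [0, 1]; labels are plain booleans.
        if not isinstance(self.value, bool) and not 0.0 <= self.value <= 1.0:
            raise ValueError(f"score for {self.name!r} must be within [0, 1]")

    @property
    def is_label(self) -> bool:
        """True when the attribute is a binary label rather than a score."""
        return isinstance(self.value, bool)


# A score-type attribute and a label-type attribute (sample values are made up).
cancellation_rate = AttributeDatum("online_cancellation_rate", 0.12)
deferred_payment_default = AttributeDatum("deferred_payment_default", False)
```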

The attribute data may include factual attribute data and inferred attribute data. Here, factual attribute data is attribute data that can be confirmed as fact for a user, based on user-provided data supplied by the user or on history data collected about the user. Inferred attribute data is attribute data obtained by inference, for example by inputting user-provided data, history data, factual attribute data, and the like into a VAE (Variational Autoencoder).

The VAE used here feeds the latent vector output by the encoder in its first half into the decoder in its second half, thereby expressing the values input to the encoder (here, training data including factual attribute data) in a different form. Examples of user-provided data include registration data registered by the user, such as name, email address, telephone number, address, place of employment, and school, as well as data obtained from the user's own answers to questionnaires and the like. Examples of history data include the usage history data, mentioned above, of the electronic commerce service provided by the service providing system 5. The factual attribute data is preferably the above user-provided data and history data converted into a data format suitable for marketing and/or analysis purposes. For example, factual attribute data obtainable from usage history data includes the genres, categories, and brands of products or services the user frequently uses, as well as the commercial districts, recreational areas, and tourist spots the user frequently visits. That is, in this embodiment, the attribute data includes the user's personality, behavioral tendencies, user persona, and the like estimated or predicted using machine learning technology.

In this embodiment, a weight is set for each item of attribute data. The weight indicates how strongly the attribute data correlates with the score when the attribute data is used to calculate the score. Each time the machine learning unit 27, described later, evaluates the appropriateness of a score, the parameters of the model are adjusted so that the score takes a more appropriate value. The weight corresponding to each item of attribute data corresponds, for example, to the weight of each node (each regression tree) in a model for score calculation, such as the decision tree model described later, and is determined as appropriate in the course of calculating the score. The score is determined, for example, based on the weights of the nodes.

Here, the attribute data group may include demographic attributes, behavioral attributes, or psychographic attributes. Demographic attributes are, for example, the user's gender, family structure, and age. Behavioral attributes may be based on service usage history data and are, for example, whether cash advances are used, whether revolving payments are used, the deposit and withdrawal history of a given account, the commercial transaction history for goods or services including gambling or lotteries (which may include online transaction history in an online marketplace or the like), and the user's movement history obtained from location or place information. Psychographic attributes are, for example, preferences regarding gambling or lotteries. However, the user attributes that can be used are not limited to the examples in this embodiment. For example, "time required for an operation (such as a phone call)" from an operation center service or the like and "credit card usage amount/deferred payment usage amount" may also be used as attributes. Attributes similar to demographic attributes may be attributes inferred from attributes grounded in user-provided data or history data. Likewise, attributes similar to behavioral attributes may be attributes inferred from attributes grounded in user-provided data or history data. Psychographic attributes may be attributes grounded in user-provided data, which includes, as one example, the results of the user's expressed intentions.

FIG. 2 is a diagram showing an outline of the functional configuration of the screening device 1 according to this embodiment. A program recorded in the storage device 14 is read into the RAM 13 and executed by the CPU 11 to control each piece of hardware in the screening device 1, whereby the screening device 1 functions as a screening device including an attribute determination unit 21, a first score acquisition unit 22, a second score acquisition unit 23, a user segment identification unit 24, a correlation determination unit 25, a screening result determination unit 26, and a machine learning unit 27. In this embodiment and the variations described later, each function of the screening device is executed by the CPU 11, which is a general-purpose processor, but some or all of these functions may be executed by one or more dedicated processors.

The attribute determination unit 21 determines factual attribute data, which can be confirmed as fact for a user, based on user-provided data supplied by the user and/or the user's history data. In this embodiment, the attribute determination unit 21 determines the factual attribute data for the user by methods such as aggregating the user-provided data and/or history data, determining the applicable attribute by referring to other data such as a map, or using the user-provided data and/or history data as is. Although this embodiment determines a user's factual attribute data from user-provided data and/or the user's history data, the factual attribute data may be obtained by other methods.
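As one possible illustration of determining a factual attribute by aggregating history data, the sketch below derives a "most frequently purchased category" attribute from a purchase history. The function name `most_frequent_category` and the sample history are hypothetical; the disclosure does not prescribe this particular aggregation.

```python
from collections import Counter
from typing import Iterable, Optional


def most_frequent_category(purchase_categories: Iterable[str]) -> Optional[str]:
    """Return the category that appears most often, or None for an empty history."""
    counts = Counter(purchase_categories)
    if not counts:
        return None
    # most_common(1) returns a one-element list of (category, count).
    return counts.most_common(1)[0][0]


# Hypothetical purchase history for one user.
history = ["books", "groceries", "books", "travel", "books"]
favorite = most_frequent_category(history)
```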

The attribute determination unit 21 also determines inferred attribute data for a user based on input data that includes at least one or more items of factual attribute data determined for that user. In this embodiment, the attribute determination unit 21 may determine the inferred attribute data based on an output value obtained by inputting the input data, including one or more items of the user's factual attribute data, into an attribute estimation model, which is a machine learning model. In this embodiment, the output value of the attribute estimation model indicates the probability that the user has a given attribute, and the attribute determination unit 21 determines that the user has the attribute when the output value falls within a predetermined range. When the user is determined to have the attribute, the attribute determination unit 21 sets the label of the attribute data inferred for the user to a value indicating the presence or absence of the attribute or the type of the attribute. The attribute data may also be expressed as a score rather than a label. In that case, the attribute determination unit 21 sets the score of the inferred attribute data to a value indicating the degree (probability) to which the inferred attribute applies. That degree may be the output value of the attribute estimation model.
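The label and score variants described above can be sketched as follows. The accepted range of 0.5 to 1.0 is an assumed placeholder for the "predetermined range", and both function names are hypothetical.

```python
from typing import Tuple


def infer_label(model_output: float,
                accepted_range: Tuple[float, float] = (0.5, 1.0)) -> bool:
    """Label variant: the user is treated as having the attribute when the
    model output lies within the (assumed) predetermined range."""
    low, high = accepted_range
    return low <= model_output <= high


def infer_score(model_output: float) -> float:
    """Score variant: the model output itself is used as the degree to which
    the attribute applies."""
    return model_output


# Hypothetical attribute estimation model outputs for two users.
label_a = infer_label(0.82)
label_b = infer_label(0.31)
```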

The first score acquisition unit 22 acquires a first score for a user based on an output (for example, a score normalized so that 0 is the minimum and 1 is the maximum) obtained by inputting input data related to the user into a first machine learning model. More specifically, the first score acquisition unit 22 acquires the user's first score based on the output obtained by inputting the user's input data into a first machine learning model generated by learning the relationship between user attributes and user creditworthiness in first-category transactions (transactions of a first type). In the present disclosure, a first-category transaction is a transaction that runs for at least a predetermined period and/or involves at least a predetermined amount. That is, examples of first-category transactions include relatively long-term and/or high-value transactions, such as long-term loans or high-value loans.

In this embodiment, the first machine learning model is generated and/or updated using training data whose input values are attribute data including the status of other transactions of the same type as the first-category transaction (for example, other long-term or high-value loans) and whose output value is a value indicating the user's creditworthiness (the first score). The first machine learning model is a machine learning model for screening that outputs, for a first-category transaction, a screening result corresponding to whether a loan may be granted or to the probability of default. For this reason, the types of input data used in existing high-value loan screening models may be adopted for the first machine learning model as appropriate.

The input data to the first machine learning model includes an attribute data group containing the factual attribute data and/or inferred attribute data described above. The input data includes, for example, annual income, industry of employment, date of birth, living situation (length of residence (for example, in months) and length of employment (for example, in months)), and the borrowing and repayment status (of other loans already contracted and the like).

The second score acquisition unit 23 acquires a second score for a user based on an output (for example, a score normalized so that 0 is the minimum and 1 is the maximum) obtained by inputting input data related to the user into a second machine learning model. More specifically, the second score acquisition unit 23 acquires the user's second score based on the output obtained by inputting the user's input data into a second machine learning model generated by learning the relationship between user attributes and user risk in second-category transactions (transactions of a second type), which differ from first-category transactions. In the present disclosure, a second-category transaction is a transaction that runs for a shorter period and/or involves a smaller amount than a first-category transaction. That is, examples of second-category transactions include relatively short-term and/or low-value transactions, such as deferred payment, credit card use, short-term loans, or small loans.

In this embodiment, the second machine learning model is generated and/or updated using training data whose input values are attribute data including the status of other transactions of the same type as the second-category transaction (for example, other deferred payments, credit card use, short-term loans, or small loans) and whose output value is a value (the second score) indicating a risk based on the probability that the user will default without paying (that is, that the debt will not be collected from the user). The way risk is expressed is not limited, and various indicators may be adopted; for example, risk can be expressed as the probability of default. Indicators other than the probability of default may also be adopted. For example, risk may be divided into classes (ranks), and the class (rank) to which the user belongs may be used as the indicator. That is, the second machine learning model is a machine learning model for screening that outputs, for example, a score that varies with the risk that the target user will not properly settle a deferred payment when purchasing goods and/or services by deferred payment, which is a second-category transaction, or a score related to the probability of default when the target user purchases goods and/or services by credit card payment, which is likewise a second-category transaction.
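To illustrate the option of expressing risk as a class (rank) rather than as a raw default probability, the sketch below maps a probability onto three ranks. The rank names and the boundaries 0.05 and 0.20 are assumptions chosen for illustration; the disclosure does not specify them.

```python
def risk_rank(default_probability: float) -> str:
    """Map a default probability onto a coarse risk rank.

    The boundaries (0.05 and 0.20) and rank names are hypothetical.
    """
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must be within [0, 1]")
    if default_probability < 0.05:
        return "A"  # low risk
    if default_probability < 0.20:
        return "B"  # medium risk
    return "C"      # high risk


ranks = [risk_rank(p) for p in (0.01, 0.10, 0.50)]
```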

For this reason, the types of input data used in screening models for small loans, such as credit card screening, may be adopted for the second machine learning model as appropriate. A BNPL (Buy Now Pay Later, deferred payment) screening model may also be adopted as the second machine learning model. When a BNPL screening model is adopted as the second machine learning model, some form of credit score or user attribute may be used in determining the final screening result in the present disclosure.

As with the first machine learning model, an attribute data group containing factual attribute data and/or inferred attribute data can be adopted as appropriate as input data for the second machine learning model. The input data includes, for example, annual income, living situation (industry of employment, working style (such as whether remote work is possible)), and credit card usage status.

The user segment identification unit 24 identifies the user segment to which the target user belongs by performing segmentation of a user group that includes the target user. Here, segmentation is performed by referring to the attribute data of the users in the user group and placing users who share attribute data in the same user segment. For example, users with the same annual income, industry of employment, gender, and age group are placed in the same user segment. The user attributes referred to in segmentation are not limited, but using attributes common to the input data of the first machine learning model and the input data of the second machine learning model (in other words, attributes used as input data by both models) makes it easier to perform the correlation determination described later. Segmentation may be rule-based, or it may be inference-based using a machine learning model. Segmentation may also be performed by clustering the user group with a general clustering method and referring to the clustering result.
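A minimal sketch of the rule-based variant, assuming each user is described by a dictionary of already-banded attributes. The key names in `SEGMENT_KEYS` and the sample users are hypothetical; the sketch simply groups users whose values for those keys all match.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Attributes assumed to be shared by both models' input data (hypothetical names).
SEGMENT_KEYS = ("income_band", "industry", "gender", "age_band")

SegmentKey = Tuple[str, ...]


def segment_key(attributes: Dict[str, str]) -> SegmentKey:
    """Build the segment key for one user from the shared attributes."""
    return tuple(attributes[key] for key in SEGMENT_KEYS)


def segment_users(users: Dict[str, Dict[str, str]]) -> Dict[SegmentKey, List[str]]:
    """Group user IDs so that users with identical shared attributes share a segment."""
    segments: Dict[SegmentKey, List[str]] = defaultdict(list)
    for user_id, attributes in users.items():
        segments[segment_key(attributes)].append(user_id)
    return dict(segments)


# Hypothetical user group; "u3" differs from the others only in age band.
users = {
    "u1": {"income_band": "4-6M", "industry": "retail", "gender": "M", "age_band": "50-69"},
    "u2": {"income_band": "4-6M", "industry": "retail", "gender": "M", "age_band": "50-69"},
    "u3": {"income_band": "4-6M", "industry": "retail", "gender": "M", "age_band": "20-39"},
}
segments = segment_users(users)
target_segment = segment_key(users["u1"])
```

The segment of a target user is then found by computing `segment_key` for that user and looking it up in the returned mapping.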

The correlation determination unit 25 determines, based on the strength of the correlation between the first score and the second score in a user segment, whether the user segment is a first user segment, in which the correlation between the first score and the second score is equal to or greater than a predetermined criterion, or a second user segment, in which the correlation is less than the predetermined criterion (in the present disclosure, this includes the case of no correlation). That is, according to the predetermined criterion, the correlation determination unit 25 determines whether the user segment to which the user belongs is one in which the first score and the second score are strongly correlated or one in which they are weakly correlated. More specifically, for example, for any user segment, the correlation determination unit 25 can calculate the difference between a statistic (for example, the mean or median) of first scores calculated in the past and a statistic (for example, the mean or median) of second scores calculated in the past, determine that the user segment is a first user segment when the difference is equal to or greater than a predetermined criterion, and determine that it is a second user segment when the difference is less than the predetermined criterion. However, the specific means for determining the correlation between the first score and the second score in a user segment is not limited to the examples in the present disclosure.
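The example determination above can be sketched as follows. This is a minimal illustration that follows the text as written (a difference at or above the criterion yields a first user segment). Taking the absolute value of the difference, the function name, and the label strings are assumptions made for the sketch.

```python
from statistics import mean, median

FIRST_SEGMENT = "first"    # correlation at or above the criterion
SECOND_SEGMENT = "second"  # correlation below the criterion, or none


def classify_segment(first_scores, second_scores, criterion, stat=mean):
    """Classify a segment from past scores using a difference of statistics.

    first_scores / second_scores: scores calculated in the past for the segment.
    stat: the statistic to compare, e.g. statistics.mean or statistics.median.
    """
    # Assumption: the "difference" is taken as an absolute value.
    difference = abs(stat(first_scores) - stat(second_scores))
    return FIRST_SEGMENT if difference >= criterion else SECOND_SEGMENT
```

For example, `classify_segment(past_first, past_second, criterion=0.3, stat=median)` would use the median instead of the mean. Because the disclosure leaves the determination means open, a different measure could be substituted without changing the callers.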

For example, when the user group constituting a segment consists of men in their 50s and 60s, the level of the first score, which is for example a score based on credit information determined by a credit information agency, and the level of the second score, which is for example a score based on the output of a screening model for small payments, are assumed to be strongly correlated (correlated). In contrast, when the user group constituting a segment consists of men in their 20s and 30s, the first score and the second score are assumed to be weakly correlated or uncorrelated (NOT correlated). This is because, for younger users such as those in their 20s and 30s, accumulated credit information is insufficient and the first score therefore tends to be insufficiently accurate, whereas the second score, produced by a screening model for a use that younger users employ frequently, such as small payments, tends to be sufficiently accurate.

The screening result determination unit 26 determines the screening result of the target user based on the first score and/or the second score of the target user, depending on the user segment to which the target user belongs. In this embodiment, when the user segment to which the target user belongs is determined to be a first user segment, the screening result determination unit 26 determines the screening result based on the first score for transactions in the first category and based on the second score for transactions in the second category. That is, in this embodiment, for a user belonging to a first user segment, in which the first score and the second score are strongly correlated, if the first score is a value that counts against the user in screening, the final screening result is negative regardless of the second score (for example, the loan is not granted).

On the other hand, when the user segment to which the target user belongs is determined to be a second user segment, the screening result determination unit 26 determines the screening result of the target user based on whichever of the first score and the second score indicates the higher creditworthiness or the lower risk. In other words, in this embodiment, when a user belongs to a second user segment, in which the first score and the second score are weakly correlated or uncorrelated, the final screening result is positive (the result is determined in the user's favor) even if the first score is a value that counts against the user, provided that the second score is a value from which the user's credit can be judged sufficient. This means that even when the first score is high and the user appears on the surface to lack credit, the screening result is positive if the more reliable score (the second score) indicates that the user is creditworthy.
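The segment-dependent decision logic of the two preceding paragraphs can be sketched as follows. The sketch assumes that both scores lie in [0, 1], that the first score expresses creditworthiness (higher is better), and that the second score expresses risk (higher is worse), so that the risk is converted with `1 - risk` before comparison. The approval threshold and all names are hypothetical.

```python
def to_credit(second_score):
    """Convert a risk score in [0, 1] into a creditworthiness-like value."""
    return 1.0 - second_score


def decide(segment, category, first_score, second_score, threshold=0.5):
    """Return True (approve) or False (reject) for the target user.

    segment:  "first" (strong correlation) or "second" (weak or none)
    category: "first" or "second" transaction category
    """
    credit_from_second = to_credit(second_score)
    if segment == "first":
        # First user segment: use the score that matches the category.
        basis = first_score if category == "first" else credit_from_second
    else:
        # Second user segment: use whichever score favors the user more.
        basis = max(first_score, credit_from_second)
    return basis >= threshold
```

With a low first score and a low risk score, a first-category transaction is rejected for a user in a first user segment but approved for a user in a second user segment, which is the behavior the text describes.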

Figure 3 is a diagram showing an overview of the screening process in this embodiment. In this embodiment, when a target user is screened, the user's loan screening result is determined based on the first score, output by a machine learning model for screening, and the second score, output by a machine learning model for another purpose. In doing so, the logic by which the screening result determination unit 26 determines the screening result is selected according to the strength of the correlation between the first score and the second score in each user segment, and the screening result is output.

However, the specific method of determining the screening result of the target user based on the first score and/or the second score is not limited to what is disclosed in this embodiment. For example, the screening result determination unit 26 may determine the screening result based on a third score calculated from the first score and the second score. More specifically, for example, when the user segment to which the target user belongs is determined to be a first user segment, the screening result determination unit 26 may determine the screening result, for transactions in the first category, based on a third score calculated with a large weight on the first score and a small weight on the second score, and, for transactions in the second category, based on a third score calculated with a small weight on the first score and a large weight on the second score. Likewise, for example, when the user segment is determined to be a second user segment, the screening result determination unit 26 may determine the screening result based on a third score calculated with a large weight on whichever of the first score and the second score indicates the higher creditworthiness or the lower risk and a small weight on the other score.
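One possible way to compute such a third score is sketched below. The weight values 0.8 and 0.2 are hypothetical, since the text only says "large" and "small". The sketch also assumes scores in [0, 1] and converts the second (risk) score with `1 - risk` so that both terms point in the same direction.

```python
def third_score(segment, category, first_score, second_score,
                major=0.8, minor=0.2):
    """Weighted combination of the first score and the converted second score."""
    credit_from_second = 1.0 - second_score  # assumption: risk -> credit
    if segment == "first":
        # Weight the score that matches the transaction category more heavily.
        first_is_major = (category == "first")
    else:
        # Weight the more favorable score more heavily.
        first_is_major = (first_score >= credit_from_second)
    w1, w2 = (major, minor) if first_is_major else (minor, major)
    return w1 * first_score + w2 * credit_from_second
```

The resulting value would then be compared with an approval criterion in the same way as a single score.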

The machine learning unit 27 generates and/or updates the machine learning model used by the first score acquisition unit 22 to acquire the first score and the machine learning model used by the second score acquisition unit 23 to acquire the second score. The machine learning model for acquiring the first score receives data on one or more user attributes of the target user and outputs a first score indicating the user's creditworthiness. The machine learning model for acquiring the second score receives data on one or more user attributes of the target user and outputs a second score indicating a degree of risk based on the probability that the user will default by failing to make payment.

To generate and/or update the machine learning model for acquiring the first score, the machine learning unit 27 creates, for each user, teacher data in which the user's attribute data group is defined as the input value and the user's score is defined as the output value. The machine learning unit 27 then generates and/or updates the first machine learning model based on this teacher data. As described above, the attribute data group input to the first machine learning model includes factual attribute data and estimated attribute data; it is combined with the corresponding user's score and input to the machine learning unit 27 as teacher data. The score set in the teacher data may be a score determined on a rule basis or a score set manually (annotated). It may also be a score that was output by the first machine learning model in the past and subsequently corrected by an administrator or the like.

To generate and/or update the machine learning model for acquiring the second score, the machine learning unit 27 creates the machine learning model based on teacher data in which, for each user attribute, a statistic of the default rate of the multiple users having that attribute (the mean in this embodiment, although another statistical indicator such as the mode or median may be used) is defined as the second score indicating the degree of risk of a user having that attribute. The calculated second score is combined with the attribute data group of the corresponding user, including factual attribute data and estimated attribute data, and input to the machine learning unit 27 as teacher data.
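The construction of teacher data for the second score can be illustrated as follows. This is a sketch under stated assumptions: each past record carries a 0/1 `defaulted` flag, the mean default rate is computed per value of a single attribute, and the record layout and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def build_second_score_teacher_data(records, attribute):
    """Pair each user's attributes with the mean default rate of that
    user's attribute group, used here as the second score (risk) label.

    records: dicts holding attribute values and a 0/1 "defaulted" flag.
    """
    outcomes = defaultdict(list)
    for record in records:
        outcomes[record[attribute]].append(record["defaulted"])
    rate_by_value = {value: mean(flags) for value, flags in outcomes.items()}

    teacher_data = []
    for record in records:
        # Keep the outcome flag out of the model inputs.
        features = {k: v for k, v in record.items() if k != "defaulted"}
        teacher_data.append((features, rate_by_value[record[attribute]]))
    return teacher_data
```

Replacing `mean` with `statistics.median` or `statistics.mode` gives the alternative statistics mentioned in the text.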

The machine learning model generation/update framework that can be adopted for the first machine learning model, the second machine learning model, and so on when implementing the technology of the present disclosure is based, as an example, on an ensemble learning algorithm. For instance, a machine learning framework based on gradient boosting decision trees (GBDT), such as LightGBM, may be adopted. In other words, the framework may be one based on a decision tree model in which the error between the correct value and the predicted value is passed from each weak learner (weak classifier) to the next. The predicted value here is, as an example, the predicted value of the score. Besides LightGBM, the framework may adopt other boosting methods such as XGBoost or CatBoost. Compared with a framework that uses neural networks, a framework that uses decision trees can generate/update a machine learning model with relatively high performance with less parameter tuning effort. However, the machine learning model generation/update framework that can be adopted when implementing the technology of the present disclosure is not limited to the examples in this embodiment. For example, another learner such as a random forest may be adopted in place of gradient boosting decision trees, or a learner that is not regarded as a weak learner, such as a neural network, may be adopted. In particular, when a learner that is not regarded as a weak learner, such as a neural network, is adopted, ensemble learning need not be used.
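To make the idea of passing the error from one weak learner to the next concrete, the following is a deliberately small gradient boosting sketch written in plain Python. It is not LightGBM, XGBoost, or CatBoost and does not reproduce their algorithms: it uses one numeric feature, one-split trees (stumps), squared error, and hypothetical sample data.

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree (stump) to the residuals by least squares."""
    best = None
    for t in sorted(set(xs))[:-1]:  # candidate thresholds
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = (sum((r - lv) ** 2 for r in left)
               + sum((r - rv) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:  # all feature values identical: predict the mean residual
        m = sum(residuals) / len(residuals)
        return xs[0], m, m
    return best[1:]


def fit_gbdt(xs, ys, n_rounds=20, learning_rate=0.5):
    """Each round fits a stump to the errors left by the previous rounds."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # error handed on
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the scaled outputs of all stumps."""
    base, lr, stumps = model
    return base + sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps)


# Hypothetical data: age as the only attribute, a score as the target.
ages = [22, 25, 31, 45, 52, 58]
scores = [0.30, 0.35, 0.40, 0.70, 0.80, 0.85]
model = fit_gbdt(ages, scores)
```

In practice a library such as LightGBM would be used instead; the sketch only shows why each additional weak learner reduces the remaining error on the training data.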

Figure 4 is a simplified diagram of the concept of a decision tree in the machine learning model adopted in this embodiment. When a gradient boosting machine learning framework based on a decision tree algorithm is adopted, the branching condition of each node of the decision tree is optimized. Specifically, in such a framework, a score is calculated for each of the user groups having the attributes indicated by the two child nodes that branch from one parent node, and the branching condition of the parent node is optimized so that the difference between these scores becomes large (for example, so that the difference is maximized or is equal to or greater than a predetermined threshold), that is, so that the two child nodes separate cleanly. For example, when the attribute used as a node's branching condition is age, the age set as the branching threshold may be changed, or the branching condition may be changed to an attribute other than age. By recursively optimizing the branching conditions of all nodes of the decision tree in this way, the accuracy of estimating the score from the attribute data group can be improved.
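The branching-threshold search described above can be sketched for a single node as follows. The sketch follows the description in the text (maximize the difference between the mean scores of the two child groups) rather than the gain criteria that actual GBDT libraries use. The function name and the sample data are hypothetical.

```python
def best_age_threshold(users):
    """Find the age threshold that best separates the two child nodes.

    users: list of (age, score) pairs.
    Returns (threshold, gap), where the children are age <= threshold and
    age > threshold, and gap is the absolute difference of their mean scores.
    """
    ages = sorted({age for age, _ in users})
    best_threshold, best_gap = None, -1.0
    for t in ages[:-1]:  # the largest age would leave the right child empty
        left = [s for a, s in users if a <= t]
        right = [s for a, s in users if a > t]
        gap = abs(sum(left) / len(left) - sum(right) / len(right))
        if gap > best_gap:
            best_threshold, best_gap = t, gap
    return best_threshold, best_gap


# Hypothetical (age, score) pairs.
sample = [(22, 0.2), (28, 0.3), (35, 0.25), (51, 0.8), (57, 0.9)]
threshold, gap = best_age_threshold(sample)  # threshold is 35, gap is about 0.6
```

Repeating the same search over other attributes and choosing the attribute with the largest gap corresponds to changing the branching condition to an attribute other than age.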

<Processing flow>
Next, the flow of processing executed by the screening device according to this embodiment will be described. The specific content and order of the processing described below are one example for implementing the present disclosure. The specific processing content and order may be selected as appropriate according to the embodiment of the present disclosure.

Figure 5 is a flowchart showing the flow of the machine learning process according to this embodiment. The process shown in this flowchart is executed periodically or at a time specified by an administrator.

In this embodiment, the machine learning process generates and/or updates the first machine learning model and the second machine learning model. The machine learning unit 27 creates teacher data that includes combinations of the attribute data group accumulated in the past for each user and the first score determined in advance for the corresponding user (step S101). The machine learning unit 27 then generates and/or updates the first machine learning model using the created teacher data (step S102). The machine learning unit 27 also creates teacher data that includes combinations of the attribute data group accumulated in the past for each user and the second score determined in advance for the corresponding user (step S103). The machine learning unit 27 then generates and/or updates the second machine learning model using the created teacher data (step S104). The process shown in this flowchart then ends.
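Steps S101 to S104 can be outlined as follows. The trainer `fit_mean_model` is a trivial stand-in that only keeps the sketch runnable; in the embodiment a GBDT framework would take its place. The record layout (`attributes`, `first_score`, `second_score`) and all function names are hypothetical.

```python
from statistics import mean


def fit_mean_model(teacher_data):
    """Stand-in trainer: the returned model always predicts the mean label."""
    average = mean(label for _, label in teacher_data)
    return lambda attributes: average


def run_machine_learning(history, fit=fit_mean_model):
    """history: per-user dicts with "attributes", "first_score", "second_score"."""
    # S101: pair each attribute data group with the predetermined first score.
    first_data = [(h["attributes"], h["first_score"]) for h in history]
    # S102: generate or update the first machine learning model.
    first_model = fit(first_data)
    # S103: pair each attribute data group with the predetermined second score.
    second_data = [(h["attributes"], h["second_score"]) for h in history]
    # S104: generate or update the second machine learning model.
    second_model = fit(second_data)
    return first_model, second_model
```

Passing a different `fit` callable swaps in another training framework without changing the order of the steps.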

Figure 6 is a flowchart showing the flow of the screening process according to this embodiment. The process shown in this flowchart is executed for each target user, periodically or at a specified time.

In steps S201 to S203, the first score and the second score are acquired. The attribute determination unit 21 acquires the attribute data group of the target user (step S201). The first score acquisition unit 22 then inputs, from the attribute data group acquired in step S201, the input data corresponding to the first machine learning model into the first machine learning model and, based on the output value, acquires the first score to be set for the user (step S202). Similarly, the second score acquisition unit 23 inputs, from the attribute data group acquired in step S201, the input data corresponding to the second machine learning model into the second machine learning model and, based on the output value, acquires the second score to be set for the user (step S203). The process then proceeds to step S204.

In steps S204 and S205, the user segment to which the target user belongs is identified, and the correlation between the first score and the second score in that user segment is determined. The user segment identification unit 24 segments a user group that includes the target user and thereby identifies the user segment to which the target user belongs (step S204). Once the user segment is identified, the correlation determination unit 25 determines, based on the strength of the correlation between the first score and the second score in the identified user segment, whether the user segment is a first user segment, in which the correlation is equal to or greater than a predetermined criterion, or a second user segment, in which the correlation is less than the predetermined criterion (step S205). If the user segment is determined to be a first user segment (YES in step S205), the process proceeds to step S206. If the user segment is determined to be a second user segment (NO in step S205), the process proceeds to step S207.

In step S206, the screening result of the target user is determined based on the score for the target category. When the user segment to which the target user belongs has been determined to be a first user segment, the screening result determination unit 26 determines the screening result based on the first score if the screening process is being executed for a transaction in the first category, and based on the second score if it is being executed for a transaction in the second category. The process then proceeds to step S208.

In step S207, the screening result of the target user is determined based on the score indicating the higher creditworthiness or the lower risk. When the user segment to which the target user belongs has been determined in step S205 to be a second user segment, the screening result determination unit 26 determines the screening result based on whichever of the first score and the second score indicates the higher creditworthiness or the lower risk. The process then proceeds to step S208.

In step S208, the screening result is output. The screening result determination unit 26 outputs the screening result determined in step S206 or step S207. The process shown in this flowchart then ends.
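The flow of steps S201 to S208 can be outlined end to end as follows. The models, the segment lookup, and the correlation check are passed in as callables so that the sketch stays self-contained. The [0, 1] score range, the `1 - risk` conversion of the second score, the approval threshold, and all names are assumptions for illustration.

```python
def run_screening(user, first_model, second_model, segment_of,
                  is_first_segment, category, threshold=0.5):
    """Return the screening result for one target user and one transaction."""
    attributes = user["attributes"]               # S201: attribute data group
    first_score = first_model(attributes)         # S202: first score
    second_score = second_model(attributes)       # S203: second score (risk)
    segment = segment_of(attributes)              # S204: user segment
    credit_from_second = 1.0 - second_score       # assumption: risk -> credit
    if is_first_segment(segment):                 # S205: YES
        # S206: use the score that matches the transaction category.
        basis = first_score if category == "first" else credit_from_second
    else:                                         # S205: NO
        # S207: use the more favorable of the two scores.
        basis = max(first_score, credit_from_second)
    # S208: output the screening result.
    return {"segment": segment, "approved": basis >= threshold}
```

A call such as `run_screening(user, m1, m2, segment_of, is_first_segment, "first")` would be repeated for each target user at the scheduled or specified time.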

<Effects>
According to this embodiment, loan screening and the like for any user belonging to a user segment with insufficient accumulated credit information, such as younger users, can be supported with higher accuracy. In addition, according to this embodiment, user screening can be supported more appropriately by taking into account the correlation between different credit scores obtained from different combinations of input data and models.

<Variations>
As a variation of the embodiment described above, the following describes a variation in which the screening device 1b determines the second score based on the attribute data group of the target user. The system configuration of the screening device 1b according to this variation and the examples of user attribute data used in this variation are broadly the same as in the embodiment described above, so their description is omitted (see Figure 1).

Figure 7 is a diagram showing an outline of the functional configuration of the screening device 1b according to the variation. In this variation, the screening device 1b includes a product-related data acquisition unit 28 in addition to the attribute determination unit 21, the first score acquisition unit 22, the second score acquisition unit 23, the user segment identification unit 24, the correlation determination unit 25, the screening result determination unit 26, and the machine learning unit 27 described in the embodiment above.

The product-related data acquisition unit 28 acquires product-related data on the product/service that the target user intends to purchase with a loan for the transaction. As product-related data, the product-related data acquisition unit 28 acquires at least one of attribute data of the product/service and attribute data of the store (including an online store) that sells the product/service. Product-related data may be any information related to the product/service, and the data acquired as product-related data is not limited; examples of product/service attribute data include the price and the product/service category/genre, and examples of store attribute data include the store's price range and the store category/genre. Product-related data may further include the business rules and transaction type associated with the transaction for the target product/service. The business rules and transaction type may include commercial practices related to the product/service, the content of contracts associated with the transaction, and the manner in which the product/service is delivered (such as the shipping method or in-store pickup).

In this variation, the second score acquisition unit 23 acquires, based on the output of the second machine learning model, a second score for determining whether to approve or reject a loan for the target user's purchase of a product and/or service. The second machine learning model may be trained as appropriate for each item of product-related data.

More specifically, when the product-related data of the target product/service is acquired, the second score acquisition unit 23 inputs the attribute data of the target user into the latest second machine learning model generated or updated by the machine learning unit 27 and obtains, as the output value, a second score that varies with the risk that the target user will not properly repay the loan if the target user purchases the target product/service with the loan. The processing after the second score is obtained is the same as that described in the embodiment above, so its description is omitted.
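One possible reading of training the second machine learning model for each item of product-related data is to keep a separate model per product category and select it when the product-related data is acquired. The sketch below illustrates that reading only. The registry layout, the `category` key, the `default` fallback, and the function names are hypothetical and are not stated in the disclosure.

```python
def select_second_model(models_by_category, product, default_key="default"):
    """Pick the second model trained for the product's category, if any."""
    category = product.get("category")
    return models_by_category.get(category, models_by_category[default_key])


def second_score_for_purchase(models_by_category, user_attributes, product):
    """Input the user's attribute data into the selected second model."""
    model = select_second_model(models_by_category, product)
    return model(user_attributes)
```

Here each model is any callable that maps the user's attribute data to a risk score, so the most recently generated or updated model can be registered under its category without changing the calling code.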

<Other variations>
As another variation of the embodiment described above, the screening device 1 may determine the screening result of a user based on the respective outputs (first to n-th scores) of n machine learning models (first to n-th machine learning models, n > 1) that include the first machine learning model and the second machine learning model. The input data to the first to n-th machine learning models may be the input data to the first machine learning model described above or the input data to the second machine learning model.

The first to n-th machine learning models may be generated by learning the relationship between user attributes and user creditworthiness in first to n-th categories of transactions classified according to the period and/or amount of the transaction. The first to n-th machine learning models may also be generated by learning that relationship in first to n-th categories of transactions classified according to other characteristics, not limited to period or amount, such as the interest rate, the presence or absence of collateral, and the repayment method.

The screening device 1 determines the screening result of the target user based on the target user's first to n-th scores, depending on the user segment to which the target user belongs. Depending on that user segment, the screening device 1 may determine the screening result based on any one of the first to n-th scores, or based on weighted first to n-th scores.
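A segment-dependent combination of n scores can be sketched as follows. The sketch assumes that all n scores have already been placed on a common scale on which a higher value is more favorable. The segment names and weight values are hypothetical. A one-hot weight vector selects a single score, which covers the case of relying on any one of the first to n-th scores.

```python
# Hypothetical per-segment weights over n = 3 scores.
SEGMENT_WEIGHTS = {
    "segment_a": [1.0, 0.0, 0.0],  # rely on the first score only
    "segment_b": [0.2, 0.5, 0.3],  # blend all three scores
}


def combine_scores(scores, weights):
    """Weighted average of the first to n-th scores."""
    if len(scores) != len(weights) or sum(weights) <= 0:
        raise ValueError("scores and weights must align; weights must sum to > 0")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def screening_basis(segment, scores, table=SEGMENT_WEIGHTS):
    """Combine the scores with the weights registered for the user's segment."""
    return combine_scores(scores, table[segment])
```

The value returned by `screening_basis` would then be compared with an approval criterion to obtain the screening result.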

1 Screening device

Claims

a first score acquisition means for acquiring a first score of the user based on an output indicating the user's creditworthiness, the output being obtained by inputting input data including the user's attribute data into a first machine learning model generated by learning the relationship between the user's attributes in a first type of transaction and the user's creditworthiness;
a second score acquisition means for acquiring a second score for the user based on an output indicating the risk of the user, the output being obtained by inputting input data including the user's attribute data into a second machine learning model generated by learning the relationship between the user's attributes and the risk of the user in a second type of transaction different from the first type of transaction;
a user segment identification means for identifying a user segment to which a target user belongs by performing segmentation of a user group including the target user;
a correlation determination means for determining whether the user segment is a first user segment having a correlation between the first score and the second score that is equal to or greater than a predetermined standard, or a second user segment having a correlation between the first score and the second score that is less than a predetermined standard, based on the strength of the correlation between the first score and the second score in the user segment;
a screening result determination means for determining a screening result of the target user based on the score indicating a higher creditworthiness or a lower risk among the first score and the second score when the user segment to which the target user belongs is determined to be the second user segment;
An examination device comprising:

The first score acquisition means acquires a first score for the user based on an output obtained by inputting the input data related to the user into a first machine learning model, which is a machine learning model for screening the first type of transaction related to a predetermined period or more and/or a predetermined amount or more;
The second score acquisition means acquires a second score for the user based on an output obtained by inputting the input data related to the user into a second machine learning model, which is a machine learning model for screening the second type of transaction related to a shorter period and/or a smaller amount than the first type of transaction.
The examination device according to claim 1.

The first score acquisition means acquires a first score for the user by using the first machine learning model generated and/or updated using user attribute data including the status of other transactions of the same type as the first type of transaction and teacher data including a value indicating the creditworthiness of the user.
The examination device according to claim 2.

The second score acquisition means acquires a second score for the user by using the second machine learning model generated and/or updated using user attribute data including statuses of other transactions of the same type as the second type of transaction and teacher data including a value indicating a risk related to the user.
The examination device according to claim 2.

The first score acquisition means estimates the first score using the first machine learning model generated and/or updated using a machine learning framework based on a gradient boosting decision tree;
The second score acquisition means estimates the second score using the second machine learning model generated and/or updated using a machine learning framework based on a gradient boosting decision tree.
The examination device according to claim 1.

The correlation determination means determines that the user segment is the first user segment when a difference between a statistical value of first scores calculated in the past and a statistical value of second scores calculated in the past is equal to or greater than a predetermined criterion, and determines that the user segment is the second user segment when the difference is less than the predetermined criterion.
The examination device according to claim 1.
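The determination recited in the preceding claim can be illustrated with a minimal Python sketch. It is not the claimed implementation: the use of the arithmetic mean as the "statistical value", the use of an absolute difference, and the name `classify_segment_by_difference` are all assumptions made for illustration, since the claim specifies none of them.

```python
from statistics import mean

FIRST_SEGMENT = "first"
SECOND_SEGMENT = "second"


def classify_segment_by_difference(past_first_scores, past_second_scores, criterion):
    """Label a user segment from the gap between two score statistics.

    Assumptions (not stated in the claim): the statistical value is the
    arithmetic mean, and the difference is taken as an absolute value.
    """
    difference = abs(mean(past_first_scores) - mean(past_second_scores))
    # As recited: difference >= criterion -> first user segment,
    # difference < criterion -> second user segment.
    return FIRST_SEGMENT if difference >= criterion else SECOND_SEGMENT
```

Another statistic, such as the median, could be substituted for `mean` without changing the structure of the comparison.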

When it is determined that the user segment to which the target user belongs is the first user segment, the examination result determination means determines the examination result of the target user for the first type of transaction based on the first score.
The examination device according to claim 1.

When it is determined that the user segment to which the target user belongs is the first user segment, the examination result determination means determines the examination result of the target user for the second type of transaction based on the second score.
The examination device according to claim 1.
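For a target user in the first user segment, the two preceding claims amount to using each model's score only for its own transaction type. A minimal Python sketch of that routing follows; the function name and the string labels `"first"` and `"second"` for the transaction types are hypothetical and serve only as illustration.

```python
def select_score_for_first_segment(transaction_type, first_score, second_score):
    """Pick the score used for a user in the first user segment.

    The labels "first" and "second" are illustrative stand-ins for the
    first and second types of transaction.
    """
    if transaction_type == "first":
        return first_score   # first type of transaction -> first score
    if transaction_type == "second":
        return second_score  # second type of transaction -> second score
    raise ValueError(f"unknown transaction type: {transaction_type!r}")
```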

The first type of transaction is a type of transaction other than a deferred payment transaction;
The first score acquisition means acquires the first score of the user based on an output obtained by inputting the input data related to the user into the first machine learning model, which is a machine learning model for screening the first type of transaction;
The second score acquisition means acquires the second score of the user based on an output obtained by inputting the input data related to the user into the second machine learning model, which is a machine learning model for screening deferred payment transactions and which is generated and/or updated using teacher data including user attribute data, which includes the status of deferred payment transactions, and a value indicating a risk related to the user;
The output value of the second machine learning model is a judgment score that varies with the risk that the target user will fail to properly settle the deferred payment when purchasing goods and/or services on deferred payment.
The examination device according to claim 8.

The computer
a first score acquisition step of acquiring a first score of the user based on an output indicating the user's creditworthiness obtained by inputting input data including the user's attribute data into a first machine learning model generated by learning the relationship between the user's attributes in a first type of transaction and the user's creditworthiness;
a second score acquisition step of acquiring a second score for the user based on an output indicating the risk of the user, the output being obtained by inputting input data including user attribute data into a second machine learning model generated by learning the relationship between the user attributes and the risk of the user in a second type of transaction different from the first type of transaction;
a user segment identification step of identifying a user segment to which the target user belongs by performing segmentation of a user group including the target user;
a correlation determination step of determining whether the user segment is a first user segment having a correlation between the first score and the second score that is equal to or greater than a predetermined standard, or a second user segment having a correlation between the first score and the second score that is less than a predetermined standard, based on the strength of the correlation between the first score and the second score in the user segment;
an examination result determination step of determining an examination result of the target user based on the score indicating a higher creditworthiness or a lower risk among the first score and the second score when the user segment to which the target user belongs is determined to be the second user segment;
A method in which the computer executes the above steps.
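The steps of the method recited above can be sketched end to end in Python under several stated assumptions. The two scores are taken as already computed by the two models and scaled to [0, 1] (`credit` is the first score, where higher is better; `risk` is the second score, where higher is worse). Segmentation is reduced to a precomputed `segment` label. The correlation is measured with a Pearson coefficient between `credit` and `1 - risk`. The names `pearson` and `examine` and the thresholds `standard` and `approve_at` are hypothetical. The claim prescribes none of these choices.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def examine(target, users, standard=0.7, approve_at=0.5):
    """Sketch of the segment-dependent examination for one target user.

    `users` must include `target`; each user is a dict with keys
    "segment", "credit" and "risk" (all illustrative assumptions).
    """
    # User segment identification (assumed: a precomputed label).
    peers = [u for u in users if u["segment"] == target["segment"]]

    # Correlation determination. Risk is flipped to 1 - risk so that
    # two models that agree yield a positive correlation.
    credit = [u["credit"] for u in peers]
    safety = [1.0 - u["risk"] for u in peers]
    r = pearson(credit, safety)

    if r >= standard:
        # First user segment: this method claim does not say how the
        # result is determined, so no decision is made here.
        return {"segment_kind": "first", "correlation": r,
                "basis_score": None, "approved": None}

    # Second user segment: use whichever score is more favourable,
    # i.e. the higher creditworthiness or the lower risk.
    basis = max(target["credit"], 1.0 - target["risk"])
    return {"segment_kind": "second", "correlation": r,
            "basis_score": basis, "approved": basis >= approve_at}
```

Comparing `credit` with `1 - risk` is only one way to put the two scores on a common scale. A real system would need a calibration step that the claim leaves open.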

Computer,
a first score acquisition means for acquiring a first score of the user based on an output indicating the user's creditworthiness, the output being obtained by inputting input data including the user's attribute data into a first machine learning model generated by learning the relationship between the user's attributes in a first type of transaction and the user's creditworthiness;
a second score acquisition means for acquiring a second score for the user based on an output indicating the risk of the user, the output being obtained by inputting input data including the user's attribute data into a second machine learning model generated by learning the relationship between the user's attributes and the risk of the user in a second type of transaction different from the first type of transaction;
a user segment identification means for identifying a user segment to which a target user belongs by performing segmentation of a user group including the target user;
a correlation determination means for determining whether the user segment is a first user segment having a correlation between the first score and the second score that is equal to or greater than a predetermined standard, or a second user segment having a correlation between the first score and the second score that is less than a predetermined standard, based on the strength of the correlation between the first score and the second score in the user segment;
an examination result determination means for determining an examination result of the target user based on the score indicating a higher creditworthiness or a lower risk among the first score and the second score when the user segment to which the target user belongs is determined to be the second user segment;
A program that causes the computer to function as each of the above means.