JP7757023B2

JP7757023B2 - Classification support system, classification support device, classification support method, and program

Info

Publication number: JP7757023B2
Application number: JP2019003266A
Authority: JP
Inventors: 武志長岡; 貴之北川; 和宏小阪
Original assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Current assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Priority date: 2019-01-11
Filing date: 2019-01-11
Publication date: 2025-10-21
Anticipated expiration: 2039-01-11
Also published as: JP2020113035A

Description

Embodiments of the present invention relate to a classification support system, a classification support device, a classification support method, and a program.

In the upstream process of system development, a party contracted to develop a system (hereinafter simply the "contractor") proposes a system that provides the functions required by the party commissioning the development (hereinafter simply the "client"), based on an RFP (Request For Proposal, also called procurement specifications) presented by the client. The RFP describes the system's functional requirements, non-functional requirements, constraints, and the like. Functional requirements are requirements for functions that must be included in the system. Non-functional requirements are requirements other than functional requirements, for example requirements for system performance, usability, reliability, operability, availability, safety, migratability, and scalability. Constraints are matters that restrict system development and operation, for example contract terms, contractor qualifications, development organization, project management method, development environment, and operating environment.

To propose a system that meets the client's requirements, the contractor must comprehensively extract the information on functional requirements, non-functional requirements, and constraints described in the RFP. Omissions in this extraction can, for example, lead to rework in later processes and cause problems such as delivery delays and cost overruns.

An RFP is generally a semi-structured document written in natural language and is prepared according to predetermined guidelines. However, these guidelines generally specify the table-of-contents structure and quality characteristics; they do not map classifications such as functional requirements, non-functional requirements, and constraints to specific terms. It is therefore difficult to classify functional requirements, non-functional requirements, and constraints on the basis of the guidelines. In addition, depending on the scale of the system, an RFP can be a very large document, for example several hundred pages or more. Comprehensively extracting functional requirements, non-functional requirements, and constraints from such an RFP, and classifying the extracted information accurately and efficiently, has conventionally required the know-how of experienced engineers. As a result, work efficiency and work quality could differ greatly depending on whether the person extracting and classifying functional requirements, non-functional requirements, and constraints from the RFP was an experienced engineer or an inexperienced one.

Patent literature: JP 2006-127397 A; JP 2008-250760 A; JP 2012-243194 A; JP 2014-203228 A; JP 2017-224126 A

The problem to be solved by the present invention is to provide a classification support system, a classification support device, a learning device, a classification support method, and a program that can make the work of classifying information contained in semi-structured documents more efficient without compromising the quality of the work.

A classification support system according to an embodiment includes a classification learning unit, a classification unit, an analysis unit, and a classification result output unit. The classification learning unit acquires training data in which training information is associated with classification information. The training information is based on character strings included in a semi-structured training document that contains at least text classifiable into functional requirements, which are requirements for functions that must be included in a system, and non-functional requirements, which are requirements other than the functional requirements. The classification information identifies the classification of the content of those character strings. The classification learning unit performs machine learning using, as inputs, training input information based on the training information and the classification information, and stores the resulting trained model in a storage unit. The classification unit acquires information for classification based on character strings included in a semi-structured document that contains at least text classifiable into the functional requirements and the non-functional requirements. By inputting classification input information based on the information for classification into the trained model, the classification unit acquires classification result-related information, which relates to the result of classifying the content of the character strings into the functional requirements and the non-functional requirements and indicates a probability value for each classification. The analysis unit generates classification result information indicating the classification result based on the classification result-related information and a predetermined threshold. The classification result output unit outputs the classification result information. The analysis unit also updates the value of the threshold based on a result of analyzing the classification result-related information.

FIG. 1 is a diagram showing an example of a plurality of predefined categories and the requirement specification text classified into each category.
FIG. 2 is an overall configuration diagram of the classification support system 1.
FIG. 3 is a block diagram showing the functional configuration of the training data creation device 10.
FIG. 4 is a diagram showing the table configuration of the requirement specification classification table T1.
FIG. 5 is a diagram showing the functional configuration of the learning device 20.
FIG. 6 is a diagram showing the functional configuration of the classification support device 30.
FIG. 7 is a flowchart showing the operation of the training data creation device 10.
FIG. 8 is a flowchart showing the operation of the learning device 20 in generating the word vector model M1.
FIG. 9 is a flowchart showing the operation of the learning device 20 in generating the requirement specification classification model M2.
FIG. 10 is a flowchart showing the operation of the classification support device 30.

The classification support system, classification support device, learning device, classification support method, and program of the embodiments are described below with reference to the drawings.

The classification support system 1 of the embodiment described below is a system that supports classifying character strings extracted from a semi-structured document into one of a plurality of predefined categories. Examples of semi-structured documents include requirement specification documents in system development, application documents submitted to government offices and the like, and survey results that include free-text fields.

In the following description, the classification support system 1 is assumed to be a system that supports classifying character strings extracted from a requirement specification document in system development (hereinafter, "requirement specification text") into one of the predefined categories, namely functional specifications, non-functional specifications, or constraints.

Specific examples of the predefined categories mentioned above are described below.
FIG. 1 is a diagram showing an example of a plurality of predefined categories and the requirement specification text classified into each category. As shown in FIG. 1, requirement specification text extracted from a requirement specification document is classified into one of the major categories: functional requirements, non-functional requirements, or constraints. Requirement specification text classified as non-functional requirements or constraints is further classified into one of a plurality of minor categories. For example, the major category "non-functional requirements" is further divided into minor categories such as "performance," "reliability," and "operability." Likewise, the major category "constraints" is further divided into minor categories such as "contractor qualifications" and "project management method."

Functional requirements are requirements for functions that must be included in the system. As shown in FIG. 1, requirement specification text classified as "functional requirements" is a character string expressing a requirement such as "The system shall link with the XX system and be able to obtain XX information." Non-functional requirements are requirements other than functional requirements. As shown in FIG. 1, requirement specification text classified as "performance" under "non-functional requirements" is a character string expressing a requirement such as "The system shall be able to process XX items of online data per hour." Constraints are matters that restrict system development and operation. As shown in FIG. 1, requirement specification text classified as "contractor qualifications" under "constraints" is a character string expressing a matter such as "The contractor shall guarantee that personnel with the necessary skills and experience are secured for the entire period of this work so that the work is reliably carried out."

The overall configuration of the classification support system 1 of the embodiment is described below.
FIG. 2 is an overall configuration diagram of the classification support system 1. As shown in FIG. 2, the classification support system 1 includes a training data creation device 10, a learning device 20, and a classification support device 30. Each of the training data creation device 10, the learning device 20, and the classification support device 30 includes an information processing device such as a personal computer.

The training data creation device 10 is a device for creating the training data (teacher data) used in the machine learning (hereinafter also simply "learning") performed by the learning device 20. Training requirement specification text is input to the training data creation device 10. The input text is classified, for example, by an experienced engineer operating the training data creation device 10, and is associated with the classification result. The training data creation device 10 outputs data in which the training requirement specification text is associated with the classification result (hereinafter, "training data") to the learning device 20.

The learning device 20 is a device that generates trained models for classifying requirement specification text. The learning device 20 applies preprocessing, such as natural language processing, to the training requirement specification text included in the training data output from the training data creation device 10, and converts the text into word vectors (word vectorization). The conversion into word vectors is performed using a word vector model generated by machine learning.

The word vector model is generated in advance by machine learning that uses, as teacher data, a corpus obtained from the Internet or the like, or the training requirement specification text. A corpus is a language resource made up of large quantities of collected information, such as character strings from newspapers, magazines, and books, or transcribed spoken language, processed so that a computer can search and analyze it. In this embodiment, the word vector model is assumed to be generated by machine learning that uses a corpus as teacher data.

The learning device 20 performs machine learning using, as teacher data, data in which the training requirement specification text converted into word vectors is associated with the classification result, and thereby generates a requirement specification classification model, which is a trained model for classifying requirement specification text. The learning device 20 outputs the generated trained models (that is, the word vector model and the requirement specification classification model) to the classification support device 30.

The classification support device 30 acquires the requirement specification text to be classified. It applies preprocessing, such as natural language processing, to the acquired text and converts the text into word vectors using the word vector model. By inputting the vectorized requirement specification text into the requirement specification classification model, the classification support device 30 obtains information related to the classification result of the text. The classification support device 30 then analyzes that information and outputs information indicating the classification result.
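The flow just described (preprocess, convert to word vectors, classify, then inspect the result) can be sketched as below. This is only an illustration under stated assumptions: `WORD_VECTORS`, `CENTROIDS`, and every function name are hypothetical stand-ins, the averaged-vector and softmax steps are simple substitutes chosen for brevity, and nothing here is taken from the embodiment's actual models.

```python
import math
from typing import Dict, List

# Hypothetical toy word vectors standing in for the trained word vector model.
WORD_VECTORS: Dict[str, List[float]] = {
    "link": [1.0, 0.0],
    "obtain": [0.9, 0.1],
    "process": [0.1, 0.9],
    "hour": [0.0, 1.0],
}

# Hypothetical per-category vectors standing in for the trained classification model.
CENTROIDS: Dict[str, List[float]] = {
    "functional": [1.0, 0.0],
    "non-functional/performance": [0.0, 1.0],
}


def preprocess(text: str) -> List[str]:
    """Stand-in preprocessing: lowercase and split on whitespace."""
    return text.lower().split()


def vectorize(tokens: List[str]) -> List[float]:
    """Average the vectors of known tokens (zero vector if none are known)."""
    known = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not known:
        return [0.0, 0.0]
    return [sum(v[i] for v in known) / len(known) for i in range(2)]


def classify(vector: List[float]) -> Dict[str, float]:
    """Return one probability per category (softmax over dot-product scores)."""
    scores = {
        category: sum(a * b for a, b in zip(vector, centroid))
        for category, centroid in CENTROIDS.items()
    }
    total = sum(math.exp(s) for s in scores.values())
    return {category: math.exp(s) / total for category, s in scores.items()}


def support_classification(text: str) -> Dict[str, float]:
    """Chain the steps: preprocessing, vectorization, classification."""
    return classify(vectorize(preprocess(text)))


probabilities = support_classification("link with the system and obtain information")
best = max(probabilities, key=probabilities.get)
```

The per-category probabilities returned by `support_classification` correspond to the "information related to the classification result" in the text; how that information is analyzed is a separate step.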

In this embodiment, the training data creation device 10, the learning device 20, and the classification support device 30 are described as separate devices, but the configuration is not limited to this. For example, any two of the three devices, or all three, may be configured as a single device. The trained models may also be stored in an external device.

The functional configuration of the training data creation device 10 is described in more detail below.
FIG. 3 is a block diagram showing the functional configuration of the training data creation device 10. As shown in FIG. 3, the training data creation device 10 includes a training requirement specification text acquisition unit 101, a requirement specification classification table storage unit 102, a training data generation unit 103, an operation input unit 104, a training data storage unit 105, and a training data output unit 106.

The training requirement specification text acquisition unit 101 acquires, from an external device, a storage medium, or the like, training requirement specification text (character strings) extracted from a training requirement specification document (a semi-structured document). It outputs the acquired text to the training data generation unit 103. The training requirement specification text acquisition unit 101 may itself be configured to extract the training requirement specification text from the training requirement specification document.

The requirement specification classification table storage unit 102 (classification storage unit) stores a requirement specification classification table T1 generated in advance. The requirement specification classification table storage unit 102 is configured, for example, from a storage medium such as RAM (Random Access Memory, a readable and writable memory), flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), or an HDD (Hard Disk Drive), or from any combination of these storage media.

An example of the table configuration of the requirement specification classification table T1 is described here.
FIG. 4 is a diagram showing the table configuration of the requirement specification classification table T1. The requirement specification classification table T1 lists the classification information that identifies the classification of the content of requirement specification text (character strings) included in a requirement specification document (a semi-structured document). As shown in FIG. 4, the requirement specification classification table T1 is tabular data in which at least four items are associated with one another: "major category," "minor category," "classification type," and "threshold."

"Major category" is an item indicating the major category of the training requirement specification text. As shown in FIG. 4, its values are "functional requirements," "non-functional requirements," and "constraints." "Minor category" is an item indicating the minor category of the training requirement specification text. As shown in FIG. 4, when the "major category" is "functional requirements," no value is stored in "minor category." When the "major category" is "non-functional requirements," the "minor category" values include, for example, "availability" and "performance." When the "major category" is "constraints," the "minor category" values include, for example, "development environment" and "operating conditions."

"Classification type" is an item indicating whether the classification identified by "major category" and "minor category" is one of the predetermined classifications (that is, for example, a classification registered in the requirement specification classification table T1 when the classification support system 1 is in its initial state) or a classification added independently, for example by a system administrator of the classification support system 1, after the system went into operation. As shown in FIG. 4, the values of "classification type" are "basic" and "custom." The value "basic" indicates that the classification identified by "major category" and "minor category" is a predetermined classification. The value "custom" indicates that the classification was added independently by a system administrator or the like.
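As an illustration of the four associated items, the rows of a table like T1 could be held in a small record type. The sketch below is an assumption-laden example: the class and field names are hypothetical, the threshold values are placeholders, and which row is marked "custom" is chosen arbitrarily for the example rather than taken from FIG. 4.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ClassificationRow:
    major: str            # "major category"
    minor: Optional[str]  # "minor category"; None when no value is stored
    kind: str             # "classification type": "basic" or "custom"
    threshold: float      # placeholder value, not from the source


# Illustrative rows only; the category names follow the text.
TABLE_T1: List[ClassificationRow] = [
    ClassificationRow("functional requirements", None, "basic", 0.5),
    ClassificationRow("non-functional requirements", "availability", "basic", 0.5),
    ClassificationRow("non-functional requirements", "performance", "basic", 0.5),
    ClassificationRow("constraints", "development environment", "basic", 0.5),
    ClassificationRow("constraints", "operating conditions", "custom", 0.5),
]


def custom_rows(table: List[ClassificationRow]) -> List[ClassificationRow]:
    """Return the classifications added after the initial state."""
    return [row for row in table if row.kind == "custom"]


def minor_categories(table: List[ClassificationRow], major: str) -> List[str]:
    """List the minor categories registered under one major category."""
    return [row.minor for row in table if row.major == major and row.minor is not None]
```

Keeping `minor` optional mirrors the statement that no minor category is stored for functional requirements.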

"Threshold" is a value used, when the classification support device 30 classifies requirement specification text, to determine the classification result from the probabilities output by the requirement specification classification model. The threshold is described in detail later.
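One plausible way to combine per-category probabilities with per-category thresholds is sketched below. The decision rule (accept every category whose probability is at least its threshold) is an assumption made for illustration, since the exact rule is described later in the source; the function name and all numeric values are hypothetical.

```python
from typing import Dict, List


def decide(probabilities: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the categories whose probability meets their threshold.

    Categories without a registered threshold are never accepted.
    Results are ordered from highest to lowest probability.
    """
    accepted = [
        category
        for category, p in probabilities.items()
        if p >= thresholds.get(category, float("inf"))
    ]
    return sorted(accepted, key=lambda category: probabilities[category], reverse=True)


# Hypothetical model output and thresholds.
probs = {
    "functional requirements": 0.72,
    "non-functional requirements/performance": 0.21,
    "constraints/contractor qualifications": 0.07,
}
thresholds = {
    "functional requirements": 0.6,
    "non-functional requirements/performance": 0.2,
    "constraints/contractor qualifications": 0.5,
}
result = decide(probs, thresholds)
```

Under this rule an empty result would mean no category was confident enough, which a caller could treat as a cue for manual review.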

The requirement specification classification table T1 may have a configuration in which the classification information is organized into more levels (for example, by including a "middle category"), or a non-hierarchical configuration that has, for example, only the "major category."

The description now returns to FIG. 3.
The training data generation unit 103 acquires the training requirement specification text output from the training requirement specification text acquisition unit 101. It also reads the requirement specification classification table T1 stored in the requirement specification classification table storage unit 102. The training data generation unit 103 then generates training data D1 by associating the acquired training requirement specification text with the specific classification information that was selected, through an operation input on the operation input unit 104, from the classification information included in the table T1. Here, classification information is information indicating a combination of a "major category" value and a "minor category" value.

Specifically, the training data generation unit 103 displays, for example, the acquired training requirement specification text and information indicating the requirement specification classification table T1 it has read on a display unit (not shown), such as a display, provided in the training data creation device 10. An experienced engineer (user) with know-how in classifying requirement specification text then uses the operation input unit 104 to select, from the classification information included in the displayed table T1, the specific classification information that corresponds to the input training requirement specification text, and to associate the two. This produces training data D1 in which the training requirement specification text is associated with the specific classification information. The training data generation unit 103 stores the generated training data D1 in the training data storage unit 105.
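The association step can be pictured as pairing a text with one (major category, minor category) combination chosen from the table. The sketch below is illustrative only: `TrainingExample`, `make_training_example`, and `ALLOWED` are hypothetical names, and the check that the selection exists in the table is an added assumption rather than something the source states.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

# Classification information: a (major category, minor category) combination.
ClassificationInfo = Tuple[str, Optional[str]]


@dataclass(frozen=True)
class TrainingExample:
    text: str
    classification: ClassificationInfo


# Hypothetical subset of the combinations registered in the classification table.
ALLOWED: Set[ClassificationInfo] = {
    ("functional requirements", None),
    ("non-functional requirements", "performance"),
    ("constraints", "contractor qualifications"),
}


def make_training_example(
    text: str, selected: ClassificationInfo, allowed: Set[ClassificationInfo]
) -> TrainingExample:
    """Pair a requirement text with the classification chosen by the user."""
    if selected not in allowed:
        raise ValueError(f"classification not in table: {selected!r}")
    return TrainingExample(text=text, classification=selected)


example = make_training_example(
    "The system shall be able to process XX items of online data per hour.",
    ("non-functional requirements", "performance"),
    ALLOWED,
)
```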

The operation input unit 104 accepts operation input from a user (for example, an experienced engineer). The operation input unit 104 includes input devices such as a keyboard, a mouse, and a touch panel.

The training data storage unit 105 stores the training data D1 generated by the training data generation unit 103. It stores, for example, several thousand to several hundred thousand items of training data D1. The training data storage unit 105 is configured, for example, from a storage medium such as RAM, flash memory, EEPROM, or an HDD, or from any combination of these storage media.

The training data output unit 106 acquires the training data D1 stored in the training data storage unit 105 and outputs it to the learning device 20. The training data output unit 106 outputs the training data D1 to the learning device 20, for example, when generation of all the training data D1 is complete, or when the number of generated items of training data D1 reaches a predetermined threshold.

The functional configuration of the learning device 20 is described in more detail below.
FIG. 5 is a diagram showing the functional configuration of the learning device 20. As shown in FIG. 5, the learning device 20 includes a corpus acquisition unit 201, a preprocessing rule storage unit 202, a text preprocessing unit 203, a word vector learning unit 204, a trained model storage unit 205, a training data acquisition unit 206, a word vector conversion unit 207, a requirement specification classification learning unit 208, and a trained model output unit 209.

The corpus acquisition unit 201 acquires a corpus obtained from, for example, the Internet, and outputs the acquired corpus to the text preprocessing unit 203. Instead of acquiring individual corpora, the corpus acquisition unit 201 may acquire a corpus group made up of a plurality of corpora and split that group into individual corpora.

The preprocessing rule storage unit 202 stores preprocessing rules R1 in advance. The preprocessing rules R1 are information indicating the rules used when preprocessing (for example, natural language processing) is applied to the character strings included in the corpus and to the training requirement specification text included in the training data. For example, the preprocessing rules R1 include rules for word segmentation, rules for normalization, and rules for deleting unnecessary words. The preprocessing rule storage unit 202 is configured, for example, from a storage medium such as RAM, flash memory, EEPROM, or an HDD, or from any combination of these storage media.
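The three kinds of rule named above (word segmentation, normalization, deletion of unnecessary words) can be sketched with the standard library as follows. This is an assumption-based illustration: the stop-word list and function names are hypothetical, and the regular-expression segmentation only works for space-delimited text. Japanese text, which has no spaces between words, would need a morphological analyzer instead.

```python
import re
import unicodedata
from typing import List

# Hypothetical list of "unnecessary words" to delete.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "be"}


def normalize(text: str) -> str:
    """Normalization rule: NFKC folds full-width letters and digits to
    their half-width forms; the result is then lowercased."""
    return unicodedata.normalize("NFKC", text).lower()


def segment(text: str) -> List[str]:
    """Segmentation rule (simplified): extract runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text)


def preprocess(text: str) -> List[str]:
    """Apply normalization, segmentation, then stop-word deletion."""
    return [token for token in segment(normalize(text)) if token not in STOP_WORDS]


tokens = preprocess("The ＲＦＰ must process １００ items of data")
```

Normalizing before segmenting means that full-width and half-width spellings of the same term end up as the same token.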

The text preprocessing unit 203 acquires the corpus output from the corpus acquisition unit 201. It also reads the preprocessing rules R1 stored in the preprocessing rule storage unit 202. Based on the preprocessing rules R1, the text preprocessing unit 203 applies preprocessing (for example, natural language processing) to the character strings included in the acquired corpus, and outputs the preprocessed character strings to the word vector learning unit 204.

単語ベクトル学習部２０４は、テキスト前処理部２０３から出力された、前処理がなされたコーパスに含まれる文字列を取得する。単語ベクトル学習部２０４は、取得した文字列を入力として機械学習を行うことにより、学習済みの単語ベクトルの学習モデルである単語ベクトルモデルＭ１を生成する。単語ベクトル学習部２０４は、生成した単語ベクトルモデルＭ１を、学習済みモデル記憶部２０５に記憶させる。 The word vector learning unit 204 acquires character strings contained in the preprocessed corpus output from the text preprocessing unit 203. The word vector learning unit 204 performs machine learning using the acquired character strings as input, thereby generating a word vector model M1, which is a learning model of trained word vectors. The word vector learning unit 204 stores the generated word vector model M1 in the trained model storage unit 205.

なお、単語ベクトルとは、文章中に含まれる単語が、例えば共に用いられやすい他の単語についての傾向を機械学習を用いて学習し、その傾向（特徴）を数値化（ベクトル化）したものである。なお、単語ベクトルへの変換の手法として、例えばｗｏｒｄ２ｖｅｃ等を用いることができる。なお、単語ベクトルモデルＭ１とは、例えば、文字列の入力に応じて単語ベクトルを出力するように機械学習がなされた学習モデルである。 A word vector is a numerical (vector) representation of a word's usage tendencies (features), for example which other words tend to be used together with it in text, learned by machine learning. A technique such as word2vec can be used, for example, to convert words into word vectors. The word vector model M1 is, for example, a learning model that has undergone machine learning so as to output word vectors in response to an input character string.
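
The idea of turning "which words tend to be used together" into numbers can be illustrated with plain co-occurrence counts. Note that this is a much simpler technique than word2vec, which the description names as one possible method; it is shown here only to make the notion of a word vector concrete. The function name `cooccurrence_vectors` and the `window` parameter are assumptions for illustration.

```python
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences, window=1):
    """Build one count vector per word from its neighbours within `window` positions."""
    vocab = sorted({w for s in sentences for w in s})
    counts = defaultdict(Counter)
    for s in sentences:
        for i, w in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[w][s[j]] += 1
    # Each vector has one component per vocabulary word, in sorted order.
    return vocab, {w: [counts[w][v] for v in vocab] for w in vocab}

vocab, vectors = cooccurrence_vectors([
    ["system", "response", "time"],
    ["system", "response", "speed"],
])
print(vocab)
print(vectors["response"])
```

Words that appear in similar contexts end up with similar vectors, which is the property the downstream classifier relies on.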

学習済みモデル記憶部２０５は、単語ベクトル学習部２０４によって生成された単語ベクトルモデルＭ１と、要求仕様分類学習部２０８によって生成された、後述する要求仕様分類モデルＭ２と、を記憶する。学習済みモデル記憶部２０５は、例えば、ＲＡＭ、フラッシュメモリ、ＥＥＰＲＯＭ、及びＨＤＤ等の記憶媒体、又はこれらの記憶媒体の任意の組み合わせによって構成される。 The trained model storage unit 205 stores the word vector model M1 generated by the word vector learning unit 204 and the requirement specification classification model M2 (described below) generated by the requirement specification classification learning unit 208. The trained model storage unit 205 is configured, for example, by a storage medium such as RAM, flash memory, EEPROM, or HDD, or any combination of these storage media.

学習用データ取得部２０６は、学習用データ作成装置１０から出力された学習用データＤ１を取得する。学習用データ取得部２０６は、取得した学習用データＤ１に含まれる学習用の要求仕様テキストをテキスト前処理部２０３へ出力する。また、学習用データ取得部２０６は、取得した学習用データＤ１に含まれる分類情報を要求仕様分類学習部２０８へ出力する。 The learning data acquisition unit 206 acquires the learning data D1 output from the learning data creation device 10. The learning data acquisition unit 206 outputs the learning requirement specification text included in the acquired learning data D1 to the text preprocessing unit 203. The learning data acquisition unit 206 also outputs the classification information included in the acquired learning data D1 to the requirement specification classification learning unit 208.

テキスト前処理部２０３は、学習用データ取得部２０６から出力された学習用の要求仕様テキストを取得する。また、テキスト前処理部２０３は、前処理ルール記憶部２０２に記憶された前処理ルールＲ１を読み出す。そして、テキスト前処理部２０３は、読み出した前処理ルールＲ１に基づいて、取得した学習用の要求仕様テキストに対して前処理（例えば自然言語処理）を行う。テキスト前処理部２０３は、前処理がなされた学習用の要求仕様テキストを単語ベクトル変換部２０７へ出力する。 The text pre-processing unit 203 acquires the training requirement specification text output from the training data acquisition unit 206. The text pre-processing unit 203 also reads out the pre-processing rule R1 stored in the pre-processing rule storage unit 202. The text pre-processing unit 203 then performs pre-processing (e.g., natural language processing) on the acquired training requirement specification text based on the read out pre-processing rule R1. The text pre-processing unit 203 outputs the pre-processed training requirement specification text to the word vector conversion unit 207.

単語ベクトル変換部２０７は、テキスト前処理部２０３から出力された、前処理がなされた学習用の要求仕様テキストを取得する。また、単語ベクトル変換部２０７は、学習済みモデル記憶部２０５に記憶された単語ベクトルモデルＭ１を読み出す。単語ベクトル変換部２０７は、取得した学習用の要求仕様テキストを、読み出した単語ベクトルモデルＭ１に入力することによって、学習用の要求仕様テキストを単語ベクトルに変換する。単語ベクトル変換部２０７は、単語ベクトルに変換された学習用の要求仕様テキストを要求仕様分類学習部２０８へ出力する。 The word vector conversion unit 207 acquires the preprocessed training requirement specification text output from the text preprocessing unit 203. The word vector conversion unit 207 also reads out the word vector model M1 stored in the trained model storage unit 205. The word vector conversion unit 207 converts the training requirement specification text into word vectors by inputting the acquired training requirement specification text into the read-out word vector model M1. The word vector conversion unit 207 outputs the training requirement specification text converted into word vectors to the requirement specification classification learning unit 208.

要求仕様分類学習部２０８は、学習用データ取得部２０６から出力された、学習用データＤ１に含まれる分類情報を取得する。また、要求仕様分類学習部２０８は、単語ベクトル変換部２０７から出力された、単語ベクトルに変換された学習用の要求仕様テキストを取得する。そして、要求仕様分類学習部２０８は、取得した単語ベクトルに変換された学習用の要求仕様テキストと、取得した分類情報と、を対応付けたデータを教師データとして機械学習を行うことにより、学習済みの学習モデルである要求仕様分類モデルＭ２を得る。 The requirements classification learning unit 208 acquires the classification information contained in the learning data D1 output from the learning data acquisition unit 206. The requirements classification learning unit 208 also acquires the learning requirements text converted into word vectors output from the word vector conversion unit 207. The requirements classification learning unit 208 then performs machine learning using as training data the data that associates the learning requirements text converted into the acquired word vectors with the acquired classification information, thereby obtaining a requirements classification model M2, which is a trained learning model.

要求仕様分類モデルＭ２は、例えば、単語ベクトルに変換された要求仕様テキストの入力に応じて分類結果に関する情報を出力するように機械学習がなされた、ニューラルネットワークの学習モデルである。要求仕様分類学習部２０８は、生成された要求仕様分類モデルＭ２を、学習済みモデル記憶部２０５に記憶させる。 The requirements specification classification model M2 is, for example, a neural network learning model that has undergone machine learning to output information related to classification results in response to input requirements specification text converted into word vectors. The requirements specification classification learning unit 208 stores the generated requirements specification classification model M2 in the trained model storage unit 205.

学習済みモデル出力部２０９は、学習済みモデル記憶部２０５に記憶された、単語ベクトルモデルＭ１と要求仕様分類モデルＭ２とを読み出す。学習済みモデル出力部２０９は、読み出された単語ベクトルモデルＭ１と要求仕様分類モデルＭ２とを、分類支援装置３０へ出力する。 The trained model output unit 209 reads out the word vector model M1 and the requirement specification classification model M2 stored in the trained model storage unit 205. The trained model output unit 209 outputs the read-out word vector model M1 and requirement specification classification model M2 to the classification support device 30.

以下、分類支援装置３０の機能構成について更に詳しく説明する。 The functional configuration of the classification support device 30 will be described in more detail below.
図６は、分類支援装置３０の機能構成を示す図である。図６に示すように、分類支援装置３０は、学習済みモデル取得部３０１と、学習済みモデル記憶部３０２と、要求仕様テキスト取得部３０３と、前処理ルール記憶部３０４と、テキスト前処理部３０５と、単語ベクトル変換部３０６と、要求仕様分類部３０７と、分類結果解析部３０８と、分類結果出力部３０９と、を含んで構成される。 FIG. 6 is a diagram showing the functional configuration of the classification support device 30. As shown in FIG. 6, the classification support device 30 includes a trained model acquisition unit 301, a trained model storage unit 302, a requirement specification text acquisition unit 303, a preprocessing rule storage unit 304, a text preprocessing unit 305, a word vector conversion unit 306, a requirement specification classification unit 307, a classification result analysis unit 308, and a classification result output unit 309.

学習済みモデル取得部３０１は、学習装置２０から出力された、単語ベクトルモデルＭ１と要求仕様分類モデルＭ２とを取得する。学習済みモデル取得部３０１は、単語ベクトルモデルＭ１と要求仕様分類モデルＭ２とを、学習済みモデル記憶部３０２に記憶させる。 The trained model acquisition unit 301 acquires the word vector model M1 and the requirement specification classification model M2 output from the learning device 20. The trained model acquisition unit 301 stores the word vector model M1 and the requirement specification classification model M2 in the trained model storage unit 302.

学習済みモデル記憶部３０２は、学習済みモデル取得部３０１から出力された単語ベクトルモデルＭ１と要求仕様分類モデルＭ２とを記憶する。学習済みモデル記憶部３０２は、例えば、ＲＡＭ、フラッシュメモリ、ＥＥＰＲＯＭ、及びＨＤＤ等の記憶媒体、又はこれらの記憶媒体の任意の組み合わせによって構成される。 The trained model storage unit 302 stores the word vector model M1 and requirement specification classification model M2 output from the trained model acquisition unit 301. The trained model storage unit 302 is configured, for example, by a storage medium such as RAM, flash memory, EEPROM, or HDD, or any combination of these storage media.

要求仕様テキスト取得部３０３は、外部の装置あるいは記憶媒体等から、分類対象の要求仕様文書（半構造化された文書）から切り出された要求仕様テキスト（文字列）を取得する。要求仕様テキスト取得部３０３は、取得した要求仕様テキストをテキスト前処理部３０５へ出力する。なお、要求仕様テキスト取得部３０３が、分類対象の要求仕様文書から要求仕様テキストを切り出す処理を行う構成であってもよい。 The requirements specification text acquisition unit 303 acquires requirements specification text (character strings) extracted from the requirements specification document (semi-structured document) to be classified from an external device or storage medium. The requirements specification text acquisition unit 303 outputs the acquired requirements specification text to the text pre-processing unit 305. Note that the requirements specification text acquisition unit 303 may also be configured to perform processing to extract requirements specification text from the requirements specification document to be classified.

前処理ルール記憶部３０４は、前処理ルールＲ１を予め記憶する。なお、前処理ルール記憶部３０４は、学習装置２０の前処理ルール記憶部２０２から前処理ルールＲ１を取得するようにしてもよい。前処理ルール記憶部３０４は、例えば、ＲＡＭ、フラッシュメモリ、ＥＥＰＲＯＭ、及びＨＤＤ等の記憶媒体、又はこれらの記憶媒体の任意の組み合わせによって構成される。 The preprocessing rule storage unit 304 stores the preprocessing rule R1 in advance. The preprocessing rule storage unit 304 may also acquire the preprocessing rule R1 from the preprocessing rule storage unit 202 of the learning device 20. The preprocessing rule storage unit 304 is configured, for example, by a storage medium such as RAM, flash memory, EEPROM, or HDD, or any combination of these storage media.

テキスト前処理部３０５は、要求仕様テキスト取得部３０３から出力された要求仕様テキストを取得する。また、テキスト前処理部３０５は、前処理ルール記憶部３０４に記憶された前処理ルールＲ１を読み出す。そして、テキスト前処理部３０５は、読み出した前処理ルールＲ１に基づいて、取得した要求仕様テキストに対して前処理（例えば自然言語処理）を行う。テキスト前処理部３０５は、前処理がなされた要求仕様テキストを単語ベクトル変換部３０６へ出力する。 The text pre-processing unit 305 acquires the requirement specification text output from the requirement specification text acquisition unit 303. The text pre-processing unit 305 also reads the pre-processing rule R1 stored in the pre-processing rule storage unit 304. The text pre-processing unit 305 then performs pre-processing (e.g., natural language processing) on the acquired requirement specification text based on the read pre-processing rule R1. The text pre-processing unit 305 outputs the pre-processed requirement specification text to the word vector conversion unit 306.

単語ベクトル変換部３０６は、テキスト前処理部３０５から出力された、前処理がなされた要求仕様テキストを取得する。また、単語ベクトル変換部３０６は、学習済みモデル記憶部３０２に記憶された単語ベクトルモデルＭ１を読み出す。単語ベクトル変換部３０６は、取得した要求仕様テキストを、読み出した単語ベクトルモデルＭ１に入力することによって、要求仕様テキストを単語ベクトルに変換する。単語ベクトル変換部３０６は、単語ベクトルに変換された要求仕様テキストを、要求仕様分類部３０７へ出力する。 The word vector conversion unit 306 acquires the preprocessed requirement specification text output from the text preprocessing unit 305. The word vector conversion unit 306 also reads out the word vector model M1 stored in the trained model storage unit 302. The word vector conversion unit 306 converts the requirement specification text into a word vector by inputting the acquired requirement specification text into the read-out word vector model M1. The word vector conversion unit 306 outputs the requirement specification text converted into a word vector to the requirement specification classification unit 307.

要求仕様分類部３０７は、単語ベクトル変換部３０６から出力された、単語ベクトルに変換された要求仕様テキストを取得する。また、要求仕様分類部３０７は、学習済みモデル記憶部３０２に記憶された要求仕様分類モデルＭ２を読み出す。そして、要求仕様分類部３０７は、取得した単語ベクトルに変換された要求仕様テキストを、読み出した要求仕様分類モデルＭ２に入力することによって、要求仕様テキストの分類結果に関する情報を得る。要求仕様分類部３０７は、取得した要求仕様テキストの分類結果に関する情報を分類結果解析部３０８へ出力する。 The requirements classification unit 307 acquires the requirements text converted into word vectors output from the word vector conversion unit 306. The requirements classification unit 307 also reads out the requirements classification model M2 stored in the trained model storage unit 302. The requirements classification unit 307 then inputs the acquired requirements text converted into word vectors into the read requirements classification model M2, thereby obtaining information related to the classification results of the requirements text. The requirements classification unit 307 outputs information related to the classification results of the acquired requirements text to the classification result analysis unit 308.

なお、要求仕様分類部３０７から出力される、要求仕様テキストの分類結果に関する情報とは、例えば、図４に示した要求仕様分類テーブルに含まれる全ての分類情報（すなわち、「大分類」と「小分類」との組み合わせ）ごとの確率を示す値である。すなわち、要求仕様テキストの分類結果に関する情報とは、例えば、分類対象の要求仕様テキストが各分類である確率をそれぞれ示す情報である。 Note that the information on the classification results of the requirement specification text output from the requirement specification classification unit 307 is, for example, a value indicating the probability for each of all classification information (i.e., combinations of "major classifications" and "minor classifications") included in the requirement specification classification table shown in FIG. 4. In other words, the information on the classification results of the requirement specification text is, for example, information indicating the probability that the requirement specification text to be classified falls into each classification.
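
One common way for a neural network to produce a probability for every classification is a softmax output layer. The sketch below shows a single linear layer followed by softmax; the patent does not disclose the architecture of model M2, so the layer structure, the `classify` function, and the weight values are all assumptions used purely for illustration.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(vector, weights, biases, labels):
    """Hypothetical one-layer stand-in for model M2: one score per label, then softmax."""
    scores = [sum(w * x for w, x in zip(row, vector)) + b
              for row, b in zip(weights, biases)]
    return dict(zip(labels, softmax(scores)))

# Labels are (major classification, minor classification) pairs; values are made up.
labels = [("non-functional", "performance"), ("non-functional", "availability")]
probs = classify([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0], labels)
print(probs)
```

The returned dictionary has one probability per combination of major and minor classification, which is the shape of output described above.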

分類結果解析部３０８は、要求仕様分類部３０７から出力された、要求仕様テキストの分類結果に関する情報を取得する。分類結果解析部３０８は、分類結果に関する情報を解析することにより、要求仕様テキストの分類結果を示す情報を生成する。分類結果解析部３０８は、生成された要求仕様テキストの分類結果を示す情報を、分類結果出力部３０９へ出力する。 The classification result analysis unit 308 acquires information related to the classification results of the requirement specification text output from the requirement specification classification unit 307. The classification result analysis unit 308 analyzes the information related to the classification results to generate information indicating the classification results of the requirement specification text. The classification result analysis unit 308 outputs the generated information indicating the classification results of the requirement specification text to the classification result output unit 309.

なお、分類結果解析部３０８から出力される、要求仕様テキストの分類結果を示す情報とは、例えば、要求仕様分類部３０７から出力された分類情報ごとの確率を示す値に基づいて選定された、特定の分類を示す情報である。例えば、分類結果を示す情報は、当該確率の値が所定の閾値以上の値である分類を示す情報である。あるいは、例えば、分類結果を示す情報は、当該確率の値のうち大きいほうから上位ｎ件分（ｎは自然数）の値にそれぞれ対応する分類を示す情報である。あるいは、例えば、分類結果を示す情報は、当該確率の値のうち最も大きい値に対応する分類を示す情報である。 The information indicating the classification results of the requirements specification text output from the classification result analysis unit 308 is, for example, information indicating a specific classification selected based on the value indicating the probability for each piece of classification information output from the requirements specification classification unit 307. For example, the information indicating the classification result is information indicating a classification whose probability value is equal to or greater than a predetermined threshold. Alternatively, for example, the information indicating the classification result is information indicating the classification corresponding to the top n largest probability values (n is a natural number). Alternatively, for example, the information indicating the classification result is information indicating the classification corresponding to the largest probability value.
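
The three selection strategies described above (threshold, top n, and single best) can be sketched as small functions over a mapping from classification to probability. The function names and the sample probabilities are hypothetical; the classification labels are illustrative and are not taken from FIG. 4.

```python
def select_by_threshold(probs, threshold):
    """Classifications whose probability is at or above the threshold."""
    return [label for label, p in probs.items() if p >= threshold]

def select_top_n(probs, n):
    """The n classifications with the largest probabilities, highest first."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return [label for label, _ in ranked[:n]]

def select_best(probs):
    """The single classification with the largest probability."""
    return max(probs, key=probs.get)

# Made-up probabilities for (major, minor) classification pairs.
probs = {
    ("non-functional", "performance"): 0.7,
    ("non-functional", "availability"): 0.2,
    ("functional", "search"): 0.1,
}
print(select_by_threshold(probs, 0.5))
print(select_top_n(probs, 2))
print(select_best(probs))
```

The threshold strategy can return zero or several classifications for one text, while the other two always return a fixed number, which is a practical difference when the results are reviewed by a person.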

なお、分類結果解析部３０８は、分類結果に関する情報を解析した結果に基づいて、図４に示した要求仕様分類テーブルＴ１の「閾値」の値を調整（チューニング）するようにしてもよい。例えば、分類結果解析部３０８は、要求仕様テキストの分類の精度をより向上させるように、「閾値」の値を更新するようにしてもよい。なお、「閾値」の値の調整（チューニング）は、要求仕様テキストの分類における経験豊富な技術者（ユーザ）によって人手で行われてもよいし、チューニング用のツール等を用いて自動的に行われてもよい。 The classification result analysis unit 308 may adjust (tune) the "threshold" value of the requirements specification classification table T1 shown in FIG. 4 based on the results of analyzing the information related to the classification results. For example, the classification result analysis unit 308 may update the "threshold" value to further improve the accuracy of classification of requirements specification text. The "threshold" value may be adjusted (tuned) manually by an engineer (user) with extensive experience in classifying requirements specification text, or may be adjusted automatically using a tuning tool, etc.

分類結果出力部３０９は、分類結果解析部３０８から出力された分類結果を示す情報を取得する。分類結果出力部３０９は、取得した分類結果を示す情報を外部の装置へ出力する。なお、分類結果出力部３０９が、分類結果を示す情報を、分類支援装置３０が備える例えばディスプレイ等の表示部（図示せず）に表示させる構成であってもよい。 The classification result output unit 309 acquires information indicating the classification results output from the classification result analysis unit 308. The classification result output unit 309 outputs the acquired information indicating the classification results to an external device. Note that the classification result output unit 309 may be configured to display the information indicating the classification results on a display unit (not shown), such as a display, provided in the classification support device 30.

以下、学習用データ作成装置１０の動作の一例について説明する。 An example of the operation of the learning data creation device 10 will now be described.
図７は、学習用データ作成装置１０の動作を示すフローチャートである。まず、学習用データ生成部１０３は、要求仕様分類テーブル記憶部１０２に記憶された要求仕様分類テーブルＴ１を読み出す（ステップＳ１０１）。 FIG. 7 is a flowchart showing the operation of the learning data creation device 10. First, the learning data generation unit 103 reads out the requirement specification classification table T1 stored in the requirement specification classification table storage unit 102 (step S101).

学習用データ生成部１０３は、学習用要求仕様テキスト取得部１０１から出力された学習用の要求仕様テキストを取得する（ステップＳ１０２）。また、操作入力部１０４は、ユーザによる操作入力を受け付ける（ステップＳ１０３）。そして、学習用データ生成部１０３は、取得した学習用の要求仕様テキストと、読み出した要求仕様分類テーブルＴ１に含まれる分類情報の中から操作入力部１０４による操作入力によって選択された特定の分類情報と、を対応付けることにより学習用データＤ１を生成する（ステップＳ１０４）。学習用データ生成部１０３は、生成した学習用データＤ１を学習用データ記憶部１０５に記憶させる（ステップＳ１０５）。 The training data generation unit 103 acquires the training requirement specification text output from the training requirement specification text acquisition unit 101 (step S102). The operation input unit 104 accepts operation input by the user (step S103). The training data generation unit 103 then generates training data D1 by associating the acquired training requirement specification text with specific classification information selected by operation input from the operation input unit 104 from the classification information included in the read-out requirement specification classification table T1 (step S104). The training data generation unit 103 stores the generated training data D1 in the training data storage unit 105 (step S105).

全ての学習用の要求仕様テキストに対する学習データの生成が完了していない場合（すなわち、分類情報が対応付けられていない学習用の要求仕様テキストが存在する場合）（ステップＳ１０６・ＮＯ）、学習用データ作成装置１０は、引き続き上記の学習データの生成処理を繰り返す（ステップＳ１０２～ステップＳ１０５）。全ての学習用の要求仕様テキストに対する学習データの生成が完了した場合（ステップＳ１０６・ＹＥＳ）、学習用データ出力部１０６は、学習用データ記憶部１０５に記憶された学習用データＤ１を学習装置２０へ出力する（ステップＳ１０７）。以上で、図７のフローチャートが示す学習用データ作成装置１０の動作が終了する。 If the generation of learning data for all learning requirement specification texts has not been completed (i.e., if there is learning requirement specification text to which classification information is not associated) (step S106, NO), the learning data creation device 10 continues to repeat the above learning data generation process (steps S102 to S105). If the generation of learning data for all learning requirement specification texts has been completed (step S106, YES), the learning data output unit 106 outputs the learning data D1 stored in the learning data storage unit 105 to the learning device 20 (step S107). This completes the operation of the learning data creation device 10 shown in the flowchart of Figure 7.
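
The loop of steps S101 to S107 can be sketched as follows. This is a minimal sketch under assumptions: the `choose` callback stands in for the user's operation input, and the record layout (a dictionary with `text` and `classification` keys) is hypothetical, since the patent does not define the data format of learning data D1.

```python
def create_training_data(texts, classification_table, choose):
    """Pair each training text with a classification selected from the table."""
    data = []
    for text in texts:                                # S102; loop ends when S106 is YES
        label = choose(text, classification_table)    # S103: selection by the user
        if label not in classification_table:
            raise ValueError(f"unknown classification: {label!r}")
        data.append({"text": text, "classification": label})  # S104 and S105
    return data                                       # S107: output to the learning device

# Illustrative table of (major, minor) classification pairs.
table = [("functional", "search"), ("non-functional", "performance")]
records = create_training_data(
    ["Respond within 3 seconds."],
    table,
    lambda text, tbl: tbl[1],  # stand-in for a person picking a classification
)
print(records)
```

Rejecting labels that are not in the table keeps the training data consistent with the set of classifications the model is later asked to predict.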

以下、単語ベクトルモデルＭ１の生成における学習装置２０の動作の一例について説明する。 An example of the operation of the learning device 20 in generating the word vector model M1 will now be described.
図８は、単語ベクトルモデルＭ１の生成における学習装置２０の動作を示すフローチャートである。まず、テキスト前処理部２０３は、前処理ルール記憶部２０２に記憶された前処理ルールＲ１を読み出す（ステップＳ２０１）。 FIG. 8 is a flowchart showing the operation of the learning device 20 in generating the word vector model M1. First, the text preprocessing unit 203 reads out the preprocessing rule R1 stored in the preprocessing rule storage unit 202 (step S201).

テキスト前処理部２０３は、コーパス取得部２０１から出力されたコーパスを取得する（ステップＳ２０２）。そして、テキスト前処理部２０３は、読み出した前処理ルールＲ１に基づいて、取得したコーパスに含まれる文字列に対して前処理（例えば自然言語処理）を実行する（ステップＳ２０３）。 The text pre-processing unit 203 acquires the corpus output from the corpus acquisition unit 201 (step S202). Then, the text pre-processing unit 203 performs pre-processing (e.g., natural language processing) on the character strings included in the acquired corpus based on the read pre-processing rule R1 (step S203).

単語ベクトル学習部２０４は、テキスト前処理部２０３から出力された、前処理がなされたコーパスに含まれる文字列を取得し、学習用入力データとして保持する。全てのコーパスに対して学習用入力データの作成処理が完了していない場合（すなわち、学習用入力データに追加されていないコーパスが存在する場合）（ステップＳ２０５・ＮＯ）、学習装置２０は、引き続き上記の学習用入力データの作成処理を繰り返す（ステップＳ２０２～ステップＳ２０４）。 The word vector learning unit 204 acquires character strings contained in the preprocessed corpus output from the text preprocessing unit 203 and stores them as training input data. If the training input data creation process has not been completed for all corpora (i.e., if there are corpora that have not been added to the training input data) (step S205: NO), the learning device 20 continues to repeat the above training input data creation process (steps S202 to S204).

全てのコーパスに対して学習用入力データの作成処理が完了した場合（ステップＳ２０５・ＹＥＳ）、単語ベクトル学習部２０４は、学習用入力データを入力として機械学習を実行することにより、単語ベクトルモデルＭ１を生成し、学習済みモデル記憶部２０５に記憶させる（ステップＳ２０６）。以上で図８のフローチャートが示す、単語ベクトルモデルＭ１の生成における学習装置２０の動作が終了する。 When the process of creating training input data has been completed for all corpora (step S205, YES), the word vector learning unit 204 performs machine learning using the training input data as input to generate the word vector model M1, and stores it in the trained model storage unit 205 (step S206). This completes the operation of the learning device 20 in generating the word vector model M1, as shown in the flowchart of FIG. 8.

以下、要求仕様分類モデルＭ２の生成における学習装置２０の動作の一例について説明する。 An example of the operation of the learning device 20 in generating the requirement specification classification model M2 will be described below.
図９は、要求仕様分類モデルＭ２の生成における学習装置２０の動作を示すフローチャートである。まず、テキスト前処理部２０３は、前処理ルール記憶部２０２に記憶された前処理ルールＲ１を読み出す（ステップＳ２１１）。 FIG. 9 is a flowchart showing the operation of the learning device 20 in generating the requirement specification classification model M2. First, the text preprocessing unit 203 reads out the preprocessing rule R1 stored in the preprocessing rule storage unit 202 (step S211).

学習用データ取得部２０６は、学習用データ作成装置１０から出力された学習用データＤ１を取得する（ステップＳ２１２）。学習用データ取得部２０６は、取得した学習用データＤ１に含まれる学習用の要求仕様テキストを、テキスト前処理部２０３へ出力する。また、学習用データ取得部２０６は、取得した学習用データＤ１に含まれる分類情報を、要求仕様分類学習部２０８へ出力する。 The learning data acquisition unit 206 acquires the learning data D1 output from the learning data creation device 10 (step S212). The learning data acquisition unit 206 outputs the learning requirement specification text included in the acquired learning data D1 to the text preprocessing unit 203. The learning data acquisition unit 206 also outputs the classification information included in the acquired learning data D1 to the requirement specification classification learning unit 208.

テキスト前処理部２０３は、学習用データ取得部２０６から出力された学習用の要求仕様テキストを取得する。そして、テキスト前処理部２０３は、読み出した前処理ルールＲ１に基づいて、取得した学習用の要求仕様テキストに対して前処理（例えば自然言語処理）を実行する（ステップＳ２１３）。 The text pre-processing unit 203 acquires the training requirement specification text output from the training data acquisition unit 206. Then, the text pre-processing unit 203 performs pre-processing (e.g., natural language processing) on the acquired training requirement specification text based on the read pre-processing rule R1 (step S213).

単語ベクトル変換部２０７は、テキスト前処理部２０３から出力された、前処理がなされた学習用の要求仕様テキストを取得する。また、単語ベクトル変換部２０７は、学習済みモデル記憶部２０５に記憶された単語ベクトルモデルＭ１を読み出す。単語ベクトル変換部２０７は、取得した学習用の要求仕様テキストを、読み出した単語ベクトルモデルＭ１に入力することによって、学習用の要求仕様テキストを単語ベクトルに変換する（ステップＳ２１４）。 The word vector conversion unit 207 acquires the preprocessed training requirement specification text output from the text preprocessing unit 203. The word vector conversion unit 207 also reads out the word vector model M1 stored in the trained model storage unit 205. The word vector conversion unit 207 inputs the acquired training requirement specification text into the read-out word vector model M1, thereby converting the training requirement specification text into a word vector (step S214).

要求仕様分類学習部２０８は、学習用データ取得部２０６から出力された、学習用データＤ１に含まれる分類情報を取得する。また、要求仕様分類学習部２０８は、単語ベクトル変換部２０７から出力された、単語ベクトルに変換された学習用の要求仕様テキストを取得する。そして、要求仕様分類学習部２０８は、取得した単語ベクトルに変換された学習用の要求仕様テキストと、取得した分類情報と、を対応付けたデータを学習用入力データとして保持する（ステップＳ２１５）。 The requirements classification learning unit 208 acquires the classification information contained in the learning data D1 output from the learning data acquisition unit 206. The requirements classification learning unit 208 also acquires the learning requirements text converted into word vectors output from the word vector conversion unit 207. The requirements classification learning unit 208 then stores data that associates the learning requirements text converted into the acquired word vectors with the acquired classification information as learning input data (step S215).

全ての学習用データに対して学習用入力データの作成処理が完了していない場合（すなわち、学習用入力データに追加されていない学習用データが存在する場合）（ステップＳ２１６・ＮＯ）、学習装置２０は、引き続き上記の学習用入力データの作成処理を繰り返す（ステップＳ２１２～ステップＳ２１５）。全ての学習用データに対して学習用入力データの作成処理が完了した場合（ステップＳ２１６・ＹＥＳ）、要求仕様分類学習部２０８は、学習用入力データを教師データとして機械学習を実行することにより、学習済みの学習モデルである要求仕様分類モデルＭ２を生成し、学習済みモデル記憶部２０５に記憶させる（ステップＳ２１７）。学習済みモデル出力部２０９は、学習済みモデル記憶部２０５に記憶された単語ベクトルモデルＭ１と要求仕様分類モデルＭ２とを、分類支援装置３０へ出力する（ステップＳ２１８）。以上で、図９のフローチャートが示す、要求仕様分類モデルの生成における学習装置２０の動作が終了する。 If the process of creating learning input data has not been completed for all learning data (i.e., if there is learning data that has not been added to the learning input data) (step S216, NO), the learning device 20 continues to repeat the above-mentioned process of creating learning input data (steps S212 to S215). If the process of creating learning input data has been completed for all learning data (step S216, YES), the requirements specification classification learning unit 208 performs machine learning using the learning input data as training data to generate a requirements specification classification model M2, which is a trained learning model, and stores it in the trained model storage unit 205 (step S217). The trained model output unit 209 outputs the word vector model M1 and requirements specification classification model M2 stored in the trained model storage unit 205 to the classification assistance device 30 (step S218). This completes the operation of the learning device 20 in generating a requirements specification classification model, as shown in the flowchart in Figure 9.

以下、分類支援装置３０の動作の一例について説明する。 An example of the operation of the classification support device 30 will now be described.
図１０は、分類支援装置３０の動作を示すフローチャートである。まず、学習済みモデル取得部３０１は、学習装置２０から出力された、単語ベクトルモデルＭ１と要求仕様分類モデルＭ２とを取得し、学習済みモデル記憶部３０２に記憶させる（ステップＳ３０１）。また、テキスト前処理部３０５は、前処理ルール記憶部３０４に記憶された前処理ルールＲ１を読み出す（ステップＳ３０２）。 FIG. 10 is a flowchart showing the operation of the classification support device 30. First, the trained model acquisition unit 301 acquires the word vector model M1 and the requirement specification classification model M2 output from the learning device 20 and stores them in the trained model storage unit 302 (step S301). In addition, the text preprocessing unit 305 reads out the preprocessing rule R1 stored in the preprocessing rule storage unit 304 (step S302).

テキスト前処理部３０５は、要求仕様テキスト取得部３０３から出力された要求仕様テキストを取得する（ステップＳ３０３）。そして、テキスト前処理部３０５は、読み出した前処理ルールＲ１に基づいて、取得した要求仕様テキストに対して前処理（例えば自然言語処理）を行う（ステップＳ３０４）。 The text pre-processing unit 305 acquires the requirement specification text output from the requirement specification text acquisition unit 303 (step S303). Then, the text pre-processing unit 305 performs pre-processing (e.g., natural language processing) on the acquired requirement specification text based on the read pre-processing rule R1 (step S304).

単語ベクトル変換部３０６は、テキスト前処理部３０５から出力された、前処理がなされた要求仕様テキストを取得する。また、単語ベクトル変換部３０６は、学習済みモデル記憶部３０２に記憶された単語ベクトルモデルＭ１を読み出す。単語ベクトル変換部３０６は、取得した要求仕様テキストを、読み出した単語ベクトルモデルＭ１に入力することによって、要求仕様テキストを単語ベクトルに変換する（ステップＳ３０５）。 The word vector conversion unit 306 acquires the preprocessed requirement specification text output from the text preprocessing unit 305. The word vector conversion unit 306 also reads out the word vector model M1 stored in the trained model storage unit 302. The word vector conversion unit 306 converts the requirement specification text into a word vector by inputting the acquired requirement specification text into the read word vector model M1 (step S305).

要求仕様分類部３０７は、単語ベクトル変換部３０６から出力された、単語ベクトルに変換された要求仕様テキストを取得する。また、要求仕様分類部３０７は、学習済みモデル記憶部３０２に記憶された要求仕様分類モデルＭ２を読み出す。そして、要求仕様分類部３０７は、取得した単語ベクトルに変換された要求仕様テキストを、読み出した要求仕様分類モデルＭ２に入力することによって、要求仕様分類を実行する（ステップＳ３０６）。これにより、要求仕様分類部３０７は、要求仕様テキストの分類結果に関する情報を得る。 The requirement specification classification unit 307 acquires the requirement specification text converted into word vectors output from the word vector conversion unit 306. The requirement specification classification unit 307 also reads the requirement specification classification model M2 stored in the trained model storage unit 302. The requirement specification classification unit 307 then inputs the acquired requirement specification text converted into word vectors into the read requirement specification classification model M2, thereby performing requirement specification classification (step S306). As a result, the requirement specification classification unit 307 obtains information regarding the classification results of the requirement specification text.

分類結果解析部３０８は、要求仕様分類部３０７から出力された、要求仕様テキストの分類結果に関する情報を取得する。分類結果解析部３０８は、分類結果に関する情報を解析することにより、要求仕様テキストの分類結果を示す情報を生成する（ステップＳ３０７）。分類結果出力部３０９は、分類結果解析部３０８から出力された分類結果を示す情報を取得する。分類結果出力部３０９は、取得した分類結果を示す情報を外部の装置へ出力する（ステップＳ３０８）。 The classification result analysis unit 308 acquires information related to the classification results of the requirement specification text output from the requirement specification classification unit 307. The classification result analysis unit 308 analyzes the information related to the classification results to generate information indicating the classification results of the requirement specification text (step S307). The classification result output unit 309 acquires information indicating the classification results output from the classification result analysis unit 308. The classification result output unit 309 outputs the acquired information indicating the classification results to an external device (step S308).

全ての要求仕様テキストに対する要求仕様分類が完了していない場合（すなわち、要求仕様分類がなされていない要求仕様テキストが存在する場合）（ステップＳ３０９・ＮＯ）、分類支援装置３０は、引き続き上記の要求仕様分類処理を繰り返す（ステップＳ３０３～ステップＳ３０８）。全ての要求仕様テキストに対する要求仕様分類が完了が完了した場合（ステップＳ３０９・ＹＥＳ）、図１０のフローチャートが示す、分類支援装置３０の動作が終了する。 If requirement specification classification has not been completed for all requirement specification texts (i.e., if there is requirement specification text that has not been classified) (step S309, NO), the classification support device 30 continues to repeat the requirement specification classification process (steps S303 to S308). If requirement specification classification has been completed for all requirement specification texts (step S309, YES), the operation of the classification support device 30 shown in the flowchart in Figure 10 ends.
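
The per-text loop of steps S303 to S309 can be sketched as a pipeline of four pluggable stages. All names here (`run_classification` and its parameters) are hypothetical, and the stage functions passed in at the bottom are trivial stand-ins rather than the trained models M1 and M2.

```python
def run_classification(texts, preprocess, to_vector, model, analyze):
    """Classify each requirement text in turn and collect the results."""
    results = []
    for text in texts:                 # repeats until S309 is YES
        tokens = preprocess(text)      # S304: apply preprocessing rules
        vector = to_vector(tokens)     # S305: word vector conversion (model M1)
        probs = model(vector)          # S306: classification (model M2)
        decision = analyze(probs)      # S307: analyze the probabilities
        results.append((text, decision))  # S308: output the result
    return results

# Trivial stand-in stages, for illustration only.
results = run_classification(
    ["fast response", "search by name"],
    preprocess=str.split,
    to_vector=lambda tokens: [len(tokens)],
    model=lambda vec: {"performance": 0.8, "search": 0.2} if vec[0] == 2
                      else {"performance": 0.1, "search": 0.9},
    analyze=lambda probs: max(probs, key=probs.get),
)
print(results)
```

Passing the stages as parameters reflects the separation of units 305 to 308 in the description: each stage can be replaced without changing the loop.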

According to at least one embodiment described above, the classification support system has a requirements specification classification learning unit 208 (classification learning unit), a requirements specification classification unit 307 (classification unit), and a classification result output unit 309. The requirements specification classification learning unit 208 acquires learning data in which requirements specification text for learning (learning information), based on character strings contained in a requirements specification document for learning (semi-structured learning document), is associated with classification information that identifies the classification of the content of that text. It performs machine learning using as input the classification information and the requirements specification text for learning converted into word vectors (learning input information), and stores the requirements specification classification model M2 (trained model), which represents the trained learning model, in the trained model storage unit 205 (storage unit). The requirements specification classification unit 307 acquires requirements specification text (classification information) based on character strings contained in the requirements specification document to be classified (semi-structured document), and inputs that text, converted into word vectors (classification input information), into the requirements specification classification model M2 to acquire classification result related information on the classification of the content of the requirements specification text (character strings). The classification result output unit 309 outputs classification result information based on the classification result related information. This makes it possible to classify information contained in semi-structured documents more efficiently without compromising work quality.
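The flow summarized above (convert text into word vectors, learn from labeled text, then obtain a probability for each classification) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the two-dimensional vectors in `WORD_VECTORS` are invented, and a nearest-centroid rule with a softmax is swapped in for the requirements specification classification model M2, whose actual structure is not reproduced here.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical 2-dimensional word vectors. In the embodiment these would
# come from a trained word vector model, not from a hand-written table.
WORD_VECTORS: Dict[str, List[float]] = {
    "login": [1.0, 0.0],
    "export": [0.9, 0.1],
    "search": [0.8, 0.2],
    "latency": [0.0, 1.0],
    "uptime": [0.1, 0.9],
    "backup": [0.2, 0.8],
}


def to_vector(text: str) -> List[float]:
    """Average the vectors of the known words in a text."""
    vecs = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vecs:
        return [0.0, 0.0]
    return [sum(column) / len(vecs) for column in zip(*vecs)]


def train(learning_data: List[Tuple[str, str]]) -> Dict[str, List[float]]:
    """Learn one centroid per classification (stand-in for model M2)."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for text, label in learning_data:
        vec = to_vector(text)
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def classify(model: Dict[str, List[float]], text: str) -> Dict[str, float]:
    """Return a probability per classification (softmax over negative squared distance)."""
    vec = to_vector(text)
    scores = {
        label: -sum((a - b) ** 2 for a, b in zip(vec, centroid))
        for label, centroid in model.items()
    }
    peak = max(scores.values())
    exps = {label: math.exp(score - peak) for label, score in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


learning_data = [
    ("login export", "functional"),
    ("search login", "functional"),
    ("latency uptime", "non-functional"),
    ("backup uptime", "non-functional"),
]
model = train(learning_data)
probabilities = classify(model, "export search")
best = max(probabilities, key=probabilities.get)
```

The dictionary returned by `classify` plays the role of the classification result related information: one probability value per classification, from which a result can then be selected.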

Some or all of the learning data creation device 10, the learning device 20, and the classification support device 30 in the embodiments described above may be implemented by a computer. In that case, a program for realizing their control functions may be recorded on a computer-readable recording medium, and the functions may be realized by loading the program recorded on that medium into a computer system and executing it. The "computer system" referred to here is a computer system built into the learning data creation device 10, the learning device 20, or the classification support device 30, and includes an OS and hardware such as peripheral devices. The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into the computer system. The "computer-readable recording medium" may also include a medium that holds the program dynamically for a short time, such as a communication line used when the program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and a medium that holds the program for a certain period, such as volatile memory inside the computer system serving as the server or client in that case. The program may realize only some of the functions described above, or may realize those functions in combination with a program already recorded in the computer system.

The learning data creation device 10, the learning device 20, and the classification support device 30 in the embodiments described above may also be realized as integrated circuits such as LSI (Large Scale Integration) circuits. The functional blocks of the learning data creation device 10, the learning device 20, and the classification support device 30 may each be implemented as an individual processor, or some or all of them may be integrated into a single processor. The method of circuit integration is not limited to LSI; dedicated circuits or general-purpose processors may also be used. Furthermore, if advances in semiconductor technology produce an integrated circuit technology that replaces LSI, an integrated circuit based on that technology may be used.

While several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These embodiments can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the spirit of the invention. These embodiments and their modifications fall within the scope and spirit of the invention and, likewise, within the invention described in the claims and its equivalents.

1... classification support system, 10... learning data creation device, 20... learning device, 30... classification support device, 101... learning requirements specification text acquisition unit, 102... requirements specification classification table storage unit, 103... learning data generation unit, 104... operation input unit, 105... learning data storage unit, 106... learning data output unit, 201... corpus acquisition unit, 202... preprocessing rule storage unit, 203... text preprocessing unit, 204... word vector learning unit, 205... trained model storage unit, 206... learning data acquisition unit, 207... word vector conversion unit, 208... requirements specification classification learning unit, 209... trained model output unit, 301... trained model acquisition unit, 302... trained model storage unit, 303... requirements specification text acquisition unit, 304... preprocessing rule storage unit, 305... text preprocessing unit, 306... word vector conversion unit, 307... requirements specification classification unit, 308... classification result analysis unit, 309... classification result output unit

Claims

1. A classification support system comprising:
a classification learning unit that acquires learning data in which learning information, based on character strings contained in semi-structured learning documents that include at least text classifiable into functional requirements, which indicate requirements related to functions that must be installed in a system, and non-functional requirements, which indicate requirements other than the functional requirements, is associated with classification information that identifies the classification of the content of the character strings, performs machine learning by inputting the classification information and learning input information based on the learning information, and stores a trained model representing the trained learning model in a storage unit;
a classification unit that acquires classification information based on character strings contained in a semi-structured document that includes at least text classifiable into the functional requirements and the non-functional requirements, and inputs classification input information based on the classification information into the trained model to classify the contents of the character strings into the functional requirements and the non-functional requirements, thereby acquiring classification result related information that indicates a probability value for each classification;
an analysis unit that generates classification result information indicating the classification result based on the classification result related information and a predetermined threshold; and
a classification result output unit that outputs the classification result information,
wherein the analysis unit updates the value of the threshold based on a result of analyzing the classification result related information.
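As an illustration only, the behavior of an analysis unit that compares probability values with a threshold and then updates that threshold might be sketched in Python as below. The claim does not state how the threshold is updated, so the rule used here (lower the threshold when no classification is selected, raise it when more than one is selected) and the names `Analyzer`, `step`, and `analyze` are purely hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Analyzer:
    threshold: float = 0.5
    step: float = 0.05  # hypothetical adjustment step, not from the source

    def analyze(self, probabilities: Dict[str, float]) -> List[str]:
        """Return every classification whose probability is at or above the threshold."""
        selected = sorted(
            label for label, p in probabilities.items() if p >= self.threshold
        )
        self._update(len(selected))
        return selected

    def _update(self, selected_count: int) -> None:
        # Hypothetical update rule chosen for this sketch:
        # nothing selected means the threshold was too strict, so lower it;
        # several selected means it was too loose, so raise it.
        if selected_count == 0:
            self.threshold = max(0.0, self.threshold - self.step)
        elif selected_count > 1:
            self.threshold = min(1.0, self.threshold + self.step)


analyzer = Analyzer(threshold=0.5)
sample = {"functional": 0.48, "non-functional": 0.40, "constraint": 0.12}
first = analyzer.analyze(sample)   # nothing reaches 0.5, so the threshold is lowered
second = analyzer.analyze(sample)  # "functional" now reaches the lowered threshold
```

Returning a list rather than a single label also accommodates the case in which several classifications reach the threshold at once.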

2. The classification support system according to claim 1, further comprising:
a classification storage unit that stores a classification table listing classification information that identifies the classification of the content of character strings contained in semi-structured documents; and
a learning data generation unit that acquires character strings contained in the semi-structured learning documents and generates the learning data, in which the character strings are associated with a specific classification selected, based on an operation input by a user, from the classification information included in the classification table.

3. The classification support system according to claim 1, further comprising a word vector conversion unit that acquires the learning input information, which represents the learning information converted into word vectors, by inputting the learning information into a word vector model that has undergone machine learning using as input at least one of character strings contained in a corpus and character strings contained in the semi-structured learning documents.

4. The classification support system according to claim 3, further comprising a word vector learning unit that performs machine learning using as input at least one of character strings contained in the corpus and character strings contained in the semi-structured learning documents, and stores the word vector model, which represents the trained learning model, in the storage unit.

5. The classification support system according to claim 4, wherein the word vector conversion unit acquires the classification input information, which represents the classification information converted into word vectors, by inputting the classification information into the word vector model.

6. The classification support system according to claim 1, wherein the classification result information is information indicating a classification selected based on the probability value.

7. The classification support system according to claim 6, wherein the classification result information is information indicating a plurality of classifications for which the probability value is equal to or greater than the threshold.

8. The classification support system according to any one of claims 1 to 7, wherein the trained model is a neural network learning model that has undergone machine learning so as to output the classification result related information in response to input of the classification input information.

9. The classification support system according to claim 5, wherein the word vector model is a learning model that has undergone machine learning so as to output the classification input information in response to input of the classification information.

10. The classification support system according to claim 1, wherein the semi-structured learning documents and the semi-structured document further include text classifiable into constraints, which indicate matters related to restrictions during development and operation of the system.

11. A classification support device comprising:
a classification unit that acquires classification information based on character strings contained in a semi-structured document that includes at least text classifiable into functional requirements, which indicate requirements related to functions that must be installed in a system, and non-functional requirements, which indicate requirements other than the functional requirements, and inputs classification input information based on the classification information into a trained model to classify the contents of the character strings into the functional requirements and the non-functional requirements, thereby acquiring classification result related information that indicates a probability value for each classification;
an analysis unit that generates classification result information indicating the classification result based on the classification result related information and a predetermined threshold; and
a classification result output unit that outputs the classification result information,
wherein the analysis unit updates the value of the threshold based on a result of analyzing the classification result related information.

12. A classification support method comprising:
a classification learning step in which a computer acquires learning data in which learning information, based on character strings contained in semi-structured learning documents that include at least text classifiable into functional requirements, which indicate requirements related to functions that must be installed in a system, and non-functional requirements, which indicate requirements other than the functional requirements, is associated with classification information that identifies the classification of the content of the character strings, performs machine learning by inputting the classification information and learning input information based on the learning information, and stores a trained model representing the trained learning model in a storage unit;
a classification step in which the computer acquires classification information based on character strings contained in a semi-structured document that includes at least text classifiable into the functional requirements and the non-functional requirements, and inputs classification input information based on the classification information into the trained model to acquire classification result related information that indicates a probability value for each classification, as information on the result of classifying the contents of the character strings into the functional requirements and the non-functional requirements;
an analysis step of generating classification result information indicating the classification result based on the classification result related information and a predetermined threshold;
a classification result output step of outputting the classification result information; and
an updating step of updating the value of the threshold based on a result of analyzing the classification result related information.

13. A program for causing a computer to execute:
a classification learning step of acquiring learning data in which learning information, based on character strings contained in semi-structured learning documents that include at least text classifiable into functional requirements, which indicate requirements related to functions that must be installed in a system, and non-functional requirements, which indicate requirements other than the functional requirements, is associated with classification information that identifies the classification of the content of the character strings, performing machine learning by inputting the classification information and learning input information based on the learning information, and storing a trained model representing the trained learning model in a storage unit;
a classification step of acquiring classification information based on character strings contained in a semi-structured document that includes at least text classifiable into the functional requirements and the non-functional requirements, and inputting classification input information based on the classification information into the trained model to acquire classification result related information that indicates a probability value for each classification, as information on the result of classifying the contents of the character strings into the functional requirements and the non-functional requirements;
an analysis step of generating classification result information indicating the classification result based on the classification result related information and a predetermined threshold;
a classification result output step of outputting the classification result information; and
an updating step of updating the value of the threshold based on a result of analyzing the classification result related information.