KR101793185B1

KR101793185B1 - Method for identifying patient personal information

Info

Publication number: KR101793185B1
Application number: KR1020150014853A
Authority: KR
Inventors: 황보택근; 은성종; 김재승
Original assignee: 가천대학교 산학협력단
Priority date: 2015-01-30
Filing date: 2015-01-30
Publication date: 2017-11-02
Anticipated expiration: 2035-01-30
Also published as: KR20160093922A

Abstract

The method for identifying patient personal information according to an embodiment of the present invention updates, through a learning step, the identification case priority used in the pattern analysis step. As a result, the pattern analysis step can be performed faster and more accurately, which in turn yields accurate matching results.

Description

Method for identifying patient personal information

The present invention relates to a method for identifying patient personal information, and more particularly to a method that identifies a patient using the patient's basic information so that the patient's medical information can be exchanged between hospitals.

Conventionally, each hospital operates its own hospital information system, which has considerably improved the efficiency of medical services within the hospital. However, because the information collected and the formats used differ from one hospital's information system to another, exchanging medical information between hospitals is difficult. As a result, a patient who seeks treatment for the same disease at a different hospital must undergo examination and diagnosis again from the beginning, which wastes both time and money.

In developed countries such as the United States, MPI (Master Patient Index) algorithms have been researched and developed. They compare and analyze patient information across different hospitals so that an individual patient can be identified when patient information is used between hospitals. This raises the quality of hospital care while reducing wasted time and money.

Korea has unique personal identification numbers such as the resident registration number. However, because of personal information protection, which has recently become a worldwide issue, and Korea's personal information laws, such unique identification numbers cannot be used to identify patients.

In addition, the algorithms of currently published MPI systems were developed for English-language environments that use 1-byte characters, so they are difficult to apply directly to the Korean environment, which uses 2-byte characters that are also composed (combined) characters.

To address these problems, a patient personal information identification program suited to the Korean situation is needed, one that identifies an individual patient by matching information other than a unique identification number such as the resident registration number.

Patent Document 1: Laid-Open Patent Publication No. 10-2009-0126241

The present invention has been devised to solve the above problems. Its object is to provide a method for identifying patient personal information that identifies a patient using information other than a unique identification number such as the resident registration number, so that the patient's medical information can be exchanged between hospitals.

A method for identifying patient personal information according to an embodiment of the present invention includes: receiving, by a module unit, patient identification items through an input function unit serving as an input API capable of handling various forms of input (S210); preprocessing, by the module unit, the patient identification items according to a normalization rule DB of a database unit (S220); a pattern analysis step in which the module unit checks and analyzes the input type of the patient identification items and selects, from a master table of the database unit, the MPI identification case group that fits the identification case with the highest matching rate (S230); performing, by the module unit, exact matching on the MPI identification case group (S240); determining, by the module unit, whether the patient identification information for identifying the individual patient is unique (S250); performing, by the module unit, probabilistic matching on the MPI identification case group when the determination step (S250) finds no unique patient identification information (S260); outputting the matching result information of the exact matching step (S240) and of the probabilistic matching step (S260), and newly generating and outputting a unique identification code (S270); and performing a learning process in which the module unit learns from the identification case information finally selected to identify the patient and from the information of the master table, in which newly generated patient information accumulates, and updates the weights of the priority determination rule for the MPI identification cases (S280).

In the method for identifying patient personal information according to an embodiment of the present invention, the patient identification items received in the receiving step (S210) include name, date of birth, sex, address, mobile phone number, landline telephone number, e-mail address, insurance type, primary department, and occurrence date.

In the method for identifying patient personal information according to an embodiment of the present invention, the preprocessing step (S220) includes: loading the data entered in the receiving step (S210) (S221); loading, from a preprocessing rule DB, the preprocessing rules that apply to the loaded input data (S222); checking the data type and length of each item of the loaded input data (S223); normalizing the data whose type and length have been checked into a preset regular-expression format according to the preprocessing rules loaded from the preprocessing rule DB (S224); and storing the normalized data as preprocessing result information (S225).

In the method for identifying patient personal information according to an embodiment of the present invention, the normalizing step (S224) uses the Unicode representation of Hangul to separate each syllable into its initial consonant, medial vowel, and final consonant, computes an index for each, and normalizes the data accordingly.
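The separation into initial, medial, and final indices can be sketched with the standard Unicode arithmetic for precomposed Hangul syllables (U+AC00 to U+D7A3, arranged as 19 initials by 21 medials by 28 finals). The function names below are illustrative and do not come from the patent, which does not specify how the resulting indices are used beyond normalization.

```python
HANGUL_BASE = 0xAC00   # code point of the first precomposed syllable
HANGUL_LAST = 0xD7A3   # code point of the last precomposed syllable
NUM_MEDIALS = 21       # medial vowels per initial consonant
NUM_FINALS = 28        # 27 final consonants plus the "no final" slot


def decompose_syllable(ch):
    """Return (initial, medial, final) indices, or None if ch is not a Hangul syllable."""
    code = ord(ch)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        return None
    offset = code - HANGUL_BASE
    initial = offset // (NUM_MEDIALS * NUM_FINALS)
    medial = (offset % (NUM_MEDIALS * NUM_FINALS)) // NUM_FINALS
    final = offset % NUM_FINALS  # 0 means the syllable has no final consonant
    return initial, medial, final


def decompose_name(name):
    """Decompose every character of a name; non-Hangul characters map to None."""
    return [decompose_syllable(ch) for ch in name]
```

Comparing names index by index, rather than as whole code points, is one plausible way to let two syllables that differ in a single component count as a partial match.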

In the method for identifying patient personal information according to an embodiment of the present invention, the pattern analysis step (S230) includes: checking whether the input items normalized in the normalizing step (S224) contain data (S231); if the normalized input items contain data, checking the input type of that data (S232); accessing an identification case priority DB of the database unit and loading the identification case priorities (S233); selecting the identification case that corresponds to the confirmed input type and has the highest priority (S234); and storing the sequence number of that identification case and its matching rate threshold (S235).
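A minimal sketch of steps S231 to S235 follows. It assumes that an identification case is a set of required items with a priority and a threshold, and that a smaller priority value means a higher priority; the `IdentificationCase` class and its field names are hypothetical, since the patent does not describe the schema of the identification case priority DB.

```python
from dataclasses import dataclass


@dataclass
class IdentificationCase:
    case_no: int       # sequence number of the identification case
    fields: tuple      # items that must all be present in the input
    priority: int      # assumption: smaller value means higher priority
    threshold: float   # matching rate threshold stored with the case


def select_case(record, cases):
    """Pick the highest-priority case whose items are all present in the record."""
    # S231-S232: determine which normalized items actually carry data
    present = {key for key, value in record.items() if value not in (None, "")}
    # S233: `cases` stands in for the rows loaded from the priority DB
    candidates = [case for case in cases if set(case.fields) <= present]
    if not candidates:
        return None
    # S234: choose the applicable case with the highest priority
    best = min(candidates, key=lambda case: case.priority)
    # S235: keep its sequence number and matching rate threshold
    return best.case_no, best.threshold
```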

In the method for identifying patient personal information according to an embodiment of the present invention, the exact matching step (S240) includes: connecting, by the module unit, to the master table of the database unit and starting exact matching (S241); loading, by the module unit connected to the master table, the identification case selected in the pattern analysis step (S230) and its threshold (S242); loading the input data preprocessed in the preprocessing step (S220) (S243); loading a matching algorithm (S244); comparing the occurrence date of the input data of the patient identification items with the application date of each target record, and selecting as targets the data dated before the occurrence date (S245); calculating matching rates in the order of the identification case items using the matching algorithm (S246); and comparing the matching rates so calculated with the preset matching rate threshold to determine the unique identification information (S247).
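The date filter and threshold comparison in S245 to S247 can be sketched as below. Several points are assumptions rather than details from the patent: the matching algorithm is replaced by a plain equality check, the overall matching rate is the mean of the per-item rates, records applied on the occurrence date itself are kept, and the key names (`applied_on`, `occurred_on`, `mrn`) are hypothetical.

```python
from datetime import date


def field_match_rate(a, b):
    """Placeholder matching algorithm: 1.0 when the values are equal, else 0.0."""
    return 1.0 if a == b else 0.0


def exact_match(query, master_rows, case_fields, threshold):
    """Return (mrn, rate) pairs for master rows whose rate meets the threshold."""
    hits = []
    for row in master_rows:
        # S245: skip records whose application date is after the occurrence date
        if row["applied_on"] > query["occurred_on"]:
            continue
        # S246: compute the rate item by item, in the order of the case items
        rates = [field_match_rate(query[f], row[f]) for f in case_fields]
        rate = sum(rates) / len(rates)
        # S247: compare against the matching rate threshold
        if rate >= threshold:
            hits.append((row["mrn"], rate))
    return hits
```

Exactly one returned pair corresponds to a unique identification; zero or several pairs would hand control to the later steps.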

In the method for identifying patient personal information according to an embodiment of the present invention, the probabilistic matching step (S260) includes: loading the identification case priority previously selected in the pattern analysis step (S230) (S261); accessing the identification case priority DB of the database unit and fetching the identification case of the next priority, which is used when exact matching fails to identify the patient (S262); comparing the occurrence date of the input data entered in the receiving step (S210) with the application date, and taking as targets the records whose application date precedes the occurrence date (S263); calculating matching rates in the order of the identification case items using the matching algorithm (S264); and comparing the matching rates so calculated with the set matching rate threshold to determine the unique identification information (S265).

In the method for identifying patient personal information according to an embodiment of the present invention, the step of generating and outputting the unique identification code (S270) includes: loading the matching results produced by the exact matching step (S240) and the probabilistic matching step (S260) (S271); and classifying the loaded matching results into three cases, namely a single unique match, two or more matches, and no match at all (S272).

In the method for identifying patient personal information according to an embodiment of the present invention, when the classification step (S272) finds that the loaded matching result is a unique match in the master table, the retrieved patient data is displayed and output together with its MRN number.

In the method for identifying patient personal information according to an embodiment of the present invention, when the classification step (S272) finds two or more matching results, the method further includes: displaying the two or more matching results on a screen (S274-2); and receiving the user's selection among the displayed matching results (S275-2).

In the method for identifying patient personal information according to an embodiment of the present invention, when the classification step (S272) finds no matching result at all, the method further includes: determining, according to user input, whether to generate a new MRN number (S274-3); generating a new MRN number according to the user input (S275-3); and additionally storing information related to the generated MRN number in the master table and outputting it (S276-3).
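The three outcomes distinguished in S272 can be sketched as a single dispatch function. User interaction is modeled with two callbacks, and the format of a new MRN number (`MRN-` plus eight hex digits) is purely hypothetical, since the patent does not say how the number is generated.

```python
import uuid


def resolve(matches, master_table, patient, choose, confirm_new):
    """Dispatch on the number of matches: unique, multiple, or none."""
    if len(matches) == 1:
        # Unique match: output it as is
        return matches[0]
    if len(matches) >= 2:
        # S274-2 and S275-2: show all candidates and let the user choose
        return choose(matches)
    # S274-3: no match, so ask whether a new MRN number should be created
    if confirm_new():
        mrn = "MRN-" + uuid.uuid4().hex[:8]            # S275-3 (hypothetical format)
        master_table.append({**patient, "mrn": mrn})   # S276-3
        return mrn
    return None
```

In an interactive system `choose` and `confirm_new` would be backed by a screen prompt; passing them as arguments keeps the decision logic testable without a user interface.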

In the method for identifying patient personal information according to an embodiment of the present invention, the learning step (S280) includes: loading the matching result selected to identify the patient (S281); connecting to the master table, counting the successfully identified cases, and recalculating the identification case priorities (S283); and updating the identification case priority DB of the database unit according to the recalculated priorities (S284).
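One simple reading of S283 is to rank identification cases by how often each has led to a successful identification. The sketch below does exactly that; the patent speaks of updating the weights of a priority determination rule without giving a formula, so ranking by raw success count is an assumption, as is the `success_log` input.

```python
from collections import Counter


def recompute_priorities(success_log):
    """Map each case number to a new priority rank (1 = highest).

    success_log is an iterable of case numbers, one entry per successful
    identification recorded in the master table (hypothetical layout).
    """
    counts = Counter(success_log)
    # More successes first; ties broken by the smaller case number
    ranked = sorted(counts, key=lambda case_no: (-counts[case_no], case_no))
    return {case_no: rank for rank, case_no in enumerate(ranked, start=1)}
```

The returned mapping is what step S284 would write back to the identification case priority DB.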

The features and advantages of the present invention will become more apparent from the following detailed description based on the accompanying drawings.

Before that, the terms and words used in this specification and the claims should not be interpreted in their ordinary or dictionary meanings. Based on the principle that an inventor may appropriately define the concept of a term in order to describe his or her own invention in the best way, they should be interpreted with meanings and concepts consistent with the technical idea of the present invention.

FIG. 1 is a block diagram of an apparatus that performs the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 2 is an overall flowchart of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating the preprocessing step of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating the pattern analysis step of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating the exact matching step of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating the probabilistic matching step of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating the output step of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 8 is a flowchart illustrating the learning step of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 9 is a graph showing the accuracy of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 10 is a graph showing the sensitivity of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 11 is a graph showing the specificity of the method for identifying patient personal information according to an embodiment of the present invention.
FIG. 12 is a graph showing the processing speed of the method for identifying patient personal information according to an embodiment of the present invention.

The objects, particular advantages, and novel features of the present invention will become more apparent from the following detailed description and preferred embodiments taken in conjunction with the accompanying drawings. Note that, in assigning reference numerals to the elements of each drawing, the same elements are given the same numerals wherever possible, even when they appear in different drawings. Terms such as first and second may be used to describe various elements, but the elements should not be limited by these terms, which are used only to distinguish one element from another. In describing the present invention, a detailed description of related known art is omitted where it could unnecessarily obscure the gist of the invention.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. FIG. 1 is a block diagram of an apparatus that performs the method for identifying patient personal information according to an embodiment of the present invention.

The apparatus 100 that performs the method for identifying patient personal information according to an embodiment of the present invention broadly includes a module unit 110 and a database unit 120.

Specifically, as shown in FIG. 1, the module unit 110 includes: an input function unit that receives patient identification item information; a preprocessing function unit that compares the entered patient identification item information against the preprocessing rule DB; a pattern analysis function unit that operates on identification case priorities; an exact matching function unit that performs personal identification by comparison with the information in the master table; a probabilistic matching function unit that raises the identification rate for similar results after exact matching; an output function unit that outputs the final identification result; and a learning function unit that performs a learning function to improve accuracy as identified data accumulates.

The database unit 120 consists of a master table that stores and manages personal unique identification codes, a preprocessing rule DB (database) that stores the preprocessing rules for normalizing input data, and an identification case priority DB that stores the identification case priority rules used to set up groups of identifiable items.

The input function unit of the module unit 110 can receive, through an input API (Application Programming Interface) capable of handling various forms of input, patient identification items such as name, date of birth, sex, address, mobile phone number, landline telephone number, e-mail address, insurance type, primary department, and occurrence date. Here, the insurance type means the patient's insurance qualification code (for example, employee-insured, locally insured, or medical aid), and the occurrence date means the date on which the data of the input items was actually created (if no occurrence date is entered, the date on which the data is entered through the input function unit).

The preprocessing function unit of the module unit 110 can normalize the received patient identification items by formatting and correcting each item according to the normalization rules.

The pattern analysis function unit of the module unit 110 can check the input type of the patient identification items and select, from the identification case priority DB of the database unit 120, the identification case that fits that input type and has the highest priority.

Based on the items of the identification case selected by the pattern analysis function unit, the exact matching function unit of the module unit 110 can match the data entered through the input function unit against the data stored in the master table and select the patient identification information whose matching rate is greater than or equal to the exact matching rate threshold.

When the exact matching function unit fails to select a unique identification record, the probabilistic matching function unit of the module unit 110 can add the matching items of the identification case with the next priority one at a time and repeat the matching, thereby narrowing the selection while raising the matching probability.
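The idea of adding one matching item at a time can be sketched as follows. This is an illustration under stated assumptions, not the patented algorithm: the matching rate is taken to be the fraction of compared items that agree exactly, and `base_fields`, `extra_fields`, and the record layout are hypothetical names.

```python
def probabilistic_match(query, candidates, base_fields, extra_fields, threshold):
    """Add one extra item at a time and keep rows whose rate meets the threshold."""
    fields = list(base_fields)
    hits = list(candidates)
    for extra in extra_fields:
        fields.append(extra)  # widen the comparison by one matching item
        hits = []
        for row in candidates:
            agreeing = sum(1 for f in fields if query.get(f) == row.get(f))
            if agreeing / len(fields) >= threshold:
                hits.append(row)
        if len(hits) <= 1:  # a unique record was found, or nothing is left
            break
    return hits
```

With two master records that share a name and date of birth, adding the mobile phone number as the next item is enough to separate them.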

After the matching process, the output function unit of the module unit 110 can do the following: if a single unique piece of patient identification information has been selected, it outputs that information; if two or more pieces of information exceed the matching rate threshold, it outputs all of them so that the user can judge and choose; and if no information exceeds the matching rate threshold, it newly generates a unique identification code and outputs it.

The learning function unit of the module unit 110 can learn from the identification case information finally selected to identify the patient and from the information of the master table, in which newly generated patient information accumulates, and update the weights of the priority determination rule for the identification cases.

The apparatus 100 configured in this way can determine patient identification information through a patient information identification algorithm that uses the module unit 110 and the database unit 120, and can update and thereby improve identification accuracy.

Hereinafter, the method for identifying patient personal information carried out in the apparatus 100 is described with reference to FIGS. 2 to 8. FIG. 2 is an overall flowchart of the method for identifying patient personal information according to an embodiment of the present invention, FIG. 3 is a flowchart illustrating its preprocessing step, FIG. 4 is a flowchart illustrating its pattern analysis step, FIG. 5 is a flowchart illustrating its exact matching step, FIG. 6 is a flowchart illustrating its probabilistic matching step, FIG. 7 is a flowchart illustrating its output step, and FIG. 8 is a flowchart illustrating its learning step.

As shown in FIG. 2, the method for identifying patient personal information according to an embodiment of the present invention broadly includes a data matching process, which determines the appropriate patient identification information in the master table of the database unit 120 for the information received through the input function unit of the module unit 110, and a learning process, which improves the identification accuracy of the data matching process.

First, the module unit 110 of the apparatus 100 receives patient identification items such as name, date of birth, sex, address, mobile phone number, landline telephone number, e-mail address, insurance type, primary department, and occurrence date through the input function unit, which serves as an input API capable of handling various forms of input (S210).

Here, the input function unit of the module unit 110 receives user input through the input API based on the data length defined for each patient identification item. If the occurrence date is left blank, it is set to the date on which the data is entered into the input function unit. The finally received data is then stored.
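The input handling described here, a per-item length check followed by a default for a blank occurrence date, can be sketched as below. The length limits in `MAX_LENGTHS` and the key name `occurred_on` are hypothetical; the patent says only that a data length is defined for each item.

```python
from datetime import date

# Hypothetical per-item length limits; the patent does not list actual values
MAX_LENGTHS = {"name": 40, "mobile": 11, "email": 100}


def receive_items(raw, today=None):
    """Validate item lengths and fill in a missing occurrence date."""
    items = {}
    for key, value in raw.items():
        limit = MAX_LENGTHS.get(key)
        if limit is not None and value is not None and len(str(value)) > limit:
            raise ValueError(f"{key} exceeds {limit} characters")
        items[key] = value
    # A blank occurrence date defaults to the date of entry
    if not items.get("occurred_on"):
        items["occurred_on"] = today or date.today()
    return items
```

The optional `today` argument exists only so that the default can be exercised deterministically in a test.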

The module unit 110 then performs a preprocessing process that normalizes the received patient identification items by formatting and correcting each item according to the normalization rule DB of the database unit 120 (S220).

After the preprocessing process, the module unit 110 performs a pattern analysis process that checks and analyzes the input type of the patient identification items and selects, from the master table of the database unit 120, the MPI identification case group that fits the identification case with the highest matching rate (S230).

After the pattern analysis process, the module unit 110 performs an exact matching process that analyzes, within the MPI identification case group, the patient identification information whose matching rate is greater than or equal to the exact matching rate threshold (S240).

Following this exact matching process, the module unit 110 determines whether the patient identification information for identifying the individual patient is unique (S250).

If the determination step (S250) finds no unique patient identification information, the module unit 110 performs a probabilistic matching process on the MPI identification case group (S260).

Here, the probabilistic matching process adds matching items to the MPI identification case group that the exact matching process (S240) could not resolve and re-matches, analyzing the patient identification information with high similarity.

식별 케이스에 해당하지 않는 데이터들을 추가하면서 확률 매칭률을 계산하고, 식별케이스의 확률매칭률 임계값보다 크거나 같은 경우의 환자식별 정보를 모듈부(110)의 출력 기능부를 통해 출력하며, 식별케이스의 확률매칭률 임계값보다 크거나 같은 경우가 없다면 고유식별코드를 새로 생성하여 출력한다(S270). Calculates a probability matching rate by adding data not corresponding to the identification case, outputs the patient identification information when the probability matching rate threshold value of the identification case is equal to or greater than the threshold value, through the output function unit of the module unit 110, (S270), the unique ID code is newly generated and output.

이후, 모듈부(110)는 최종적으로 환자 식별하는데 선택된 식별케이스 정보와 환자 정보가 새로 생성되어 축적된 마스터 테이블의 정보를 학습하여, MPI 식별 케이스의 우선순위 결정 룰의 가중치를 업데이트 하는 학습 과정을 수행한다(S280). Then, the module unit 110 learns the information of the master table, which is newly generated and stored in the case information and the patient information selected to finally identify the patient, and updates the weight of the priority determination rule of the MPI identification case (S280).

이와 같은 과정을 포함하는 본 발명의 실시예에 따른 환자 개인정보 식별 방법에 대해 세부적으로 각 과정을 설명한다. The process for identifying the patient personal information according to the embodiment of the present invention including the above process will be described in detail.

First, as shown in FIG. 3, the preprocessing step (S220) performs: loading the data entered in the step of receiving the patient identification items (S210) (S221); loading, from the preprocessing rule DB, the preprocessing rule that fits each loaded item (S222); checking the data type and length of each item of the loaded input data (S223); normalizing the checked data into a preset regular-expression format according to the preprocessing rules loaded from the preprocessing rule DB (S224); and storing the normalized data as preprocessing result information (S225).

Here, the data loaded in step S221 may take the form shown in [Table 1].

[Table 1]

| No. | Field | Type | Length | Description |
| --- | --- | --- | --- | --- |
| 1 | Name | varchar | 20 | No spaces, no special characters, no digits; precomposed Hangul syllables only |
| 2 | Date of birth | varchar | 8 | No spaces, no special characters; digits only; 'YYYYMMDD' |
| 3 | Gender | varchar | 1 | English letters only; M or F |
| 4 | Address | varchar | 100 | Lot-number address system; precomposed Hangul syllables only; 'xx시_xx구_xx동_상세주소' (city_district_neighborhood_detailed address) |
| 5 | Mobile phone number | varchar | 11 | No spaces; digits only |
| 6 | Home phone number | varchar | 11 | No spaces; digits only |
| 7 | E-mail | varchar | 50 | No spaces |
| 8 | Insurance type | varchar | 1 | Digits only |
| 9 | Main department | varchar | 2 | English letters only |
| 10 | Date of occurrence | varchar | 8 | No spaces, no special characters; digits only; 'YYYYMMDD' |
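The type and length check of step S223 can be sketched against the Table 1 constraints as follows. This is a minimal illustration and not the patent's implementation: the field keys (`birth_date`, `gender`, `mobile_phone`, `email`) and the regular expressions are assumptions that paraphrase the table's descriptions, and only four of the ten fields are shown.

```python
import re

# Hypothetical rule table: field key -> (maximum length, pattern).
# The patterns paraphrase the Table 1 descriptions; they are assumptions.
FIELD_RULES = {
    "birth_date": (8, r"\d{8}"),    # digits only, 'YYYYMMDD'
    "gender": (1, r"[MF]"),         # M or F
    "mobile_phone": (11, r"\d+"),   # digits only, no spaces
    "email": (50, r"\S+"),          # no spaces
}


def check_field(field: str, value: str) -> bool:
    """Return True when the value respects the length and character rules."""
    max_length, pattern = FIELD_RULES[field]
    return len(value) <= max_length and re.fullmatch(pattern, value) is not None
```

A value that fails this check would be passed to the normalization of step S224 rather than rejected outright; that routing is likewise an assumption of the sketch.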

To normalize the loaded input data into the preset regular-expression format according to the loaded preprocessing rules in the normalizing step (S224), the preprocessing rule DB may, for example, run the following functions in order on the 'name' item: a function that removes spaces, a function that removes special characters, a function that removes digits, a function that keeps only Hangul, and a function that deletes the excess when the value is longer than 20 characters.
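The rule chain for the 'name' item could look like the sketch below. The function name `normalize_name` and the specific regular expressions are assumptions; the text only lists the five operations and their order.

```python
import re

MAX_NAME_LENGTH = 20  # length limit of the name field in Table 1


def normalize_name(raw: str) -> str:
    """Apply the five name rules in the order given in the text."""
    text = re.sub(r"\s+", "", raw)                 # 1. remove spaces
    text = re.sub(r"[\W_]", "", text)              # 2. remove special characters
    text = re.sub(r"\d", "", text)                 # 3. remove digits
    text = re.sub(r"[^\uAC00-\uD7A3]", "", text)   # 4. keep precomposed Hangul only
    return text[:MAX_NAME_LENGTH]                  # 5. drop anything past 20 characters
```

Step 4 alone would subsume steps 1 to 3; they are kept separate here only to mirror the order in which the text lists the rules.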

In addition, Hangul syllables are 2-byte composite characters, which restricts how a matching algorithm can be applied to them. To solve this problem and let the matching algorithm handle Hangul as freely as 1-byte English letters, each Hangul syllable may be split into its initial (choseong), medial (jungseong), and final (jongseong) jamo and then normalized.

Because Hangul syllables occupy the Unicode range 0xAC00 to 0xD7A3 (11,172 characters), the initial, medial, and final jamo can be separated and their indices calculated from the Unicode code point by [Equation 1] below.
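[Equation 1] is not reproduced in this text. The sketch below uses the standard arithmetic decomposition of precomposed Hangul syllables (19 initials, 21 medials, and 28 finals including "no final"), which may differ in notation from the patent's equation. The names `decompose`, `CHOSEONG`, `JUNGSEONG`, and `JONGSEONG` are chosen for illustration, and the output uses compatibility jamo.

```python
HANGUL_BASE = 0xAC00   # first precomposed syllable
HANGUL_LAST = 0xD7A3   # last precomposed syllable

CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"            # 19 initials
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"        # 21 medials
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals


def decompose(text: str) -> str:
    """Split every precomposed Hangul syllable into its jamo."""
    out = []
    for ch in text:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            index = code - HANGUL_BASE
            initial = index // (21 * 28)          # index of the initial jamo
            medial = (index % (21 * 28)) // 28    # index of the medial jamo
            final = index % 28                    # index of the final jamo (0 = none)
            out.append(CHOSEONG[initial] + JUNGSEONG[medial] + JONGSEONG[final])
        else:
            out.append(ch)                        # non-Hangul characters pass through
    return "".join(out)
```

Applied to the name 홍길동, this yields a nine-jamo string, so a single wrong jamo costs one character out of nine instead of one syllable out of three.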

As shown in FIG. 4, the pattern analysis step (S230) includes: the module unit 110 checking whether data exists in the input items normalized in the normalizing step (S224) (S231); if data exists in the normalized input items, checking the input type of that data (S232); accessing the identification case priority DB and loading the identification case priorities (S233); selecting the identification case that fits the checked input type and has the highest priority (S234); and storing the sequence number of that identification case and its matching rate threshold (S235).

For example, suppose the patient's address, date of birth, and mobile phone number are entered. If, as shown in [Table 2], the identification case in the priority DB that fits these input items is the 10th case, 'date of birth, mobile phone number', with a matching rate threshold of 80%, then the sequence number 10 and the 80% threshold are stored.

[Table 2]

| Priority | Case | Matching rate threshold |
| --- | --- | --- |
| 1 | Name, date of birth | ((name length - 1) + (date of birth length - 1)) / total length of the items |
| 2 | Name, date of birth, mobile phone number | ((name length - 1) + (date of birth length - 1) + (mobile phone number length - 1)) / total length of the items |
| 3 | Name, date of birth, address | ((name length - 1) + (date of birth length - 1) + (address length -)) / total length of the items |
| 4 | Name, date of birth, home phone number | ((name length - 1) + (date of birth length - 1) + (home phone number length - 1)) / total length of the items |
| 5 | Name, date of birth, e-mail | ((name length - 1) + (date of birth length - 1) + (e-mail length - 2)) / total length of the items |
| 6 | Mobile phone number, main department | ((mobile phone number length - 1) + (2)) / total length of the items |
| ... | ... | ... |
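The case selection of steps S231 to S235 and the threshold formula of Table 2 can be sketched as follows. The in-memory `CASES` list is a hypothetical stand-in for the identification case priority DB, the field keys are assumptions, and only a few rows are included (rows 1, 2, and 6 from Table 2, and row 10 from the 'date of birth, mobile phone number' example in the text).

```python
from typing import Optional, Tuple

# Hypothetical stand-in for the identification case priority DB: (priority, fields).
CASES = [
    (1, ("name", "birth_date")),
    (2, ("name", "birth_date", "mobile_phone")),
    (6, ("mobile_phone", "main_department")),
    (10, ("birth_date", "mobile_phone")),
]


def select_case(record: dict) -> Optional[Tuple[int, tuple]]:
    """Pick the highest-priority case whose fields are all present in the input."""
    present = {field for field, value in record.items() if value}  # S231, S232
    for priority, fields in sorted(CASES):                         # S233
        if set(fields) <= present:                                 # S234
            return priority, fields                                # stored in S235
    return None


def table2_threshold(lengths: dict, tolerances: dict) -> float:
    """Table 2 pattern: sum of (item length - allowed errors) / total length."""
    allowed = sum(lengths[field] - tolerances[field] for field in lengths)
    return allowed / sum(lengths.values())
```

With an input that carries only address, date of birth, and mobile phone number, `select_case` skips every case that needs a name and returns case 10, as in the example above.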

In the exact matching step (S240), the module unit 110 performs exact matching between the input data and the items of the identification case selected in the pattern analysis step (S230), and thereby selects the patient identification information.

First, as shown in FIG. 5, the module unit 110 connects to the master table of the database unit 120 and starts exact matching (S241).

The module unit 110 then performs: connecting to the master table and loading the identification case selected in the pattern analysis step (S230) together with its threshold (S242); loading the input data preprocessed in the preprocessing step (S220) (S243); loading the matching algorithm (S244); comparing the occurrence date of the input data with the application date of each target record and selecting the data dated before the occurrence date as targets (S245); calculating matching rates with the matching algorithm in the order of the identification case items (S246); and comparing the matching rates calculated in that order with the preset matching rate threshold to determine unique identification information (S247).

If step S247 finds exactly one piece of information whose matching rate is equal to or higher than the threshold, the MPI information matching the input is recognized as that single retrieved piece of information (S248-1), and the MPI information matching the input is generated as unique identification information (S249-1).

If, on the other hand, step S247 finds no record exceeding the threshold, or two or more records exceeding it, probability matching information is generated (S249-2). The probability matching information generated in step S249-2 is used to calculate the final result in the probabilistic matching step (S260).
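The target selection of step S245 and the branch taken after step S247 can be sketched as below. The record layout (an `applied_date` key holding a 'YYYYMMDD' string), the function names, and the choice to include records dated on the occurrence date itself are assumptions.

```python
def select_targets(records: list, occurrence_date: str) -> list:
    """S245: keep records whose application date is not after the occurrence date."""
    # 'YYYYMMDD' strings order correctly under plain string comparison.
    return [r for r in records if r["applied_date"] <= occurrence_date]


def decide(rates: dict, threshold: float):
    """S247: exactly one hit is unique; zero or several hits go to S249-2."""
    hits = [record_id for record_id, rate in rates.items() if rate >= threshold]
    if len(hits) == 1:
        return "unique", hits          # S248-1, S249-1
    return "probabilistic", hits       # S249-2, resolved later in S260
```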

The calculation of the matching rate by the matching algorithm in the exact matching step (S240) is explained below for Hangul, English letters, and digits.

For example, suppose the target record and the input information are as follows.

Target record: 홍길동, gdhong@naver.com, 01047515661

Input information: 홍길등, gghong@naver.com, 01047515231

If the priority that fits the input items is applied from the identification case priority DB and, for example, name, e-mail, and mobile phone number are selected as the matching items, applying the matching algorithm gives the following similarity for each item. Here d(x,y) is the distance between the two strings, counted in the examples below as the number of characters that must be changed.

s(x,y) = 1 - d(x,y) / max(length(x), length(y))

String str1 = "ㅎㅗㅇㄱㅣㄹㄷㅗㅇ";

String str2 = "ㅎㅗㅇㄱㅣㄹㄷㅡㅇ";

The 8th character "ㅗ" of str1 must be changed to the 8th character "ㅡ" of str2, so the distance is 1, and the maximum string length is 9, so

1 - (1/9) = similarity score of 0.8888889

String str1 = "gdhong@naver.com";

String str2 = "gghong@naver.com";

The 2nd character "d" of str1 must be changed to the 2nd character "g" of str2, so the distance is 1, and the string length is 16, so

1 - (1/16) = similarity score of 0.9375

String str1 = "01047515661";

String str2 = "01047515231";

The 9th character "6" of str1 must be changed to the 9th character "2" of str2, and the 10th character "6" of str1 to the 10th character "3" of str2, so the distance is 2, and the maximum string length is 11, so the result is

1 - (2/11) = similarity score of 0.8181818
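The similarity formula can be checked with a short sketch. It assumes that d(x,y) is the Levenshtein edit distance, which agrees with the substitution counts in the worked examples; the text does not name the distance function, and `levenshtein` and `similarity` are illustrative names.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(x: str, y: str) -> float:
    """s(x,y) = 1 - d(x,y) / max(length(x), length(y))."""
    longest = max(len(x), len(y))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1 - levenshtein(x, y) / longest
```

For the Hangul item, the inputs are the jamo strings produced by preprocessing, not the three-syllable names.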

The probabilistic matching step (S260) applies when the exact matching step (S240) fails to select the single identification record that would be generated as unique identification information. It adds, one at a time, the matching items of the MPI identification case with the next priority and re-matches, raising the matching probability as it narrows the selection.

Specifically, the probabilistic matching step (S260) performs: loading the identification case priority previously selected in the pattern analysis step (S230) (S261); accessing the identification case priority DB and retrieving the identification case with the next priority, which is used when exact matching fails to identify the patient (S262); comparing the occurrence date of the input data with the application date and taking the records whose application date precedes the occurrence date as targets (S263); calculating matching rates with the matching algorithm in the order of the identification case items (S264); and comparing the matching rates calculated in that order with the set matching rate threshold to determine unique identification information (S265).

If step S265 finds exactly one record at or above the threshold, that record is generated and stored as unique identification information (S267).
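The loop over next-priority cases can be sketched as follows. The matching rate used here, `exact_rate`, is a deliberately simple placeholder (the share of fields that agree exactly) and is not the patent's matching algorithm; `next_cases` is a hypothetical list of (fields, threshold) pairs in priority order.

```python
def exact_rate(record: dict, candidate: dict, fields: tuple) -> float:
    """Placeholder matching rate: share of the fields that agree exactly."""
    same = sum(1 for f in fields if record.get(f) == candidate.get(f))
    return same / len(fields)


def probabilistic_match(record, candidates, next_cases, rate_fn=exact_rate):
    """Try each next-priority case until exactly one candidate remains."""
    for fields, threshold in next_cases:                    # S262
        hits = [c for c in candidates
                if rate_fn(record, c, fields) >= threshold]  # S264, S265
        if len(hits) == 1:
            return hits[0]                                   # S267
    return None                                              # left to step S270
```

Two patients who share a name and date of birth stay ambiguous under the first case and are separated once the mobile phone number is added by the second.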

The step of generating and outputting a unique identification code (S270) outputs the results processed in the exact matching step (S240) and the probabilistic matching step (S260) and, if the master table holds no corresponding information, generates a new unique identification code.

Specifically, as shown in FIG. 7, the step of generating and outputting a unique identification code (S270) first loads the matching results processed in the exact matching step (S240) and the probabilistic matching step (S260) (S271), and then classifies the loaded matching results into three cases: a single unique match, two or more matches, and no match at all (S272).

If the classification step (S272) finds that the loaded matching result is a single unique match in the master table, the retrieved patient data is displayed together with its MRN and output (S274-1).

If the classification step (S272) finds two or more matching results, all of them are output to the screen (S274-2), and the user's selection among the output results is received (S275-2).

Finally, if the classification step (S272) finds no matching result at all, the method determines from the user's input whether to generate a new MRN (S274-3), generates a new MRN according to the user's input (S275-3), and adds the generated MRN and its related information to the master table and outputs them (S276-3).
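The three-way branch of step S272 can be sketched with callbacks standing in for the screen output and the user input. The function name `resolve`, the `mrn` key, and the callbacks `choose`, `confirm_new`, and `next_mrn` are all hypothetical; the text does not specify how an MRN is formatted or generated.

```python
def resolve(matches, record, master_table, choose, confirm_new, next_mrn):
    """Return the MRN for the input record, creating one when nothing matches."""
    if len(matches) == 1:                          # S274-1: single unique match
        return matches[0]["mrn"]
    if len(matches) >= 2:                          # S274-2: show all candidates
        return choose(matches)["mrn"]              # S275-2: user picks one
    if not confirm_new():                          # S274-3: ask whether to create
        return None
    mrn = next_mrn()                               # S275-3: generate a new MRN
    master_table.append({**record, "mrn": mrn})    # S276-3: store in the master table
    return mrn
```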

The learning step (S280) uses an adaptive algorithm that cumulatively analyzes the incoming patterns. It learns from the identification case information finally selected to identify the patient and from the master table in which newly generated patient information is accumulated, and updates the weights of the priority determination rules for the identification cases.

Specifically, as shown in FIG. 8, the learning step (S280) performs: loading the matching result finally selected to identify the patient (S281); connecting to the master table (S282); counting the successful identification cases through the connected master table and recalculating the identification case priorities (S283); and updating the identification case priority DB with the recalculated priorities (S284).

The learning step (S280) assigns an initial weight to each priority, increases the weight of the pattern used when a result is output, and applies a new priority order that includes the increased weights at the next pattern analysis.

Accordingly, the learning step (S280) updates the identification case priorities used in the pattern analysis step (S230), so that the pattern analysis step (S230) runs faster and more accurately and an accurate matching result can be produced.
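The weight update described for the learning step can be sketched as a counter per identification case. The fixed increment `step`, the dictionary layout, and the rule that a larger weight means a higher priority are assumptions; the text states only that the weight of the used pattern increases and that the new weights determine the next priority order.

```python
def update_priorities(weights: dict, used_case: str, step: float = 1.0) -> list:
    """Increase the weight of the case that produced the result and re-rank."""
    weights[used_case] = weights.get(used_case, 0.0) + step
    # Highest weight first; ties keep their earlier relative order.
    return sorted(weights, key=lambda case: weights[case], reverse=True)
```

Starting from initial weights that reflect the initial priority order, a case that repeatedly leads to a successful identification moves toward the front and is tried earlier in the next pattern analysis.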

Experimental Comparative Example

A test was performed to compare the identification accuracy, sensitivity, specificity, and processing speed of the patient personal information identification method according to the embodiment of the present invention with those of other known matching algorithms.

For this comparison, the Levenshtein, JaroDistance, and NeedlemanWunsch algorithms listed in [Table 3] were selected as comparative examples and, together with the method according to the embodiment, applied to the same 45 sample cases.

[Table 3]

| No. | Comparative example | Method | Characteristics |
| --- | --- | --- | --- |
| 1 | Levenshtein (Edit Distance) | Measures how many changes are needed to turn one string into another | Relatively simple and generally applicable |
| 2 | JaroDistance | Measures accuracy from the number of matching characters and transposed characters | Suited to comparing short strings |
| 3 | NeedlemanWunsch | Measures similarity using gap penalties | Suited to comparing similarity at the sentence level |

A confusion matrix was used to evaluate the test results quantitatively. To compare the relative performance of the method according to the embodiment and the comparative examples in terms of accuracy, sensitivity, and specificity, the elements of the confusion matrix, TP (True Positive), FP (False Positive), FN (False Negative), and TN (True Negative), are defined as follows.

TP: the records belong to the same person and are matched as the same

FP: the records belong to the same person but are matched as not the same

TN: the records do not belong to the same person and are matched as not the same

FN: the records do not belong to the same person but are matched as the same

Accuracy, sensitivity, and specificity are defined from these confusion matrix elements as given in [Equation 2] below.
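[Equation 2] is not reproduced in this text. The sketch below computes the three measures from the prose descriptions in this section: sensitivity as the share of same-person pairs recognized as the same, and specificity as the share of different-person pairs recognized as different. Because the definitions above attach FP to a missed same-person pair and FN to a wrongly merged pair, the hypothetical parameters are named by outcome rather than by those labels.

```python
def metrics(same_matched: int, same_missed: int,
            different_rejected: int, different_merged: int) -> dict:
    """Accuracy, sensitivity, and specificity from the four outcome counts."""
    total = same_matched + same_missed + different_rejected + different_merged
    return {
        # share of all decisions that were correct
        "accuracy": (same_matched + different_rejected) / total,
        # share of same-person pairs recognized as the same
        "sensitivity": same_matched / (same_matched + same_missed),
        # share of different-person pairs recognized as different
        "specificity": different_rejected / (different_rejected + different_merged),
    }
```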

Accuracy test results

As the graph in FIG. 9 shows, accuracy was 64.44% for Levenshtein, 67.78% for Jaro, 73.33% for Needleman, and 94.44% for the patient personal information identification method according to the embodiment of the present invention, so the method according to the embodiment is more accurate than the comparative algorithms.

This result arises because the method according to the embodiment separates Hangul syllables into jamo before matching, whereas the comparative algorithms match without jamo separation.

Sensitivity test results

Sensitivity is the measure of whether the same person is judged to be the same person; it measures how accurately records of the same person are identified as such.

As shown in FIG. 10, sensitivity was 62.26% for Levenshtein, 64.81% for Jaro, 68.42% for Needleman, and 95.45% for the patient personal information identification method according to the embodiment of the present invention, so the method according to the embodiment is more sensitive than the comparative methods.

This sensitivity result shows that, whereas the comparative methods treat a syllable as a mismatch even when only one of its jamo is wrong, the method according to the embodiment, which matches after jamo separation, achieves high sensitivity.

The difference was especially large when the input could not be brought into normalized form by preprocessing. In particular, unlike the comparative methods, the method according to the embodiment handles cases such as Hangul typos by normalizing first and then applying probabilistic matching, so that the possibility of a typo is reflected in the matching rate; this confirms that the method is effective.

Specificity test results

Specificity is the measure of whether different persons are judged not to be the same person; it measures how accurately records that do not belong to the same person are identified as such.

As shown in FIG. 11, specificity was 67.57% for Levenshtein, 72.22% for Jaro, 81.82% for Needleman, and 93.48% for the patient personal information identification method according to the embodiment of the present invention, so the method according to the embodiment has higher specificity than the comparative methods.

This specificity result indicates that, unlike the comparative methods, the method according to the embodiment can use its learning process to recommend the normalization priority in preprocessing and the priority of the items to add during matching, and can thereby recommend the cases whose probability is high or low, which yields high specificity.

This specificity can also be closely related to processing time, and it distinguishes the method from the comparative methods.

Processing speed test results

The processing speed test measured the time taken to reach the final result while matching against 10,000 records.

As shown in FIG. 12, the patient personal information identification method according to the embodiment of the present invention took less time than the comparative methods, that is, its processing speed is higher.

Although the technical idea of the present invention has been described in detail with reference to the preferred embodiments above, it should be noted that these embodiments are given for explanation and not for limitation.

Those of ordinary skill in the technical field of the present invention will also understand that various implementations are possible within the scope of the technical idea of the present invention.

100: apparatus for performing the patient personal information identification method
110: module unit 120: database unit

Claims

receiving, by a module unit, patient identification items through an input function unit serving as an input API capable of handling various types of input (S210);
a preprocessing step (S220) in which the module unit normalizes the received patient identification items into a preset regular-expression format according to preprocessing rules loaded from a normalization rule DB of a database unit, separates Hangul into initial, medial, and final jamo using the Unicode code points that represent Hangul, and calculates their indices by Equation 1 below;
[Equation 1]

a pattern analysis step (S230) in which the module unit checks and analyzes the input type of the patient identification items and selects, from a master table of the database unit, the MPI identification case group that fits the identification case with the highest matching rate;
performing exact matching on the MPI identification case group (S240);
determining whether patient identification information for identifying the individual patient is unique (S250);
a probabilistic matching step (S260) in which, if step S250 determines that there is no unique patient identification information, the module unit adds data not covered by the MPI identification case group of step S240 and re-matches to calculate a matching rate;
outputting the result of step S260 as matching result information if its matching rate is greater than or equal to a probability matching rate threshold, and generating and outputting a unique identification code if no matching rate is greater than or equal to that threshold (S270); and
performing a learning process (S280) that learns from the identification case information finally selected to identify the patient and from the master table in which newly generated patient information is accumulated, and updates the weights of the priority determination rules for the MPI identification cases;
A method for identifying patient personal information, comprising the steps above.

The method according to claim 1,
wherein the patient identification items received in the step of receiving the patient identification items (S210) include a name, a date of birth, a sex, an address, a mobile phone number, a landline phone number, an e-mail address, and a type of insurance.

The method according to claim 1,
wherein the preprocessing step (S220) comprises:
loading the data entered in the step of receiving the patient identification items (S210) (S221);
loading, from a preprocessing rule DB, the preprocessing rule that fits the loaded input data (S222);
checking the data type and length of each item of the loaded input data (S223); and
storing the normalized data as preprocessing result information (S225).

(Deleted)

The method according to claim 1,
wherein the pattern analysis step (S230) comprises:
Confirming (S231) whether data exists in the input items normalized in the normalizing step (S224);
Checking the input type of the data existing in the normalized input items (S232);
Accessing an identification case priority DB of the database unit and loading the identification case priority (S233);
Selecting the identification case having the highest priority that corresponds to the confirmed input type (S234); and
Storing the sequence number and the matching rate threshold value of the selected identification case (S235).
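
A minimal sketch of the pattern analysis steps S231 to S235 follows, under stated assumptions: the `IdentificationCase` structure, the sample cases in `CASE_PRIORITY_DB`, and the convention that a lower `priority` number means a higher priority are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdentificationCase:
    seq: int            # identification case sequence number
    items: tuple        # items that must be present to use this case
    priority: int       # assumption: lower number = higher priority
    threshold: float    # matching rate threshold for this case


# Hypothetical contents of the identification case priority DB.
CASE_PRIORITY_DB = [
    IdentificationCase(1, ("name", "birth_date", "mobile"), 1, 0.95),
    IdentificationCase(2, ("name", "birth_date", "address"), 2, 0.90),
    IdentificationCase(3, ("name", "birth_date"), 3, 0.85),
]


def select_case(normalized, cases=CASE_PRIORITY_DB):
    """Pick the highest-priority case whose items are all present."""
    # S231/S232: which normalized items actually contain data
    present = {key for key, value in normalized.items() if value}
    # S233/S234: usable cases, then the one with the best priority
    usable = [case for case in cases if set(case.items) <= present]
    if not usable:
        return None
    best = min(usable, key=lambda case: case.priority)
    # S235: keep the sequence number and threshold for later matching
    return best.seq, best.threshold
```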

The method according to claim 1,
wherein the step (S240) of performing the accurate matching comprises:
Connecting the module unit to the master table of the database unit to start accurate matching (S241);
Loading (S242), with the module unit connected to the master table, the threshold value corresponding to the identification case selected in the pattern analysis step (S230);
Loading (S243) the input data pre-processed in the pre-processing step (S220);
Loading a matching algorithm (S244);
Comparing the occurrence date of the input data of the patient identification item with the application date of each target record, and selecting as targets the records whose application date precedes the occurrence date (S245);
Calculating a matching rate in the order of the identification case items using the matching algorithm (S246); and
Comparing the matching rate calculated in the order of the identification case items with the preset matching rate threshold value to determine unique identification information (S247).
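
The date filter and threshold comparison of steps S245 to S247 can be sketched roughly as follows. The claim does not define the matching algorithm, so the fraction of exactly equal items is used here purely as a stand-in; the record fields `mrn` and `applied_on` are hypothetical names, and treating a same-day application date as eligible is also an assumption.

```python
from datetime import date


def matching_rate(record, query, items):
    """Stand-in matching algorithm: share of items that are exactly equal."""
    hits = sum(1 for item in items if record.get(item) == query.get(item))
    return hits / len(items)


def accurate_match(master_table, query, items, threshold, occurred_on):
    """Return the MRNs of records that reach the matching rate threshold."""
    matches = []
    for record in master_table:
        # S245: skip records applied after the input data occurred
        if record["applied_on"] > occurred_on:
            continue
        # S246/S247: compute the rate and compare it with the threshold
        if matching_rate(record, query, items) >= threshold:
            matches.append(record["mrn"])
    return matches
```

The function returns a list rather than a single value so that the caller can distinguish a unique match, several matches, and no match.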

The method according to claim 1,
wherein the step (S260) of performing the probability matching comprises:
Loading (S261) the identification case priority loaded in the pattern analysis step (S230);
Accessing the identification case priority DB of the database unit and obtaining the identification case of the next priority, to be used when the patient remains unidentified after the accurate matching (S262);
Comparing the occurrence date of the input data input in the step (S210) of receiving the patient identification item with the application date, and selecting as targets the records whose application date precedes the occurrence date (S263);
Calculating a matching rate in the order of the identification case items using the matching algorithm (S264); and
Comparing the matching rate calculated in the order of the identification case items with the set matching rate threshold value to determine the unique identification information (S265).
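
One possible reading of the fallback in steps S262 to S265 is sketched below. It is an assumption, not the claimed algorithm: the similarity measure is the standard-library `difflib.SequenceMatcher` ratio averaged over the case items, the `fallback_cases` list of `(items, threshold)` pairs is a hypothetical representation of the next-priority identification cases, and the date filter of S263 is omitted for brevity.

```python
from difflib import SequenceMatcher


def similarity_rate(record, query, items):
    """Average string similarity over the items of one identification case."""
    scores = [
        SequenceMatcher(None, record.get(item, ""), query.get(item, "")).ratio()
        for item in items
    ]
    return sum(scores) / len(scores)


def probability_match(master_table, query, fallback_cases):
    """Try each next-priority case in turn until one yields matches."""
    for items, threshold in fallback_cases:       # S262: next-priority case
        matches = [
            record["mrn"]
            for record in master_table
            # S264/S265: rate per case, compared with its threshold
            if similarity_rate(record, query, items) >= threshold
        ]
        if matches:
            return matches
    return []
```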

The method according to claim 1,
wherein the step (S270) of generating and outputting the unique identification code comprises:
Loading (S271) the matching result processed through the step (S240) of performing the accurate matching and the step (S260) of performing the probability matching; and
Discriminating (S272) among three cases: the loaded matching result is a unique matching value, the loaded matching result contains two or more matching values, or there is no matching value.

The method according to claim 8,
wherein, if the loaded matching result is a unique matching value in the master table, the retrieved patient data is displayed and output as an MRN number.

The method according to claim 8, further comprising:
Outputting (S274-2) the two or more matching results to the screen when the step (S272) of discriminating the three cases finds two or more loaded matching results; and
Receiving a user's selection among the two or more matching results (S275-2).

The method according to claim 8, further comprising:
Determining (S274-3), when step S272 finds no matching result, whether to generate a new MRN number according to the user's input;
Generating a new MRN number according to the user's input (S275-3); and
Storing information related to the generated MRN number in the master table and outputting it (S276-3).
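
The three-way branch of step S272 and its follow-up steps can be sketched as below. The callbacks `choose` and `confirm_new` stand in for the user interaction, and the MRN format derived from the table length is invented for the example; none of these names come from the specification.

```python
def resolve_matches(matches, master_table, query, choose, confirm_new):
    """Return the MRN for the patient, or None if no MRN is assigned."""
    if len(matches) == 1:
        # Unique matching value: output the existing MRN
        return matches[0]
    if len(matches) >= 2:
        # S274-2/S275-2: show the candidates and take the user's selection
        return choose(matches)
    # No matching value. S274-3: ask whether to create a new MRN
    if not confirm_new():
        return None
    # S275-3: hypothetical MRN format based on the table size
    new_mrn = f"M{len(master_table) + 1:03d}"
    # S276-3: store the new record in the master table and output the MRN
    master_table.append({"mrn": new_mrn, **query})
    return new_mrn
```

Passing the user interaction in as callbacks keeps the branch logic testable without a screen; an actual system would wire these to its own interface.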

The method according to claim 1,
wherein the step (S280) of performing the learning process comprises:
Loading the matching result selected to identify the patient (S281);
Connecting to the master table, counting the successful identifications per identification case, and recalculating the identification case priority (S283); and
Updating the identification case priority DB of the database unit according to the recalculated identification case priority (S284).
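
As a rough illustration of steps S283 and S284, the priority could be recalculated from success counts as follows. Ranking cases purely by how often they led to a successful identification is an assumption made for this sketch; the claims only state that the weight of the priority determination rule is updated through learning, and `success_log` is a hypothetical list of the case sequence numbers that produced each confirmed identification.

```python
from collections import Counter


def recalculate_priorities(success_log, case_seqs):
    """Map each case sequence number to a new priority (1 = highest)."""
    # S283: count successful identifications per identification case
    counts = Counter(success_log)
    # More successes rank first; ties keep the lower sequence number first
    ranked = sorted(case_seqs, key=lambda seq: (-counts[seq], seq))
    # S284: the returned mapping would be written to the priority DB
    return {seq: rank for rank, seq in enumerate(ranked, start=1)}
```

For example, with successes logged for cases `[3, 3, 1, 3, 2, 1]`, case 3 moves to the top priority, followed by case 1 and then case 2.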