KR101544560B1

KR101544560B1 - An online analytical processing system for big data by caching the results and generating 2-level queries by SQL parsing

Info

Publication number: KR101544560B1
Application number: KR1020140039470A
Authority: KR
Inventors: 배영근; 박민규; 이영균
Original assignee: (주)비아이매트릭스
Priority date: 2014-04-02
Filing date: 2014-04-02
Publication date: 2015-08-17
Anticipated expiration: 2034-04-02
Also published as: JP5926321B2; JP2015197909A

Abstract

The present invention relates to an online analytical processing method that uses two-level queries generated by SQL parsing together with result caching, performed by an analysis processing server that handles request queries issued by a client against a database. The method comprises: (a) parsing the request query to extract the column names it contains; (b) generating a query that takes the extracted column names as its reference items and references the same tables as the request query (hereinafter, the base query), and an extended query that references the result data of the base query to produce the result data the request query asks for; (c) looking up the result data of the base query in the server cache of the server; (d) if the server cache does not hold the result data of the base query, requesting the data from the database with the base query and storing the received result data in the server cache; and (e) applying the extended query to the result data of the base query to obtain the result data of the extended query, and transmitting that result data to the client.
By caching the data of the base query derived from each requested query, the method greatly improves query execution speed in a business intelligence analysis environment and can deliver analysis results to the user in real time.

Description

[0001] The present invention relates to an online analytical processing system for processing large amounts of data, and more particularly, to an online analytical processing system using two-level queries and result caching using SQL parsing.

The present invention relates to an online analytical processing method using two-level queries by SQL parsing and result caching. When a client issues a query request, the request is split by SQL parsing into a base query (Base Query) and an extended query (Extend Query); the data for the base query is fetched from a database that stores large-volume data or big data and is stored in an in-memory server cache; and the extended query is executed against the data in the server cache to extract the required data.

In general, business intelligence (BI) refers to a set of tools that analyze a company's vast data in various ways, using structured or unstructured methods such as statistical analysis, or that process the analyzed information into reports that are easy to understand and read, so that business can be conducted more rationally.

Companies accumulate enormous amounts of data in the course of doing business. This data conveys what is actually happening in the field and, if analyzed properly, can yield the information a business needs. However, deriving meaningful analysis results from the large volume of data accumulated in the field is not an easy task.

Many tools have been developed separately for such analysis. Typical examples include data extraction and transformation (ETT) tools, online analytical processing (OLAP) tools for multidimensional data analysis, reporting tools for writing reports, and data mining tools that find hidden associations in data. Business intelligence (BI) is, in effect, this set of tools combined into a single software suite.

However, although conventional business intelligence (BI) brought various analysis tools together, users needed specialized knowledge to operate them, so the tools were rarely used beyond specific analyses. To improve on this, reporting techniques that query and analyze a database in a web environment have been proposed [Patent Document 1]. A system for writing analysis reports online through an Excel-based interface has also been proposed [Patent Document 2].

Recently, as the analysis of data from SNS, social media, and similar sources has grown in importance, more and more companies need to collect and analyze big data for purposes such as customer management and product promotion. The term big data is mainly applied to data sets whose volume is difficult to handle, within a tolerable elapsed time, with ordinary software tools and computer systems for collecting, managing, storing, searching, sharing, analyzing, and visualizing data. The size of big data may range over terabytes, exabytes, or zettabytes. Big data arises in many fields; examples include web logs, RFID, sensor networks, social networks, social data, Internet text and documents, Internet search indexing, point-of-sale (POS) data, sales records, medical records, photo records, video records, and e-commerce.

Online analytical processing (OLAP) systems have been introduced and used to analyze such big data, and one of the biggest problems they encounter is delay in data processing. Because processing a very large amount of data takes longer, the user working online experiences a very long wait.

As shown in FIG. 1, a conventional online analytical processing system consists of a client installed on a user terminal, a BI server that handles the client's data requests, and a database that stores big data.

The user creates a report form (or template) through the client in a web browser and requests from the BI server the data to fill that report form (step ①). That is, the client sends the BI server the necessary information extracted from the report, such as the database code (DB code) and the query (SQL query). Next, the BI server connects to the database and requests the necessary data (step ②). The database searches for and extracts the requested data set (or query result, or cube data set) and transmits the extracted result data to the BI server (step ③). The BI server compresses the field information and data received from the database and transmits them to the client (step ④).

In such a conventional online analytical processing system, once the source data exceeds ten million records it frequently takes 10 minutes or more to receive a query result. At one particular site, for example, merely retrieving a result of 400 million records takes more than 5 minutes. Formatting the data from the database also takes between 15 and 30 seconds.

Data processing is this slow because the speed at which the database handles requests drops sharply. The database is usually a commercial data server that provides standard database (DB) functions. When a source table is huge, for example when it holds more than 100 million records, the query processing speed of such a commercial database drops sharply because of the amount of data that must be processed.

Query processing also becomes very slow when views are used. In general, a view is a logical representation of a subset of the data in one or more tables; it does not hold the actual data but is stored as a single SQL statement that produces the result. A view can be used to restrict access or to simplify complex queries, but it executes its SQL internally every time it is requested. Therefore, when a source view is huge or complex, the views built on it are also slow. Likewise, when the query itself is complex, for example because it uses joins, processing becomes very slow.

To address these problems, commercial databases include their own solutions that tune queries so that they run faster. However, such tuning targets general situations, and tuning within the database system alone has limits, so it cannot dramatically improve query speed.

For example, because a commercial database handles only general, standardized cases, it repeats the same work for identical or similar query requests.

Because of these problems, a conventional online analytical processing (OLAP) system imposes very long waiting times online and causes users considerable inconvenience.

[Patent Document 1] Korean Patent No. 10-0497811 (published June 18, 2005)
[Patent Document 2] Korean Patent No. 10-0969656 (published July 14, 2010)

An object of the present invention, devised to solve the problems described above, is to provide an online analytical processing method using two-level queries by SQL parsing and result caching, in which a client's query request is split by SQL parsing into a base query and an extended query, the data for the base query is fetched from a database storing large-volume data or big data and stored in an in-memory server cache, and the extended query is executed against the data in the server cache to extract the required data.

To achieve the above object, the present invention provides an online analytical processing method using two-level queries by SQL parsing and result caching, performed by an analysis processing server that handles request queries issued by a client against a database, the method comprising: (a) parsing the request query to extract the column names it contains; (b) generating a query that takes the extracted column names as its reference items and references the same tables as the request query (hereinafter, the base query), and an extended query that references the result data of the base query to produce the result data the request query asks for; (c) looking up the result data of the base query in the server cache of the server; (d) if the server cache does not hold the result data of the base query, requesting the data from the database with the base query and storing the received result data of the base query in the server cache; and (e) applying the extended query to the result data of the base query to obtain the result data of the extended query, and transmitting the obtained result data to the client.

The present invention also provides an online analytical processing method using two-level queries by SQL parsing and result caching, performed by an analysis processing server that handles request queries issued by a client against a database, the method comprising: (a) parsing the request query to extract the column names it contains; (b) generating a query that takes the extracted column names as its reference items and references the same tables as the request query (hereinafter, the base query), and an extended query that references the result data of the base query to produce the result data the request query asks for; (c) looking up the result data of the base query in the server cache of the server; (d) if the server cache does not hold the result data of the base query, requesting the data from the database with the request query itself and transmitting the received result data of the request query to the client; and (e) requesting the data from the database with the base query and storing the received result data of the base query in the server cache.
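The ordering in this second variant (answer the client first, fill the cache afterwards) can be sketched as follows. This is only an illustration under stated assumptions: `answer_then_warm`, `query_db`, `apply_extended`, and `send` are hypothetical names, the queries are opaque strings, and the cache-hit branch is assumed to behave as in the first variant, which this passage does not spell out.

```python
def answer_then_warm(request_query, base_query, extended_query,
                     cache, query_db, apply_extended, send):
    """Sketch of steps (c) to (e) of the second variant (all names hypothetical)."""
    if base_query in cache:
        # Assumed hit path: apply the extended query to the cached base result.
        send(apply_extended(extended_query, cache[base_query]))
        return
    # (d) Miss: run the original request query on the database and reply at once.
    send(query_db(request_query))
    # (e) Only then run the base query and store its result for later requests.
    cache[base_query] = query_db(base_query)


# Tiny fakes that record the order of operations.
events, sent, cache = [], [], {}

def fake_db(query):
    events.append(("db", query))
    return [("rows-for", query)]

def fake_apply(extended, rows):
    events.append(("cache", extended))
    return [("from-cache", extended)]

answer_then_warm("REQ", "BASE", "EXT", cache, fake_db, fake_apply, sent.append)
answer_then_warm("REQ", "BASE", "EXT", cache, fake_db, fake_apply, sent.append)
```

On the first call the client is answered from the database before the cache is filled; the second call is served from the cached base result without touching the database.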

In the online analytical processing method using two-level queries by SQL parsing and result caching according to the present invention, the server also stores the result data of the extended query in the server cache as a cache file, and the method further comprises: (f) after step (b), if a cache file for the extended query is found in the server cache, transmitting the found cache file to the client.

In the same method, in step (a), a unique key that identifies each column name is generated, and in step (b), an alias is defined with the unique key for the column name in the reference item clause of the base query, and the extended query references the column by that alias.

In the same method, the unique key is obtained by hashing the database name, the reference table name, and the column name of the column concerned.

In the same method, in step (b), the base query consists of a reference item clause, a table reference clause, and a condition clause, and the table reference clause and condition clause of the base query have the same structure as the table reference clause and condition clause of the request query.

In the same method, in step (b), the extended query is generated so that its table reference clause references the base query or the result data of the base query, and its clauses other than the table reference clause have the same structure as the corresponding clauses of the request query.
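The relationship between the request query, the base query, and the extended query described in the preceding paragraphs can be sketched as below. This is a minimal illustration, not the patented implementation: the request is assumed to be already parsed into a dictionary of clause strings, the keys `k1` and `k2` are placeholder aliases, and `base_result` is a hypothetical name for the cached base-query result. Read literally, the extended query keeps every clause except the table reference, so the condition clause is repeated here.

```python
import re


def rewrite_columns(expression, column_keys):
    """Replace each column name in an expression with its alias key in one pass."""
    if not column_keys:
        return expression
    names = sorted(column_keys, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![\w.])(" + "|".join(re.escape(n) for n in names) + r")(?![\w.])")
    return pattern.sub(lambda m: column_keys[m.group(1)], expression)


def build_queries(clauses, column_keys, base_name="base_result"):
    """Return (base_query, extended_query) for pre-parsed request clauses."""
    # Base query: aliased columns, same table reference and condition clause.
    items = ", ".join(f"{name} AS {key}" for name, key in column_keys.items())
    base = f"SELECT {items} FROM {clauses['from']}"
    if clauses.get("where"):
        base += f" WHERE {clauses['where']}"
    # Extended query: same clauses, but reading from the base result via aliases.
    extended = f"SELECT {rewrite_columns(clauses['select'], column_keys)} FROM {base_name}"
    for keyword, part in (("WHERE", "where"), ("GROUP BY", "group_by"),
                          ("ORDER BY", "order_by")):
        if clauses.get(part):
            extended += f" {keyword} {rewrite_columns(clauses[part], column_keys)}"
    return base, extended


request = {"select": "region, SUM(amount)", "from": "sales",
           "where": "amount > 0", "group_by": "region"}
base_query, extended_query = build_queries(request, {"region": "k1", "amount": "k2"})
```

For this request the sketch yields `SELECT region AS k1, amount AS k2 FROM sales WHERE amount > 0` as the base query and `SELECT k1, SUM(k2) FROM base_result WHERE k2 > 0 GROUP BY k1` as the extended query.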

In the same method, in step (b), if an alias is defined for a table in the request query, the extended query is generated by deleting the table alias and replacing each use of the table alias with the name of the table.

In the same method, the server cache consists of in-memory storage and a cache disk, and the result data of the requested base query is stored in the in-memory storage.

As described above, the online analytical processing method using two-level queries by SQL parsing and result caching according to the present invention caches the data of the base query derived from each requested query, and thereby greatly improves query execution speed in a business intelligence analysis environment, so that analysis results can be delivered to the user in real time.

FIG. 1 is a block diagram of an online analytical processing system according to the prior art.
FIG. 2 is a block diagram of the overall system for carrying out the online analytical processing method using two-level queries by SQL parsing and result caching according to the present invention.
FIG. 3 is a flowchart illustrating the online analytical processing method using two-level queries by SQL parsing and result caching according to a first embodiment of the present invention.
FIG. 4 is an example of a request query according to the first embodiment of the present invention.
FIG. 5 is an example of a base query and an extended query according to the first embodiment of the present invention.
FIG. 6 is a flowchart illustrating the online analytical processing method using two-level queries by SQL parsing and result caching according to a second embodiment of the present invention.
FIG. 7 is a flowchart illustrating the online analytical processing method using two-level queries by SQL parsing and result caching according to a third embodiment of the present invention.
FIG. 8 is a block diagram of a server cache according to a fourth embodiment of the present invention.
FIG. 9 is a flowchart for explaining a first situation according to the present invention.
FIG. 10 is a flowchart for explaining a second situation according to the present invention.
FIG. 11 is a flowchart for explaining a third situation according to the present invention.
FIG. 12 is a table comparing the processing results for each situation according to the present invention.

Hereinafter, specific details for carrying out the present invention are described with reference to the drawings.

In describing the present invention, identical parts are denoted by the same reference numerals, and repeated descriptions of them are omitted.

First, the overall system for carrying out the online analytical processing system and method based on caching base-query results according to the present invention is described with reference to FIG. 2.

As shown in FIG. 2, the overall system for carrying out the present invention consists of a client 20, an analysis processing server 30, a BI server 50, and a database 60. In particular, the analysis processing server 30 includes a server cache 40 for storing some of the data received from the database 60.

The client 20 is a client program system installed on the user terminal 10 and presents its user interface through a web browser. That is, the user performs data analysis tasks online through a web browser or a browser-like screen interface. The client 20 receives the user's commands, executes them, and displays the results on the screen or in the web browser. The user terminal 10 is a computer terminal with computing capability, such as a personal computer (PC), a PDA, or a smartphone.

The client 20 also requests online analysis tasks, such as data requests and data analysis, from the analysis processing server 30, retrieves the results from the server 30, and displays them in the web browser.

Next, the analysis processing server 30 is a server that performs online analytical processing (OLAP): it receives a data analysis request from the client 20, processes the request, and transmits the result to the client 20.

In particular, the analysis processing server 30 retrieves data stored in the database 60 by using a query that requests the data. A query is a statement in a data manipulation language that describes a question or inquiry arising when data stored in a database is searched or updated; in a database, a query acts as a kind of command. Queries are usually expressed in the Structured Query Language (SQL) of relational databases, but in some cases they may be expressed in a form other than SQL.

The analysis processing server 30 also includes the server cache 40, which temporarily stores all or part of the data fetched from the database 60. The server cache 40 may be implemented in the memory (RAM or the like) of the analysis processing server as a cache memory, or on a hard disk, solid-state drive (SSD), or the like as a cache disk. Alternatively, all data may be stored on disk and only part of it, namely the data currently needed, loaded into the cache memory for use.
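The last arrangement, where everything is kept on disk and only the needed results are held in memory, might look like the sketch below. It is an illustration only: the class name `ServerCache`, the pickle file format, the SHA-256 file naming, and the "evict the oldest entry" rule are all assumptions of this example and are not specified in the text.

```python
import hashlib
import os
import pickle
import tempfile


class ServerCache:
    """Hypothetical two-tier cache: every entry on disk, a few also in memory."""

    def __init__(self, disk_dir, memory_slots=2):
        self.disk_dir = disk_dir
        self.memory_slots = memory_slots
        self.memory = {}  # dicts keep insertion order, oldest entry first

    def _path(self, key):
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.disk_dir, name + ".cache")

    def _promote(self, key, rows):
        # Put the entry in memory and evict the oldest ones beyond the limit.
        self.memory.pop(key, None)
        self.memory[key] = rows
        while len(self.memory) > self.memory_slots:
            del self.memory[next(iter(self.memory))]

    def put(self, key, rows):
        with open(self._path(key), "wb") as handle:
            pickle.dump(rows, handle)  # the cache disk holds all results
        self._promote(key, rows)

    def get(self, key):
        if key in self.memory:
            return self.memory[key]
        path = self._path(key)
        if not os.path.exists(path):
            return None  # not cached at all
        with open(path, "rb") as handle:
            rows = pickle.load(handle)
        self._promote(key, rows)  # load the needed data back into memory
        return rows


with tempfile.TemporaryDirectory() as directory:
    cache = ServerCache(directory, memory_slots=2)
    for name, rows in (("q1", [(1,)]), ("q2", [(2,)]), ("q3", [(3,)])):
        cache.put(name, rows)
    evicted_from_memory = "q1" not in cache.memory
    reloaded = cache.get("q1")   # served from the disk tier
    missing = cache.get("q9")    # never stored
```

Keying the cache by the base query text is one simple option; any stable identifier for the base query would serve the same purpose.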

Next, the BI server 50 acts as a database (DB) interface server that relays access to the database 60. That is, the BI server 50 receives a query from the analysis processing server 30 and uses it to fetch data from the database 60, or requests the data from the DBMS of the database 60.

Even when the database 60 consists of multiple heterogeneous databases, the BI server 50 issues queries and receives data according to the interface of each database. The BI server 50 also performs auxiliary tasks for data transfer, such as encryption, data compression, or file compression.

Next, the database 60 is an ordinary database (DB) for storing data; it includes a DBMS for managing the data and performs operations such as storing, deleting, and searching data through queries. In particular, the database 60 is a commercial database that provides a data query service using general-purpose query functions.

In particular, the database 60 is a database that stores big data. Preferably, the database 60 is a relational database (RDB).

Next, the online analytical processing method using two-level queries by SQL parsing and result caching according to the first embodiment of the present invention is described in more detail with reference to FIG. 3.

As shown in FIG. 3, the online analytical processing method using two-level queries by SQL parsing and result caching according to the first embodiment of the present invention consists of: (a) receiving and parsing a request query (S11); (b) generating a base query and an extended query (S12); (c) looking up the result of the base query in the server cache (S13); (d) if it is not found, fetching the result of the base query from the database and storing it in the server cache (S14); (e) applying the extended query to the result of the base query to obtain the result of the request query (S15); and (f) transmitting the result of the request query (S16).
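Steps S13 to S16 can be sketched end to end with Python's built-in `sqlite3` module. This is a minimal illustration under several assumptions: `AnalysisServer` is a hypothetical class, an in-memory SQLite database stands in for the database 60, the base and extended queries are supplied already generated (S11 and S12 are not shown), and the cached rows are exposed to the extended query as a scratch table named `base_result`, which is a choice of this sketch rather than something the text prescribes.

```python
import sqlite3


class AnalysisServer:
    """Hypothetical stand-in for the analysis processing server 30."""

    def __init__(self, source_db):
        self.source_db = source_db  # plays the role of database 60
        self.cache = {}             # server cache 40: base query -> (columns, rows)
        self.db_requests = 0        # how often the source database was queried

    def _base_result(self, base_query):
        # S13: look for the base query's result in the server cache.
        if base_query not in self.cache:
            # S14: on a miss, fetch it from the database and store it.
            cursor = self.source_db.execute(base_query)
            columns = [d[0] for d in cursor.description]
            self.cache[base_query] = (columns, cursor.fetchall())
            self.db_requests += 1
        return self.cache[base_query]

    def handle(self, base_query, extended_query):
        columns, rows = self._base_result(base_query)
        # S15: apply the extended query to the cached rows (scratch table).
        scratch = sqlite3.connect(":memory:")
        scratch.execute(f"CREATE TABLE base_result ({', '.join(columns)})")
        marks = ", ".join("?" for _ in columns)
        scratch.executemany(f"INSERT INTO base_result VALUES ({marks})", rows)
        result = scratch.execute(extended_query).fetchall()
        scratch.close()
        return result  # S16: this is what would be sent to the client


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
source.executemany("INSERT INTO sales VALUES (?, ?)",
                   [("east", 10), ("west", 5), ("east", 7)])

server = AnalysisServer(source)
base = "SELECT region AS c1, amount AS c2 FROM sales"
totals = server.handle(
    base, "SELECT c1, SUM(c2) FROM base_result GROUP BY c1 ORDER BY c1")
count = server.handle(base, "SELECT COUNT(*) FROM base_result")
```

Both requests share one base query, so the source database is queried only once; the second request is answered entirely from the cached base result.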

First, the step of receiving and parsing the request query (S11) is described. The analysis processing server 30 receives a request query from the client 20 and parses it (S11).

The client 20 installed on the user terminal 10 requests the data it needs from the analysis processing server 30 in the form of a query. Preferably, the request query is written as an SQL query. FIG. 4 shows an example of a request query.

A request query written in SQL consists of a reference item clause (SELECT clause), a table clause and join clause (FROM clause), a condition clause (WHERE clause), a group clause (GROUP BY), an order clause (ORDER BY), and so on. The reference item clause (select list) defines the fields or columns wanted from the data tables; the table clause (table reference) defines the tables from which data is fetched; the join clause defines joins between tables; and the condition clause (where clause) defines conditions. The group clause (group by clause) and the order clause (order by clause) define aggregation and presentation order. The data fields, reference tables, and variables in conditions defined in the request query all refer to fields and tables of the source data in the database 60.

Parsing the request SQL query means analyzing its syntax and extracting the column list (select list), table references, join clauses, condition clause (where clause), group clause (group by clause), order clause (order by clause), and so on, as sets.

특히, SELECT 절(또는 참조항목 절)의 참조항목에서 컬럼명을 추출한다. 또한, 조인 절, 조건 절, 그룹 절, 순서 절 등에서 참조하는 컬럼명들을 모두 추출한다.In particular, extract the column name from the reference clause of the SELECT clause (or reference clause clause). Also, all the column names referenced in the join clause, condition clause, group clause, order clause, and the like are extracted.

요청 쿼리가 SQL 쿼리인 경우, 파싱을 위한 SQL 구문은 다음과 같다.If the request query is an SQL query, the SQL statement for parsing is:

query_block
    : (subquery_factoring_clause)?
      SELECT (((DISTINCT | UNIQUE) | ALL)?) select_list
      FROM (table_reference | join_clause) (COMMA (table_reference | join_clause))*
      where_clause?
      hierarchical_query_clause?
      group_by_clause?
      order_by_clause?
    | LPAREN query_block RPAREN

In particular, when a reference item is a calculation expression, the column names contained in the expression are extracted. Column names used in conditions or expressions in other clauses, such as the condition clause, are extracted as well.
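The extraction described above can be illustrated with a minimal Python sketch. It is not the parser of the invention: instead of the grammar shown above it uses a regular expression that picks up qualified references of the form `alias.column`, wherever they occur (select list, expressions such as `SUM(...)`, conditions, grouping). The helper name `extract_columns` and the sample query are assumptions made for illustration, and the regular expression would also match schema-qualified table names or dotted text inside string literals, which a real parser would exclude.

```python
import re

# Hypothetical pattern: a qualified reference such as "t.h_val".
QUALIFIED_COLUMN = re.compile(r"\b([A-Za-z_]\w*)\.([A-Za-z_]\w*)\b")

def extract_columns(sql: str) -> list[str]:
    """Return distinct qualified column references in order of first appearance."""
    seen: list[str] = []
    for alias, column in QUALIFIED_COLUMN.findall(sql):
        ref = f"{alias}.{column}".lower()
        if ref not in seen:
            seen.append(ref)
    return seen

request_query = """
select t.customer, sum(t.h_val)
from matrix_demo t
where t.yyyy = '2013'
group by t.customer
"""

# The column inside SUM(...) and the column in the WHERE condition are both found.
columns = extract_columns(request_query)
print(columns)
```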

In addition, an identifier, or unique key, that distinguishes each extracted column name from other column names is generated. This unique key is an identifier for the absolute name of the column. The absolute name is the name composed of the name of the referenced database, the name of the referenced table, and the column name. The absolute name of a column can therefore be expressed as follows.

absolute name of column = <database name>.<table name>.<column name>

If the database does not need to be identified, it can also be expressed as follows.

absolute name of column = <table name>.<column name>

In contrast to the absolute name, the column name by itself is also called the relative name of the column.

The unique key of a column name is obtained by hashing the absolute name of the column. The formula for generating the unique key is as follows.

[Equation 1]

unique key = hash((domain name) + database name + table name + column name + function name)

The unique key of a column name therefore serves as an identifier for the column. That is, a column can be identified by its unique key.
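Equation 1 can be sketched in Python as follows. The description does not say which hash function is used or how the key is formatted, so the choice of SHA-1, the truncation to eight hexadecimal digits, and the leading letter "C" (which keeps the key usable as an SQL alias) are all assumptions; keys produced this way will not reproduce the key values printed in this description.

```python
import hashlib

def unique_key(database: str, table: str, column: str,
               function: str = "", domain: str = "") -> str:
    """Hash the absolute column name (plus optional domain and function name)."""
    # Upper-casing makes the key independent of how the query spelled the names.
    absolute = ".".join(part.upper() for part in (domain, database, table, column, function))
    digest = hashlib.sha1(absolute.encode("utf-8")).hexdigest().upper()
    return "C" + digest[:8]

# The same column yields the same key regardless of spelling in the query,
# while equally named columns of different tables yield different keys.
key_a = unique_key("matrix", "matrix_demo", "customer")
key_b = unique_key("MATRIX", "MATRIX_DEMO", "CUSTOMER")
key_c = unique_key("matrix", "matrix_demo1", "id")
key_d = unique_key("matrix", "matrix_demo2", "id")
print(key_a == key_b, key_c == key_d)
```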

The unique key of a column name is used to identify each column name through aliasing when the base query or the request query is generated. Every column name of the base query is written with its unique key as an alias, so that every column name in the result data table of the base query is the unique key. In other words, column names are aliased to their identifiers (unique keys) so that columns can be identified in automatically generated queries.

For example, suppose query 1 and query 2 are as follows.

[Query 1]

select t1.customer
from matrix_demo t1

[Query 2]

select m.customer
from matrix_demo m

In this case, customer in query 1 and query 2 is the same column of the same table. However, because there is no alias key, t1.customer and m.customer may be judged to be different.

Suppose also that query 1 and query 2 are as follows.

[Query 1]

select t.id
from matrix_demo1 t

[Query 2]

select t.id
from matrix_demo2 t

In this case too, because there is no unique key as an alias, the columns look the same but are actually different columns.

If the unique key is applied as an alias, the queries are generated as follows.

[Query 1]

select t1.customer C9A59FD7B
from matrix_demo t1

[Query 2]

select m.customer C9A59FD7B
from matrix_demo m

Thus, from the unique key "C9A59FD7B" alone, it can be confirmed that the database name, table name, column name, and function name are the same.

Also, preferably, the table name aliases are removed from the table clause, and every expression written with a table alias is changed to the original table name. For example, if MATRIX_DEMO is aliased as T in the earlier example, T is changed to the original table name (MATRIX_DEMO).

Also, preferably, when multiple databases are used, the absolute name of the table is obtained for each table name in the table reference clause. The absolute name of a table consists of the database name and the table name, and is expressed as <database name>.<table name>.

Meanwhile, the table clause and the join clause are both contained in the FROM clause of the SQL syntax. That is, the FROM clause is the clause for referencing tables and consists of tables and joins. Hereinafter, the clause containing the table clause and the join clause is therefore called the "table reference clause".

Next, the analysis processing server 30 generates a base query and an extended query using the parsed request query (S12).

The base query is a query that requests data by referring to the data in the database 60, and the extended query is a query that requests data by referring to the data extracted by the base query (the result data of the base query). FIG. 5 shows an example of a base query and an extended query.

As shown in FIG. 5, in the base query (BaseSQL), the data field names in the SELECT statement such as "CUSTOMER", "PRODUCT", "C", and "C2", the referenced table names such as "MATRIX_DEMO", and the condition variables (data field variables) such as "YYYY" all refer directly to the source data in the database 60.

First, the base query is generated as follows.

The reference items are built from the column names (the column list) extracted by SQL parsing. Preferably, when a reference item of the request query is a calculation expression, the column names contained in that expression become reference items of the base query. Column names referenced in other clauses, such as the condition clause, also become reference items of the base query.

In the example of FIG. 4, the third reference item of the request query is the calculation expression "SUM(T.H_VAL)". In addition, "T.YYYY = '2013'" in the condition clause (where clause) is a condition that contains the column name "T.YYYY". Therefore, in the base query of FIG. 5, the column name "T.H_VAL" in the expression "SUM(T.H_VAL)" and the column name "T.YYYY" in the condition "T.YYYY = '2013'" are generated as reference items of the reference item clause.

The unique key of each column name is then attached to its reference item as an alias. That is, the unique key is defined as the alias.

The table clause (or table reference clause), join clause, and condition clause of the base query are constructed identically to those of the request query.

Preferably, however, if a table name is aliased in the table clause of the request query, the alias is removed, and every table name written as an alias in the reference item clause, join clause, and condition clause is changed to the name (or absolute name) of the table.

As an example, consider the following request query.

[Request Query 1]

select t.customer, sum(t.h_val)
from matrix_demo t
where t.yyyy = '2013'
group by t.customer

The base query generated from Request Query 1 is as follows.

[Base Query 1]

SELECT MATRIX_DEMO.CUSTOMER C9A59FD7B, MATRIX_DEMO.YYYY CEB41FFF7, MATRIX_DEMO.H_VAL CB165E5C5
FROM MATRIX.MATRIX_DEMO
WHERE MATRIX_DEMO.YYYY = '2013'

That is, the request query declares "t" as the alias of the table matrix_demo, whereas in the base query every occurrence is changed to the table name matrix_demo.
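The assembly of a base query from the parsed parts can be sketched as below. This is only an illustration under stated assumptions: the function `build_base_query` and its parameters are hypothetical, the unique keys are taken as given (copied from [Base Query 1]) rather than computed, and the condition is passed in with the table alias already replaced by the table name.

```python
def build_base_query(database: str, table: str,
                     key_by_column: dict[str, str], where: str) -> str:
    """Assemble a base query: every extracted column aliased by its unique key."""
    items = ", ".join(f"{table}.{column} {key}" for column, key in key_by_column.items())
    return f"SELECT {items}\nFROM {database}.{table}\nWHERE {where}"

# Columns extracted from the select list, SUM(...) and the WHERE condition,
# with the unique keys shown in [Base Query 1].
KEYS = {"CUSTOMER": "C9A59FD7B", "YYYY": "CEB41FFF7", "H_VAL": "CB165E5C5"}

base_sql = build_base_query("MATRIX", "MATRIX_DEMO", KEYS, "MATRIX_DEMO.YYYY = '2013'")
print(base_sql)
```

Note that the aggregate and the GROUP BY of the request query are deliberately absent: the base query only fetches the raw columns, and aggregation is left to the extended query.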

Next, the extended query is generated.

The extended query has the same structure as the request query except for the table clause; instead of the tables referenced in the table clause, it refers to the base query or to the result data table of the base query (the result table that the base query fetches from the source database).

In addition, all column names are converted to the unique keys of the columns. Because the table referenced by the extended query is the result table of the base query, every column of that table must be referenced by the unique key declared in the base query.

The extended query derived from [Request Query 1] and [Base Query 1] above is as follows.

[Extended Query 1]

SELECT MHC.C9A59FD7B, SUM(MHC.CB165E5C5) AS "CB165E5C5"
FROM ( {@ORIGINAL_SQL@} ) MHC
WHERE MHC.CEB41FFF7 = '2013'
GROUP BY MHC.C9A59FD7B

Here, "{@ORIGINAL_SQL@}" denotes a reference to the base query or to the result table of the base query.

Thus, once the data extracted by the base query has been obtained, the extended query is the query issued against that result data. The result obtained by the extended query is identical to the result obtained by the original request query. In addition, the set of result data obtained by the extended query is always smaller than the set of result data obtained by the base query. That is, the data set of the extended query can be regarded as a subset of the data set of the base query.

Also, when the result data of the base query is not available, the result of the request query can be obtained by substituting the base query for {@ORIGINAL_SQL@} in [Extended Query 1] and querying the source database.
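The equivalence between the request query and the extended query with the base query substituted for {@ORIGINAL_SQL@} can be checked with a small Python sketch. It uses an in-memory SQLite database as a stand-in for the source database, which is an assumption; the sample rows are invented, the `MATRIX.` schema prefix is dropped because no such schema exists in the stand-in, and an ORDER BY is added to both queries only so that the two result lists can be compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the source database
conn.execute("CREATE TABLE MATRIX_DEMO (CUSTOMER TEXT, YYYY TEXT, H_VAL INTEGER)")
conn.executemany(
    "INSERT INTO MATRIX_DEMO VALUES (?, ?, ?)",
    [("A", "2013", 10), ("A", "2013", 5), ("B", "2013", 7), ("B", "2012", 99)],
)

REQUEST_SQL = """
SELECT t.customer, SUM(t.h_val) FROM matrix_demo t
WHERE t.yyyy = '2013' GROUP BY t.customer ORDER BY t.customer
"""
BASE_SQL = """
SELECT MATRIX_DEMO.CUSTOMER C9A59FD7B, MATRIX_DEMO.YYYY CEB41FFF7, MATRIX_DEMO.H_VAL CB165E5C5
FROM MATRIX_DEMO WHERE MATRIX_DEMO.YYYY = '2013'
"""
EXTENDED_TEMPLATE = """
SELECT MHC.C9A59FD7B, SUM(MHC.CB165E5C5) AS "CB165E5C5"
FROM ( {@ORIGINAL_SQL@} ) MHC
WHERE MHC.CEB41FFF7 = '2013'
GROUP BY MHC.C9A59FD7B ORDER BY MHC.C9A59FD7B
"""

# Substitute the base query for the placeholder and run both forms.
extended_sql = EXTENDED_TEMPLATE.replace("{@ORIGINAL_SQL@}", BASE_SQL)
direct = conn.execute(REQUEST_SQL).fetchall()
two_level = conn.execute(extended_sql).fetchall()
print(direct == two_level)
```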

Next, the server cache 40 is searched to determine whether the result data of the base query is stored there (S13).

The data fetched from the database 60 by a base query (the result data of the base query) is stored and kept in the server cache 40. The analysis processing server 30 searches the server cache 40 for the base query generated above. That is, it checks whether the base query generated above exists among the stored base queries.

In the comparison performed for this search, two base queries are judged to match when their reference item clauses, table/join clauses, and condition clauses are identical. However, when the referenced tables (including the joins between them) are known to be the same, only the reference item clause (SELECT clause) and the condition clause (where clause) need to be compared. In particular, for the reference item clause it is sufficient to compare the unique keys of the column names, that is, to check only whether the aliases are identical.

For example, the SQL used in the applicant's Matrix (matrix) is SQL generated automatically from a meta. Once meta item 1 and meta item 2 have been selected, the table codes, column codes, and join conditions of those meta items can be generated automatically. SQL generated automatically from the same meta can be recognized as identical by comparing the field aliases alone. That is, only an additional comparison of the conditions is needed to check whether a base query can be reused. When queries are compared without a meta, however, the tables and the join relationships between them must also be compared.
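One way to picture this comparison is as a lookup key built from the parts that must match. The sketch below is an assumption about how such a key could be formed, not the comparison routine of the invention: the names `BaseQuerySignature`, `make_signature`, and the `same_meta` flag are hypothetical, the column keys are compared as an unordered set, and the table reference is normalized by upper-casing and collapsing whitespace.

```python
from typing import NamedTuple

class BaseQuerySignature(NamedTuple):
    column_keys: frozenset   # unique keys used as aliases in the SELECT clause
    table_reference: str     # normalized FROM clause (tables and joins)
    condition: str           # WHERE clause text

def make_signature(column_keys, table_reference, condition, same_meta=False):
    # For SQL generated from the same meta, tables and joins follow from the
    # selected items, so the table reference is left out of the comparison.
    tables = "" if same_meta else " ".join(table_reference.upper().split())
    return BaseQuerySignature(frozenset(column_keys), tables, condition.strip())

server_cache = {}  # signature -> cached result rows of the base query

stored = make_signature(["C9A59FD7B", "CEB41FFF7", "CB165E5C5"],
                        "MATRIX.MATRIX_DEMO", "MATRIX_DEMO.YYYY = '2013'")
server_cache[stored] = [("A", "2013", 10)]

# A later base query listing the same keys in another order hits the cache;
# a different condition does not.
hit = server_cache.get(make_signature(["CB165E5C5", "C9A59FD7B", "CEB41FFF7"],
                                      "matrix.matrix_demo", "MATRIX_DEMO.YYYY = '2013'"))
miss = server_cache.get(make_signature(["C9A59FD7B", "CEB41FFF7", "CB165E5C5"],
                                       "MATRIX.MATRIX_DEMO", "MATRIX_DEMO.YYYY = '2012'"))
print(hit is not None, miss is None)
```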

Next, if the result data of the base query is not stored in the server cache, data is requested from the database 60 with the base query, and the result data of the base query, once received, is stored in the server cache 40 (S14).

As described above, the base query refers directly to the data stored in the database 60, so the query request is made to the database 60 with that base query. The query request is sent to the database 60 through the BI server 50, and the data extracted from the database 60 by the base query is returned to the analysis processing server 30 through the BI server 50. The analysis processing server 30 stores the received result data of the base query in the server cache 40.

Meanwhile, the result data of the base query is stored in the same form or structure as the data of the database 60. That is, if the data of the database 60 is stored in tables, the result data of the base query is also stored in table form. The type, size, and so on of each field of the result data stored in the server cache 40 are also the same as those of the corresponding field in the database 60. This allows the extended query to run against the result data of the base query instead of the data stored in the database 60.

At this time, the column names of the result table are converted to the unique keys of the columns.

Next, the analysis processing server 30 applies the extended query to the result of the base query to obtain the result of the request query (S15).

The extended query has the same structure as the request query but refers to the base query instead of the database 60. It is therefore generated by changing the names that refer to the database 60 (hereinafter, database reference names) to names that refer to the result data of the base query (hereinafter, base query reference names). As described above, every table name referenced in the extended query is changed to refer to the table (result table) generated by the base query, and every data field name (column name) referenced in the extended query is changed to the data field name (unique key of the column name) generated by the base query.

In step S13, the result data of the base query may or may not have been found in the server cache 40. If it was not, it is requested from the database 60 with the base query in step S14 and stored in the server cache 40. By this step (S15), therefore, the result data of the base query is always stored in the server cache 40.

The extended query is a query that refers to the data of the base query, so it can be applied to the result data of the base query. Applying the extended query to the result data of the base query yields the result data of the originally requested request query.

Finally, the result data obtained by applying the extended query is transmitted to the client 20 as the result of the request query (S16).
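Steps S13 to S15 of the first embodiment can be sketched in Python as follows. Two in-memory SQLite connections stand in for the database 60 and the server cache 40, which is an assumption made only so that the sketch runs; the function names, the sample rows, and the cache table naming are hypothetical. The point illustrated is that two different extended queries sharing one base query cause only a single request to the database.

```python
import sqlite3

source_db = sqlite3.connect(":memory:")   # stands in for database 60
cache_db = sqlite3.connect(":memory:")    # stands in for server cache 40
cached_tables = {}                         # base query text -> cache table name
stats = {"database_requests": 0}

source_db.execute("CREATE TABLE MATRIX_DEMO (CUSTOMER TEXT, YYYY TEXT, H_VAL INTEGER)")
source_db.executemany(
    "INSERT INTO MATRIX_DEMO VALUES (?, ?, ?)",
    [("A", "2013", 10), ("A", "2013", 5), ("B", "2013", 7), ("B", "2012", 99)],
)

COLUMN_KEYS = ["C9A59FD7B", "CEB41FFF7", "CB165E5C5"]
BASE_SQL = (
    "SELECT MATRIX_DEMO.CUSTOMER C9A59FD7B, MATRIX_DEMO.YYYY CEB41FFF7, "
    "MATRIX_DEMO.H_VAL CB165E5C5 FROM MATRIX_DEMO WHERE MATRIX_DEMO.YYYY = '2013'"
)

def base_result_table(base_sql, column_keys):
    """S13/S14: look up the base query; on a miss, fetch it and fill the cache."""
    if base_sql in cached_tables:
        return cached_tables[base_sql]
    rows = source_db.execute(base_sql).fetchall()
    stats["database_requests"] += 1
    table = f"base_{len(cached_tables)}"
    # The cached table's columns are named by the unique keys.
    cache_db.execute(f"CREATE TABLE {table} ({', '.join(column_keys)})")
    marks = ", ".join("?" for _ in column_keys)
    cache_db.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)
    cached_tables[base_sql] = table
    return table

def answer(base_sql, column_keys, extended_template):
    """S15: apply the extended query to the cached result of the base query."""
    table = base_result_table(base_sql, column_keys)
    extended_sql = extended_template.replace("{@ORIGINAL_SQL@}", f"SELECT * FROM {table}")
    return cache_db.execute(extended_sql).fetchall()

per_customer = answer(
    BASE_SQL, COLUMN_KEYS,
    "SELECT MHC.C9A59FD7B, SUM(MHC.CB165E5C5) FROM ( {@ORIGINAL_SQL@} ) MHC "
    "WHERE MHC.CEB41FFF7 = '2013' GROUP BY MHC.C9A59FD7B ORDER BY MHC.C9A59FD7B",
)
total = answer(BASE_SQL, COLUMN_KEYS,
               "SELECT SUM(MHC.CB165E5C5) FROM ( {@ORIGINAL_SQL@} ) MHC")
print(per_customer, total, stats["database_requests"])
```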

Next, an online analytical processing method using two-step queries by SQL parsing and result caching according to a second embodiment of the present invention is described in more detail with reference to FIG. 6.

As shown in FIG. 6, the online analytical processing method according to the second embodiment comprises: (a) receiving and parsing a request query (S21); (b) generating a base query and an extended query (S22); (c) searching the server cache for the result of the base query (S23); (d) if it is not found, fetching the result of the request query from the database and transmitting it (S24); (h) fetching the result of the base query from the database and storing it in the server cache (S28); (e) if it is found, executing the extended query (S25); (f) applying the extended query to the result of the base query to obtain the result of the request query (S26); and (g) transmitting the result of the request query (S27).

Compared with the first embodiment, the difference is that when the base query is not found in the server cache, the request query itself is sent to the database 60 and its result is transmitted directly to the client 20 (S24). After the result data of the request query has been transmitted, the base query is then sent to the database 60 and its result data is stored in the server cache 40 (S28). For anything omitted below, refer to the description of the first embodiment.

First, the analysis processing server 30 receives a request query from the client 20 and parses it (S21), as in the first embodiment. Next, the analysis processing server 30 generates a base query and an extended query using the parsing result (S22).

The server cache 40 is then searched to determine whether the result data of the base query is stored there (S23). The data fetched from the database 60 by a base query (the result data of the base query) is stored and kept in the server cache 40. The analysis processing server 30 searches by comparing the base query generated above with the base queries whose result data is stored in the server cache 40.

Next, if the result data of the base query is not stored in the server cache, data is requested from the database 60 with the request query, and the result data of the request query, once received, is transmitted to the client 20 (S24). Alternatively, the desired result data (result table) can also be obtained by replacing the table clause in the extended query with the base query and sending the extended query directly to the database 60.

After transmitting the result data of the request query to the client 20, the analysis processing server 30 requests data from the database 60 with the base query and, on receiving the result data of the base query, stores it in the server cache 40 (S28). In particular, the analysis processing server 30 uses a scheduler to request the data for the base query at a time when the database 60 is receiving few requests and there is spare traffic capacity, and stores the result in the server cache 40.
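The deferred cache fill of the second embodiment can be sketched as below. The description does not specify how the scheduler works, so the queue of pending base queries and the function names `handle_miss` and `run_scheduled_fills` are assumptions; the database is represented by a plain callable (here a fake that records the SQL it receives) so that the sketch stays self-contained.

```python
from collections import deque

server_cache = {}        # base query text -> result rows (server cache 40)
pending_fills = deque()  # base queries waiting for a quiet period

def handle_miss(request_sql, base_sql, run_on_database):
    """S24: answer directly with the request query and defer the cache fill."""
    result = run_on_database(request_sql)
    if base_sql not in server_cache and base_sql not in pending_fills:
        pending_fills.append(base_sql)
    return result

def run_scheduled_fills(run_on_database):
    """S28: invoked by a scheduler when the database has spare capacity."""
    while pending_fills:
        base_sql = pending_fills.popleft()
        server_cache[base_sql] = run_on_database(base_sql)

# Fake database that records what it was asked, for demonstration only.
calls = []
def fake_database(sql):
    calls.append(sql)
    return [("rows for", sql)]

first = handle_miss("REQUEST 1", "BASE 1", fake_database)
second = handle_miss("REQUEST 2", "BASE 1", fake_database)   # same base query, queued once
run_scheduled_fills(fake_database)
print(calls, list(server_cache))
```

The client waits only for the request query; the typically larger base query runs later, at the cost of one extra database round trip per new base query.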

Next, the case where the result data of the base query is stored in the server cache is described. The analysis processing server 30 applies the extended query to the result data of the base query to obtain the result of the request query (S25). The obtained result data is transmitted to the client 20 (S26).

Next, an online analytical processing method using two-step queries by SQL parsing and result caching according to a third embodiment of the present invention is described in more detail with reference to FIG. 7.

As shown in FIG. 7, the online analytical processing method according to the third embodiment comprises: (a) receiving and parsing a request query (S31); (b) generating a base query and an extended query (S31); (c) searching the cache files using the combination of the base query and the extended query (S32); (d) if found, transmitting the cache file to the client (S33); (e) if not found, performing the first or second embodiment (S34); and (f) storing the result data of the request query as a cache file (S35).

The third embodiment complements the first or second embodiment described above. The result data of a request query is stored as a cache file in binary form, and when the same query is requested again, that cache file is transmitted directly to the client 20.

A cache file is the result data of a request query stored as a file. When the analysis processing server 30 builds the result data requested by the client 20 and finally transmits it, it transmits it in the form of a file. The cache file is the same file as the one transmitted. Therefore, when a request query identical to the query of a cache file is received, that cache file can simply be transmitted as is.

Preferably, the cache file is stored in the server cache 40 of the analysis processing server 30.

Specifically, the analysis processing server 30 receives a request query from the client 20 and parses it (S30), as in the first or second embodiment. The analysis processing server 30 then combines the base query and the extended query, compares the combination with the queries of the stored cache files, and searches for an identical query (S32).

If an identical query exists among the cache files, the matching cache file is transmitted directly to the client 20 (S33).

If no identical query exists, processing continues from the third step, the search step (S13, S23), of the first or second embodiment (S34). That is, the server cache is searched for the result data of the base query. If the server cache holds the result data of the base query, the extended query is applied to it to obtain the result data, and the obtained result data is transmitted to the client. If the server cache does not hold the result data of the base query, result data is fetched from the database 60 with the base query or the request query. If the fetched data is base query data, the result data of the request query is generated through the extended query. Finally, the result data of the request query is transmitted to the client 20.

When the first or second embodiment has completed, the result data of the combination of the generated base query and extended query (the result data transmitted to the client) is stored in the server cache 40 as a cache file (S36).
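The cache-file lookup of the third embodiment can be sketched in Python as follows. The file naming scheme (a SHA-256 digest of the combined base and extended query), the use of `pickle` as the binary format, and the temporary directory are all assumptions chosen for illustration; the description only states that the result is stored as a binary file and found again by the combination of the two queries.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path

CACHE_DIR = Path(tempfile.mkdtemp(prefix="olap_cache_"))  # hypothetical location

def cache_path(base_sql: str, extended_sql: str) -> Path:
    """Derive the cache file name from the combination of both queries."""
    combined = base_sql + "\n--\n" + extended_sql
    return CACHE_DIR / (hashlib.sha256(combined.encode("utf-8")).hexdigest() + ".bin")

def load_cached_result(base_sql: str, extended_sql: str):
    """S32/S33: return the stored result, or None when no cache file matches."""
    path = cache_path(base_sql, extended_sql)
    return pickle.loads(path.read_bytes()) if path.exists() else None

def store_result(base_sql: str, extended_sql: str, rows) -> None:
    """Final step: keep the result sent to the client as a binary cache file."""
    cache_path(base_sql, extended_sql).write_bytes(pickle.dumps(rows))

before = load_cached_result("BASE 1", "EXTENDED 1")          # miss on first request
store_result("BASE 1", "EXTENDED 1", [("A", 15), ("B", 7)])
after = load_cached_result("BASE 1", "EXTENDED 1")           # hit on repeat request
other = load_cached_result("BASE 1", "EXTENDED 2")           # different extended query
print(before, after, other)
```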

다음으로, 본 발명의 제4 실시예에 따른 ＳＱＬ 파싱에 의한 2단계 쿼리 및 결과 캐싱을 이용한 온라인 분석 프로세싱 방법을 도 8을 참조하여 구체적으로 설명한다.Next, an online analysis processing method using a two-step query and result caching by SQL parsing according to a fourth embodiment of the present invention will be described in detail with reference to FIG.

본 발명의 제4 실시예는 앞서 설명한 제1 내지 제3 실시예와 동일한 구성을 가진다. 다만, 서버 캐시(40)의 구성이 보다 세분화된다.The fourth embodiment of the present invention has the same configuration as the first to third embodiments described above. However, the configuration of the server cache 40 is further subdivided.

도 8에서 보는 바와 같이, 본 발명의 제4 실시예는 서버 캐시(40)를 캐시 메모리(41)와 캐시 디스크(42)로 구성한다.As shown in FIG. 8, the fourth embodiment of the present invention comprises a cache memory 41 and a cache disk 42 in the server cache 40.

캐시 메모리(41)는 분석처리 서버(30)의 RAM(Random access memory)으로 구성된다. 특히, 캐시 메모리(41)는 인메모리 스토리지로 구성된다. 캐시 디스크(42)는 분석처리 서버(30)의 하드 디스크 또는 SSD(Solid State Disk) 등으로 구성된다.The cache memory 41 is constituted by a RAM (Random Access Memory) of the analysis processing server 30. In particular, the cache memory 41 is comprised of in-memory storage. The cache disk 42 is constituted by a hard disk of the analysis processing server 30 or a solid state disk (SSD).

앞서 본 발명의 제1 내지 제3 실시예에서, 서버 캐시(40)에 저장되는 기초 쿼리의 결과 데이터는 모두 캐시 메모리(41)에 저장된다. 다만, 캐시 메모리(41)의 저장 용량 보다 기초 쿼리의 결과 데이터가 더 많은 경우, 캐시 메모리의 용량을 초과하는 결과 데이터는 캐시 디스크(42)에 저장된다.In the first to third embodiments of the present invention, the result data of the basic query stored in the server cache 40 are all stored in the cache memory 41. [ However, when the result data of the basic query is larger than the storage capacity of the cache memory 41, the result data exceeding the capacity of the cache memory is stored in the cache disk 42. [

At this time, the result data of the base query to be moved to the cache disk 42 is selected according to a predetermined policy. As an example of a selection policy, result data with a low access frequency or with the oldest last-access time is selected, based on the access frequency of the result data, its last-access time, and the like.
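The selection policy above can be illustrated with a minimal sketch. The class name `TwoTierCache`, its methods, the use of an entry count instead of a byte capacity, and the `pickle` file format are all assumptions made for illustration; the specification only says that entries with low access frequency or the oldest last-access time are moved to the cache disk.

```python
import os
import pickle
import tempfile


class TwoTierCache:
    """Hypothetical stand-in for cache memory 41 backed by cache disk 42."""

    def __init__(self, max_memory_entries, disk_dir):
        # Assumption: capacity is counted in entries (>= 1), not bytes.
        self.max_memory_entries = max_memory_entries
        self.disk_dir = disk_dir
        self.memory = {}     # key -> result data of a base query
        self.hits = {}       # key -> access frequency
        self.last_used = {}  # key -> logical time of the last access
        self._clock = 0

    def _touch(self, key):
        self._clock += 1
        self.hits[key] = self.hits.get(key, 0) + 1
        self.last_used[key] = self._clock

    def _disk_path(self, key):
        # Assumption: keys are filesystem-safe strings (e.g. hex digests).
        return os.path.join(self.disk_dir, f"{key}.cache")

    def put(self, key, rows):
        self.memory[key] = rows
        self._touch(key)
        while len(self.memory) > self.max_memory_entries:
            self._spill_one(exclude=key)

    def _spill_one(self, exclude):
        # Pick the least frequently used entry; ties go to the oldest access.
        candidates = [k for k in self.memory if k != exclude]
        victim = min(candidates, key=lambda k: (self.hits[k], self.last_used[k]))
        with open(self._disk_path(victim), "wb") as f:
            pickle.dump(self.memory.pop(victim), f)

    def get(self, key):
        if key in self.memory:
            self._touch(key)
            return self.memory[key]
        path = self._disk_path(key)
        if os.path.exists(path):
            # Reload from disk into memory, possibly spilling another entry.
            with open(path, "rb") as f:
                rows = pickle.load(f)
            os.remove(path)
            self.put(key, rows)
            return rows
        return None
```

For example, `TwoTierCache(2, tempfile.mkdtemp())` keeps at most two result sets in memory and writes the least-used one to a file when a third is stored.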

In addition, the cache file of the third embodiment is stored in the cache disk 42.

The effects of the present invention will be described in more detail with reference to FIGS. 9 to 12.

The present invention relates to a platform for processing big data based on business intelligence (BI). In particular, it aims at near-real-time processing by answering a request for big data within a short response time of 10 seconds or less. For this near-real-time processing, the present invention uses cache files and cache memory tables. To this end, the request query is parsed and separated into a base query and an extended query.

The present invention also defines a file-based loading/storing/filtering structure to cope with memory limits. For example, if loading about 100 million rows into a memory table takes about 5 GB, a server with 32 GB cannot hold about 1 billion rows in memory. A structure is therefore needed that quickly stores part of the data in file form and loads it back into memory when required. Conditions (filters, and conditions extracted from the relevant columns by parsing the query) are applied to the file itself so that only the desired data is processed.
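The arithmetic behind this example can be checked with a short sketch. The function names are hypothetical, and the linear scaling of memory with row count is an assumption; the 5 GB per 100 million rows and the 32 GB server are the figures given above.

```python
def estimated_memory_gb(rows, gb_per_100m_rows=5.0):
    """Estimate memory use, assuming it scales linearly with row count."""
    return rows / 100_000_000 * gb_per_100m_rows


def fits_in_memory(rows, server_memory_gb=32.0):
    """Return True if the estimated table size fits in server memory."""
    return estimated_memory_gb(rows) <= server_memory_gb


# 1 billion rows would need about 50 GB, which exceeds a 32 GB server.
print(estimated_memory_gb(1_000_000_000))  # 50.0
print(fits_in_memory(1_000_000_000))       # False
```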

The request query is split into a base query (BaseSQL) and an extended query so that the server cache (an in-memory database) provided in the analysis processing server 30 can be used instead of the comparatively slow relational database (RDB). This dramatically improves speed. An in-memory database is a database management system that is installed and operated in main memory as its data storage. Its processing speed is faster than that of a database installed on disk.

Specifically, the processing speed in each situation when the first to fourth embodiments of the present invention are applied will now be described.

The first situation is the case where neither the base query nor the extended query matches. That is, the first situation corresponds to a query being executed for the first time, and the overall processing takes the same time as in a system according to the prior art.

As shown in FIG. 9, the DB code and SQL information extracted from a Matrix report are first separated into a base query (BaseSQL) and an extended query (ExtendSQL) and sent to the analysis processing server (SOLAP server) (step ①). Because there is no matching base query (BaseSQL) or extended query (ExtendSQL), the original request query is sent to the BI server (Matrix Server) (step ②). The BI server connects to the target DB and requests the data (step ③). The target DB returns the cube data (step ④). The field information and data are then compressed and sent (step ⑤). The file that was sent is stored as a cache file (step ⑥). The cache file is sent to the browser (or client) (step ⑦). Finally, the scheduler executes the base query (BaseSQL) on the source DB and stores the result in the cache memory (40) as a background task (step ⑧).

The second situation is the case where the base query matches and the extended query does not. This corresponds to the case where a cache memory table for the base query (BaseSQL) has already been created; the response time is about 10 seconds, which is 10 to 50 times faster than in the first situation (or the prior art).

As shown in FIG. 10, the DB code and SQL information extracted from a Matrix report are first separated into a base query (BaseSQL) and an extended query (ExtendSQL) and sent to the SOLAP server (step ①). If the base query (BaseSQL) matches and there is no matching extended query (ExtendSQL), the extended query (ExtendSQL) is executed after its target table is changed to the name of the table stored in the server cache (step ②). The server cache then returns the cube data (step ③). The file that was sent is stored as a cache file (step ④). Finally, the cache file is sent to the browser (step ⑤).
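The table-name substitution in step ② can be illustrated with a minimal sketch. It uses Python's standard `sqlite3` module with an in-memory database as a stand-in for the server cache; the function name `run_extended_query`, the `{table}` placeholder, and the table name `base_cache` are assumptions made for illustration and do not come from the specification.

```python
import sqlite3


def run_extended_query(base_rows, columns, extended_sql_template,
                       cache_table="base_cache"):
    """Apply an extended query to cached base-query rows held in memory."""
    conn = sqlite3.connect(":memory:")
    try:
        # Load the cached base-query result into an in-memory table.
        # Sketch only: column and table names are trusted, not sanitized.
        conn.execute(f"CREATE TABLE {cache_table} ({', '.join(columns)})")
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(
            f"INSERT INTO {cache_table} VALUES ({placeholders})", base_rows
        )
        # Point the extended query at the cached table instead of the RDB.
        sql = extended_sql_template.format(table=cache_table)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


rows = [("east", 10), ("east", 5), ("west", 7)]
result = run_extended_query(
    rows,
    ["region", "amount"],
    "SELECT region, SUM(amount) FROM {table} GROUP BY region ORDER BY region",
)
print(result)  # [('east', 15), ('west', 7)]
```

The aggregation runs entirely against the cached rows, so the source database is not contacted in this situation.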

The third situation is the case where both the base query (BaseSQL) and the extended query (ExtendSQL) match. The response time is about 3 seconds, which is more than 100 times faster than in the first situation or the prior art.

As shown in FIG. 11, the DB code and SQL information extracted from a Matrix report are separated into a base query (BaseSQL) and an extended query (ExtendSQL) and sent to the analysis processing server (step ①). If both the base query (BaseSQL) and the extended query (ExtendSQL) match, a hash key value exists for them (step ②). The cache file is then sent to the browser (step ③).
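A hash-key lookup of this kind might look like the following sketch. The specification does not say which hash function or normalization is used; SHA-256, whitespace normalization, and the names `cache_key` and `lookup_cache_file` are assumptions made for illustration.

```python
import hashlib


def cache_key(base_sql, extend_sql):
    """Derive a hash key from a base query and an extended query."""
    # Assumption: collapse runs of whitespace so formatting differences
    # do not produce different keys.
    normalized = " ".join(base_sql.split()) + "\n" + " ".join(extend_sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def lookup_cache_file(cache_files, base_sql, extend_sql):
    """Return the cached file path if the hash key exists, else None."""
    return cache_files.get(cache_key(base_sql, extend_sql))


# Hypothetical index mapping hash keys to cache file paths.
cache_files = {}
key = cache_key("SELECT a, b FROM t", "SELECT a, SUM(b) FROM base GROUP BY a")
cache_files[key] = f"/tmp/{key}.cache"

hit = lookup_cache_file(
    cache_files, "SELECT a,  b FROM t", "SELECT a, SUM(b) FROM base GROUP BY a"
)
print(hit is not None)  # True
```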

A table comparing the processing speeds and other characteristics of the first to third situations is shown in FIG. 12. When the same query is executed again, the state advances automatically: from the first situation to the second, and from the second to the third. In addition, as the number of concurrent users increases, the rate of cache file use rises sharply (it is expected to be 90% or more). Most users will see query time drop from 5 minutes to 3 seconds. Furthermore, in the first execution, the data for the base query can be fetched by the scheduler. That is, the scheduler can fetch the data at an idle time, so the perceived speed is not affected.
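The three situations can be summarized as a simple decision function. This is a sketch only: the function name `classify_situation` and the use of plain sets of keys to represent the cache memory tables and cache files are assumptions, not structures described in the specification.

```python
def classify_situation(base_key, result_key, memory_tables, cache_files):
    """Return 1, 2, or 3 according to which cached artifacts already exist.

    memory_tables: keys of base queries loaded as cache memory tables.
    cache_files:   keys of base/extended query pairs stored as cache files.
    """
    if result_key in cache_files:
        return 3  # both queries match: serve the cache file directly
    if base_key in memory_tables:
        return 2  # base query matches: run the extended query on the cache
    return 1      # nothing matches: fall back to the original request query


memory_tables = {"base-1"}
cache_files = {"base-1|ext-1"}
print(classify_situation("base-1", "base-1|ext-1", memory_tables, cache_files))  # 3
print(classify_situation("base-1", "base-1|ext-2", memory_tables, cache_files))  # 2
print(classify_situation("base-9", "base-9|ext-1", memory_tables, cache_files))  # 1
```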

The invention made by the present inventors has been described above in detail with reference to the embodiments. However, the present invention is not limited to these embodiments, and various modifications can of course be made without departing from its gist.

10: user terminal 20: client
30: analysis processing server 40: server cache
41: cache memory 42: cache disk
50: BI server 60: database

Claims

An online analytical processing method using two-step queries by SQL parsing and result caching, performed by an analysis processing server that processes a request query for a database requested by a client, the method comprising:
(a) parsing the request query and extracting the column names included in the request query;
(b) generating a query that references the same table as the table referenced by the request query, with the extracted column names as its reference items (hereinafter, the base query), and an extended query that references the result data of the base query to retrieve the result data requested by the request query;
(c) searching the server cache of the server for the result data of the base query;
(d) if the server cache has no result data of the base query, requesting data from the database with the base query and storing the received result data of the base query in the server cache; and
(e) applying the extended query to the result data of the base query to obtain the result data of the extended query, and transmitting the obtained result data to the client.

An online analytical processing method using two-step queries by SQL parsing and result caching, performed by an analysis processing server that processes a request query for a database requested by a client, the method comprising:
(a) parsing the request query and extracting the column names included in the request query;
(b) generating a query that references the same table as the table referenced by the request query, with the extracted column names as its reference items (hereinafter, the base query), and an extended query that references the result data of the base query to retrieve the result data requested by the request query;
(c) searching the server cache of the server for the result data of the base query;
(d) if the server cache has no result data of the base query, requesting data from the database with the request query and transmitting the received result data of the request query to the client; and
(e) requesting data from the database with the base query and storing the received result data of the base query in the server cache,
wherein the server stores the result data of the extended query as a cache file in the server cache, and
the method further comprises: (f) after step (b), if the cache file of the extended query is found in the server cache, transmitting the found cache file to the client.

The method according to claim 1,
wherein the server stores the result data of the extended query as a cache file in the server cache, and
the method further comprises:
(f) after step (b), if the cache file of the extended query is found in the server cache, transmitting the found cache file to the client.

The method according to claim 1 or 2,
wherein in step (a), a unique key capable of identifying the column name is generated, and
in step (b), an alias is defined for the column name in the reference item clause of the base query, and the extended query references the column using the alias.

The method of claim 4,
wherein the unique key is obtained by hashing the database name, the reference table name, and the column name of the corresponding column.

The method according to claim 1 or 2,
wherein in step (b), the base query is composed of a reference item clause, a table reference clause, and a condition clause, and the table reference clause and the condition clause of the base query are generated to have the same structure as the table reference clause and the condition clause of the request query.

The method according to claim 6,
wherein in step (b), the extended query references the base query or the result data of the base query in its table reference clause, and its clauses other than the table reference clause are generated to have the same structure as the corresponding clauses of the request query.

The method of claim 7,
wherein in step (b), when an alias for a table is defined in the request query, the extended query is generated by deleting the alias definition of the table and replacing the alias of the table with the name of the table.

The method according to claim 1 or 2,
wherein the server cache comprises in-memory storage and a cache disk, and
the result data of the base query is stored in the in-memory storage.