TWI860074B - Data processing method and data processing device
- Publication number
- TWI860074B (TW112133169A)
- Authority
- TW
- Taiwan
- Prior art keywords
- environment
- data
- data table
- formal
- quality verification
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Stored Programmes (AREA)
Description
The present invention relates to a data processing method and a data processing device, and more particularly to a data processing method and a data processing device capable of comparing, in real time, differences in data table lineage between different environments.
A data center typically includes data tables and stored procedures. A data table serves as a carrier that stores data. Stored procedures implement the transformation and association logic between data tables and data. When software engineers or developers of the data center add or modify the logical relationships between data tables and data, they must add or modify stored procedures. Because a data center holds a large number of stored procedures whose logic is interdependent, adding or modifying a single stored procedure can trigger far-reaching changes in the lineage of the data center's data tables. The existing verification approach is to sample and compare the data from before and after the stored procedure is modified. This approach is time-consuming and cannot confirm complete correctness. Moreover, many errors are caught only when data users notice anomalies and report them back to the data center's software engineers. Erroneous data exposes a company to potential losses that cannot be assessed, so modifying stored procedures is very risky work for the data center's software engineers. As the number of stored procedures grows, the risk of modification grows proportionally, which in turn slows the company's digital transformation. The existing technology therefore needs improvement.
To solve the above problems, the present invention provides a data processing method and a data processing device that can compare, in real time, differences in data table lineage between different environments.
The present invention provides a data processing method for a data center, comprising: obtaining, from the data center, the stored procedures of a development environment, a data quality verification environment, and a production environment; generating data table lineage graphs of the development environment, the data quality verification environment, and the production environment according to the respective stored procedures of those environments, wherein this step includes: for each of the development environment, the data quality verification environment, and the production environment, parsing each stored procedure to determine a directed graph of that stored procedure; and merging the directed graphs of all stored procedures in each environment to generate a data table lineage graph of that environment, wherein the merging includes, when a source data table of a first stored procedure and a target data table of a second stored procedure are determined to be the same data table, merging a first node representing the source data table of the first stored procedure with a second node representing the target data table of the second stored procedure; comparing the data table lineage graphs of the development environment, the data quality verification environment, and the production environment to generate a comparison result; and executing a notification function according to the comparison result.
The present invention provides a data processing device for a data center, comprising: a storage device for storing instructions; and a processing circuit configured to execute the instructions, wherein the instructions include: obtaining, from the data center, the stored procedures of a development environment, a data quality verification environment, and a production environment; generating data table lineage graphs of the development environment, the data quality verification environment, and the production environment according to the respective stored procedures of those environments, wherein this step includes: for each of the development environment, the data quality verification environment, and the production environment, parsing each stored procedure to determine a directed graph of that stored procedure; and merging the directed graphs of all stored procedures in each environment to generate a data table lineage graph of that environment, wherein the merging includes, when a source data table of a first stored procedure and a target data table of a second stored procedure are determined to be the same data table, merging a first node representing the source data table of the first stored procedure with a second node representing the target data table of the second stored procedure; comparing the data table lineage graphs of the development environment, the data quality verification environment, and the production environment to generate a comparison result; and executing a notification function according to the comparison result.
Embodiments of the present invention can be applied to a data center or a database system. To prevent errors caused by adding or modifying stored procedures, an embodiment may set up a development environment, a data quality verification environment (data quality assurance system environment), and a production environment in the data center. The development environment is used to add or modify stored procedures and to confirm that the modified stored procedures execute normally, with no coding errors. In the development environment, developers or software engineers can load the original data and stored procedures and modify the stored procedures as required. After the modification, the new version of the data and stored procedures in the development environment can be copied to the data quality verification environment. The data quality verification environment lets data users confirm whether the data changes resulting from the modified stored procedures meet the requirements, and data users can run various kinds of tests there to make that determination. When the data users' tests reveal any problem, flaw, or defect, developers or software engineers can address it in the development environment and then copy the corrected data and stored procedures to the data quality verification environment for retesting, repeating until the expected results are obtained or no problems occur. The production environment is the data center environment that is actually in operation. Therefore, once data user testing in the data quality verification environment is complete and the data changes are confirmed to meet the requirements, the data center's developers or software engineers can synchronize the new version of the modified stored procedures to the production environment.
Please refer to FIG. 1, which is a schematic diagram of a process 10 of a data processing method according to an embodiment of the present invention. Process 10 includes the following steps:
Step S100: Start.
Step S102: Obtain the stored procedures of the development environment, the data quality verification environment, and the production environment from the data center.
Step S104: Generate the data table lineage graphs of the development environment, the data quality verification environment, and the production environment according to the respective stored procedures of those environments.
Step S106: Compare the data table lineage graphs of the development environment, the data quality verification environment, and the production environment to generate a comparison result.
Step S108: Execute a notification function according to the comparison result.
Step S110: End.
According to process 10, in step S102 an embodiment obtains from the data center the stored procedures of the development environment, of the data quality verification environment, and of the production environment. Then, in step S104, the data table lineage graphs of the three environments are generated from their respective stored procedures. In one embodiment, the data table lineage graph of the development environment is generated from the stored procedures of the development environment. Each stored procedure of the development environment is parsed to determine its directed graph. For example, a stored procedure parsing engine, such as a SQL parser engine for Python, can parse a stored procedure and produce its directed graph. A stored procedure may include at least one source data table and at least one target data table, only source data tables, or only target data tables. The directed graph of a stored procedure may include nodes representing its source data tables and/or target data tables, and directed edges representing the stored procedure itself, each pointing from a source data table to a target data table. Each stored procedure may correspond to at least one directed graph.
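The parsing step can be illustrated with a minimal Python sketch. It is only a sketch under stated assumptions: the source mentions a SQL parser engine but does not say which one or how it is used, so the regular expressions below are a deliberately simplified stand-in, and the function name `procedure_to_graph`, the SQL body, and its column names are hypothetical. Only the procedure name and the two table names come from the FIG. 2 example.

```python
import re

# Simplified patterns (an assumption, not the patent's parser). A real
# parsing engine must also handle comments, CTEs, quoted identifiers,
# MERGE/UPDATE statements, and dynamic SQL.
TARGET_RE = re.compile(r"\bINSERT\s+INTO\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:FROM|JOIN)\s+([\w.]+)", re.IGNORECASE)


def procedure_to_graph(procedure_name, sql_text):
    """Return (nodes, edges) for one stored procedure.

    nodes: set of table names read or written by the procedure.
    edges: set of (source_table, target_table, procedure_name) tuples,
           each pointing from a source table to a target table.
    """
    targets = {t.upper() for t in TARGET_RE.findall(sql_text)}
    sources = {s.upper() for s in SOURCE_RE.findall(sql_text)} - targets
    nodes = sources | targets
    edges = {(s, t, procedure_name) for s in sources for t in targets}
    return nodes, edges


# Hypothetical procedure body; only the table names follow FIG. 2.
example_sql = """
CREATE PROCEDURE INT_CUSTOMER_SATISFACTION_SP AS
BEGIN
    INSERT INTO TP1_INT_CUSTOMER_SATISFACTION (customer_id, score)
    SELECT customer_id, score
    FROM ODS_CUSTOMER_SATISFACTION;
END
"""

nodes, edges = procedure_to_graph(
    "INT_CUSTOMER_SATISFACTION_SP.STOREDPROCEDURE.SQL", example_sql
)
print(sorted(nodes))
print(sorted(edges))
```

Returning the node set separately from the edge set keeps procedures that only read or only write tables representable, since such procedures contribute nodes but no edges.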
For example, please refer to FIG. 2. As shown in the left half of FIG. 2, the first stored procedure is INT_CUSTOMER_SATISFACTION_SP.STOREDPROCEDURE.SQL. A directed graph of the first stored procedure includes node 202, node 206, and directed edge 204. Node 202 represents a target data table of the first stored procedure, namely the data table TP1_INT_CUSTOMER_SATISFACTION. Node 206 represents a source data table of the first stored procedure, namely the data table ODS_CUSTOMER_SATISFACTION. Directed edge 204 represents the first stored procedure and points from node 206 of the source data table to node 202 of the target data table. Still referring to the left half of FIG. 2, the second stored procedure is ODS_CUSTOMER_SATISFACTION_SP.STOREDPROCEDURE.SQL, and a directed graph of the second stored procedure includes node 208, node 212, and directed edge 210. Node 208 represents a target data table of the second stored procedure, namely the data table ODS_CUSTOMER_SATISFACTION. Node 212 represents a source data table of the second stored procedure, namely the data table TP1_ODS_CUSTOMER_SATISFACTION. Directed edge 210 represents the second stored procedure and points from node 212 of the source data table to node 208 of the target data table.
In step S104, after the directed graphs of all stored procedures of the development environment have been parsed, they are merged to generate a data table lineage graph of the development environment. The data table lineage graph represents the relationships between data tables. For example, the directed graphs of all stored procedures of the development environment are compared with one another. When a source data table of one stored procedure and a target data table of another stored procedure are determined to be the same data table, the node representing that source data table is merged with the node representing that target data table to form the corresponding part of the data table lineage graph. For example, as shown in the left half of FIG. 2, when the source data table of the first stored procedure and the target data table of the second stored procedure are determined to be the same data table (namely the data table ODS_CUSTOMER_SATISFACTION), node 206 and node 208 are merged to form node 214, which joins the two directed graphs into a part of the data table lineage graph. Node 214 then represents both the source data table of the first stored procedure and the target data table of the second stored procedure, and the directed graphs of the first and second stored procedures are connected. By merging in this way, the directed graphs of all stored procedures of the development environment are converted into the data table lineage graph of the development environment.
The data table lineage graph may include at least one source directed subgraph and/or at least one target directed subgraph. A source directed subgraph is a directed subgraph that ends at a given data table. A target directed subgraph is a directed subgraph that starts from a given data table. For example, please refer to FIG. 3, which is a schematic diagram of a source directed subgraph in a data table lineage graph according to an embodiment of the present invention. The middle portion of FIG. 3 shows the source directed subgraph 302 corresponding to the data table INT_CUSTOMER_SATISFACTION. In the source directed subgraph 302, the node representing the data table INT_CUSTOMER_SATISFACTION is the end node, and the subgraph includes the parent nodes related to that data table. Likewise, please refer to FIG. 4, which is a schematic diagram of a target directed subgraph in a data table lineage graph according to an embodiment of the present invention. The middle portion of FIG. 4 shows the target directed subgraph 402 corresponding to the data table INT_CUSTOMER_SATISFACTION. In the target directed subgraph 402, the node representing the data table INT_CUSTOMER_SATISFACTION is the starting node, and the subgraph includes the child nodes related to that data table.
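The merging step and the two kinds of subgraph can be sketched in Python as follows. This is an illustrative sketch, not the patented implementation: it assumes each per-procedure directed graph is given as a set of (source table, target table, procedure name) tuples, and the function names `merge_graphs` and `reachable` are hypothetical. Because nodes are keyed by table name, the merge of nodes 206 and 208 into node 214 happens implicitly.

```python
from collections import defaultdict


def merge_graphs(per_procedure_edges):
    """Merge per-procedure edge sets into one lineage graph.

    Nodes are keyed by table name, so a table that is the source of one
    procedure and the target of another collapses into a single node.
    """
    children = defaultdict(set)  # table -> tables derived from it
    parents = defaultdict(set)   # table -> tables it is derived from
    for edges in per_procedure_edges:
        for source, target, _procedure in edges:
            children[source].add(target)
            parents[target].add(source)
    return children, parents


def reachable(start, neighbours):
    """Collect every table reachable from start by following neighbours."""
    seen, stack = set(), [start]
    while stack:
        table = stack.pop()
        for nxt in neighbours.get(table, ()):
            if nxt not in seen:  # guards against cycles
                seen.add(nxt)
                stack.append(nxt)
    return seen


# The two directed graphs from the FIG. 2 example.
first = {("ODS_CUSTOMER_SATISFACTION", "TP1_INT_CUSTOMER_SATISFACTION",
          "INT_CUSTOMER_SATISFACTION_SP.STOREDPROCEDURE.SQL")}
second = {("TP1_ODS_CUSTOMER_SATISFACTION", "ODS_CUSTOMER_SATISFACTION",
           "ODS_CUSTOMER_SATISFACTION_SP.STOREDPROCEDURE.SQL")}

children, parents = merge_graphs([first, second])

# Source directed subgraph: everything upstream of a table.
upstream = reachable("TP1_INT_CUSTOMER_SATISFACTION", parents)
# Target directed subgraph: everything downstream of a table.
downstream = reachable("TP1_ODS_CUSTOMER_SATISFACTION", children)
print(sorted(upstream))
print(sorted(downstream))
```

Keeping both a `children` and a `parents` index is a design choice for this sketch: it lets the source subgraph (walk `parents`) and the target subgraph (walk `children`) share one traversal routine.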
Similarly, following the approach used for the development environment, the data table lineage graph of the data quality verification environment is generated from the stored procedures of the data quality verification environment: each stored procedure is parsed to determine its directed graph, and the directed graphs of all stored procedures of that environment are then merged into its data table lineage graph. The data table lineage graph of the production environment is generated in the same way: each stored procedure of the production environment is parsed to determine its directed graph, and the directed graphs of all stored procedures of the production environment are then merged into its data table lineage graph.
In one embodiment, after the data table lineage graphs of the development environment, the data quality verification environment, and the production environment have been generated, an input menu is generated for the user to make selections, such as the input menus 304 and 404 shown in FIG. 3 and FIG. 4. Through the input menu, the user selects fields such as the environment type, the data table type (schema), and the data table name. For example, the user can click the item "DEV" in the environment type field to select the development environment, click the item "INT" in the data table type field to select INT data tables, and click the item "INT_CUSTOMER_SATISFACTION" in the data table name field to select the data table INT_CUSTOMER_SATISFACTION. As shown in FIG. 3 and FIG. 4, this embodiment visually displays the input menu so that the user can view it and enter the desired items, and visually displays the content of the directed subgraph of the selected data table together with the corresponding stored procedures, source data tables, target data tables, and related information. In this way, users such as the data center's administrators or engineers can quickly and clearly grasp the structural relationships among stored procedures and data in the corresponding environment.
In step S106, the data table lineage graphs of the development environment, the data quality verification environment, and the production environment are compared to generate a comparison result. The comparison may cover at least two of the three environments, or any two of them: for example, the development environment against the data quality verification environment, the data quality verification environment against the production environment, or the development environment against the production environment. Specific data tables can also be selected for comparison. For example, for any two of the three environments, the source directed subgraph or target directed subgraph corresponding to at least one data table in their data table lineage graphs is compared to generate the comparison result. Common data table types include DM data tables, INT data tables, ODS data tables, and STAGE data tables. A data table whose name has the prefix DM, such as the data table DM_XXXX, is called a DM data table; a data table whose name has the prefix INT is called an INT data table, and so on. DM data tables usually sit at the end of the data table lineage, so if two environments are inconsistent anywhere upstream, comparing at the DM data tables reveals the inconsistency quickly and easily. In one embodiment, the source directed subgraphs corresponding to the DM data tables in the data table lineage graphs of any two of the development environment, the data quality verification environment, and the production environment are compared to generate the comparison result.
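The comparison restricted to DM data tables can be sketched as follows. The sketch assumes each environment's lineage graph is available as a set of (source table, target table, procedure name) edges. The function names, the result layout, and the `*_SALES` tables and procedures are hypothetical and chosen only for illustration; only the `DM_`, `INT_`, `ODS_`, and `STAGE_` prefixes come from the text.

```python
def source_subgraph(edges, end_table):
    """Return the edges of the source directed subgraph ending at end_table."""
    selected, frontier, visited = set(), [end_table], {end_table}
    while frontier:
        table = frontier.pop()
        for edge in edges:
            source, target, _procedure = edge
            if target == table:
                selected.add(edge)
                if source not in visited:
                    visited.add(source)
                    frontier.append(source)
    return selected


def compare_environments(edges_a, edges_b, prefix="DM_"):
    """Compare the source subgraphs of every table whose name has prefix.

    Returns {table: {"only_in_a": edges, "only_in_b": edges}} for each
    table whose upstream lineage differs; an empty dict means no difference.
    """
    tables = {t for _s, t, _p in edges_a | edges_b if t.startswith(prefix)}
    result = {}
    for table in sorted(tables):
        sub_a = source_subgraph(edges_a, table)
        sub_b = source_subgraph(edges_b, table)
        if sub_a != sub_b:
            result[table] = {"only_in_a": sub_a - sub_b,
                             "only_in_b": sub_b - sub_a}
    return result


# Hypothetical lineage: in the second environment INT_SALES_SP reads
# STAGE_SALES directly instead of going through ODS_SALES.
dev = {("STAGE_SALES", "ODS_SALES", "ODS_SALES_SP"),
       ("ODS_SALES", "INT_SALES", "INT_SALES_SP"),
       ("INT_SALES", "DM_SALES", "DM_SALES_SP")}
prod = {("STAGE_SALES", "INT_SALES", "INT_SALES_SP"),
        ("INT_SALES", "DM_SALES", "DM_SALES_SP")}

comparison = compare_environments(dev, prod)
print(comparison)
```

Because the DM table is the end point, the mismatch two levels upstream still shows up in its source subgraph, which is the reason the text gives for comparing at DM data tables.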
In step S108, a notification function is executed according to the comparison result of step S106. When the comparison result shows that the data table lineage graphs of any two environments differ, an embodiment generates and sends a notification signal to alert the data center's administrators or engineers. For example, FIG. 5 shows the source directed subgraph corresponding to the data table INT_CUSTOMER_X in the data table lineage graphs of the development environment and of the data quality verification environment, and FIG. 6 shows the source directed subgraph corresponding to the data table INT_CUSTOMER_X in the data table lineage graphs of the data quality verification environment and of the production environment. As shown in FIG. 5 and FIG. 6, the data table INT_CUSTOMER_X is the end point of the source directed subgraph in all three environments. The data quality verification environment and the development environment have the same source directed subgraph, whereas the source directed subgraph of the production environment differs from those of the development environment and the data quality verification environment. As shown in FIG. 5, when the software engineer selects the development environment and the data quality verification environment for comparison in step S106, the comparison result shows that the source directed subgraphs corresponding to the data table INT_CUSTOMER_X are identical in the two environments. As shown in FIG. 6, when the software engineer selects the production environment and the data quality verification environment for comparison in step S106, the comparison result shows that the source directed subgraphs corresponding to the data table INT_CUSTOMER_X differ between the two environments. Specifically, the source data table of the stored procedure INT_CUSTOMER_X_SP.STOREDPROCEDURE.SQL at level 0 is inconsistent: in the production environment it is the data table ODS_CUSTOMER_X, whereas in the data quality verification environment it is the data table TP1_INT_CUSTOMER_X. This indicates that the operation that synchronizes the data and stored procedures of the data quality verification environment to the production environment may have failed or may not yet have been executed. Because the comparison result shows a difference between the source directed subgraphs of the production environment and the data quality verification environment, a notification signal is generated and sent to the data center's administrators or engineers. The administrators or engineers can then readily and immediately determine from the notification signal whether an error exists.
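The notification step for the FIG. 6 situation can be sketched as below. The sketch assumes lineage graphs are sets of (source table, target table, procedure name) edges. The function `notify_on_difference`, the message wording, and the `send` callback are hypothetical: the source does not specify the form of the notification signal, so the callback stands in for whatever channel (e-mail, chat message, dashboard alert) a deployment would use.

```python
def notify_on_difference(name_a, edges_a, name_b, edges_b, send):
    """Call send(message) if the two edge sets differ; return True if sent."""
    only_a = sorted(edges_a - edges_b)
    only_b = sorted(edges_b - edges_a)
    if not only_a and not only_b:
        return False  # identical lineage, nothing to report
    lines = [f"Lineage mismatch between {name_a} and {name_b}:"]
    for source, target, procedure in only_a:
        lines.append(f"  only in {name_a}: {source} -> {target} ({procedure})")
    for source, target, procedure in only_b:
        lines.append(f"  only in {name_b}: {source} -> {target} ({procedure})")
    send("\n".join(lines))
    return True


PROCEDURE = "INT_CUSTOMER_X_SP.STOREDPROCEDURE.SQL"
# Level-0 edges of the source subgraph of INT_CUSTOMER_X, per FIG. 5 and FIG. 6.
dev = {("TP1_INT_CUSTOMER_X", "INT_CUSTOMER_X", PROCEDURE)}
qa = {("TP1_INT_CUSTOMER_X", "INT_CUSTOMER_X", PROCEDURE)}
prod = {("ODS_CUSTOMER_X", "INT_CUSTOMER_X", PROCEDURE)}

sent = []  # stand-in for a real notification channel
dev_vs_qa = notify_on_difference("DEV", dev, "QA", qa, sent.append)
qa_vs_prod = notify_on_difference("QA", qa, "PROD", prod, sent.append)
for message in sent:
    print(message)
```

Passing the channel in as a callback keeps the comparison logic independent of how the administrators or engineers are actually reached.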
因此,在應用服務開發過程中,當軟體工程師在開發環境中增修預存程序時可透過比較開發環境的資料表血緣關係圖與資料品質驗證環境的資料表血緣關係圖以判斷出兩者之間是否有差異,藉以確認增修預存程序之後的整體系統正確性且能大幅提升應用服務開發的時效性。在應用服務開發過程中,當經增修的預存程序被同步更新至正式環境後,軟體工程師可以透過比較正式環境與資料品質驗證環境的資料表血緣關係圖以判斷出兩者之間是否有差異,以確認同步更新正式環境的作業是否被正確執行。當經增修的預存程序在同步更新至正式環境發生錯誤時,本發明實施例將能迅速找出受影響之預存程序。在一實施例中,可提供設定一監控週期,例如每天、每周或每隔一特定時間,使得資料中心每隔一監控週期執行流程10的步驟來比較在開發環境、資料品質驗證環境以及正式環境之資料表血緣關係圖,並且將比較結果回傳至資料中心,進而實現一個自動化測試與通知機制給資料中心的軟體工程師並能避免人員手動輸入錯誤的情況發生而達到完全測試自動化,同時自動化的測試也將大幅地提升生產測試流程效率。Therefore, during the application service development process, when software engineers modify stored procedures in the development environment, they can compare the table lineage relationship diagram of the development environment with the table lineage relationship diagram of the data quality verification environment to determine whether there is a difference between the two, so as to confirm the correctness of the overall system after the stored procedures are modified and greatly improve the timeliness of application service development. During the application service development process, when the modified stored procedures are synchronized to the formal environment, software engineers can compare the table lineage relationship diagram of the formal environment with the data quality verification environment to determine whether there is a difference between the two, so as to confirm whether the synchronization update of the formal environment is performed correctly. When an error occurs in the stored program that has been modified during synchronization with the formal environment, the embodiment of the present invention will be able to quickly find the affected stored program. 
In one embodiment, a monitoring cycle can be set, for example every day, every week, or at some other specified interval, so that the data center executes the steps of process 10 once per monitoring cycle to compare the data table lineage relationship graphs of the development environment, the data quality verification environment, and the formal environment, and the comparison results are sent back to the data center. This provides the software engineers of the data center with an automated testing and notification mechanism, avoids manual input errors, and achieves fully automated testing. Automated testing also greatly improves the efficiency of the production test process.
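The monitoring cycle could be sketched roughly as follows in Python. The sketch assumes a caller-supplied `load_graph` function that returns the lineage graph of a named environment as a set of edges, and a `notify` callback that delivers a message. Both are hypothetical placeholders, as are the environment names and the choice to compare adjacent environments pairwise. The steps of process 10 themselves are not reproduced here.

```python
import time
from typing import Callable

Edge = tuple[str, str]  # (source_table, target_table), an assumed layout

# Assumed comparison order: development vs. verification, then verification vs. formal.
ENV_PAIRS = [("development", "verification"), ("verification", "formal")]


def run_cycle(load_graph: Callable[[str], set[Edge]],
              notify: Callable[[str], None]) -> int:
    """Compare each environment pair once; return how many pairs differ."""
    graphs: dict[str, set[Edge]] = {}
    differing_pairs = 0
    for left, right in ENV_PAIRS:
        for env in (left, right):
            if env not in graphs:
                graphs[env] = load_graph(env)
        diff = graphs[left] ^ graphs[right]  # edges present in only one graph
        if diff:
            differing_pairs += 1
            notify(f"{left} vs {right}: {len(diff)} differing edge(s): {sorted(diff)}")
    return differing_pairs


def monitor(load_graph: Callable[[str], set[Edge]],
            notify: Callable[[str], None],
            interval_seconds: float,
            cycles: int,
            sleep: Callable[[float], None] = time.sleep) -> None:
    """Run `cycles` comparisons, waiting `interval_seconds` between them."""
    for i in range(cycles):
        run_cycle(load_graph, notify)
        if i < cycles - 1:
            sleep(interval_seconds)


# Made-up in-memory graphs standing in for the three environments.
example_graphs = {
    "development": {("a", "b"), ("b", "c")},
    "verification": {("a", "b"), ("b", "c")},
    "formal": {("a", "b")},
}
monitor(example_graphs.__getitem__, print, interval_seconds=86400, cycles=1)
```

Passing `sleep` as a parameter keeps the loop testable without waiting. In practice the period would more likely be driven by an external scheduler than by a long-running loop.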
A person of ordinary skill in the art may combine, modify, or vary the embodiments described above in accordance with the spirit of the present invention, which is not limited thereto. All of the above statements, steps, and/or processes (including suggested steps) may be implemented through hardware, software, firmware (i.e., a combination of a hardware device and computer instructions, where the data in the hardware device is read-only software data), an electronic system, or a combination thereof. The hardware may include analog, digital, and mixed circuits (i.e., microcircuits, microchips, or silicon chips). For example, the hardware may be an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a programmable logic device, coupled hardware components, or a combination thereof. In other embodiments, the hardware may include a general-purpose processor, a microprocessor, a controller, a digital signal processor (DSP), or a combination thereof.
The software may be a combination of program code, a combination of instructions, and/or a combination of functions stored in a storage device, such as a computer-readable recording medium or a non-transitory computer-readable medium. For example, the computer-readable recording medium may include read-only memory (ROM), flash memory, random access memory (RAM), a subscriber identity module (SIM), a hard disk, a floppy disk, or an optical disc read-only memory (CD-ROM/DVD-ROM/BD-ROM), but is not limited thereto. An embodiment of the present invention may include a data processing device applied to a data center, the data processing device including a processing circuit and a storage device. The process steps and embodiments of the present invention may be compiled into program code or instructions and stored in the storage device of the data processing device. The processing circuit of the data processing device can read and execute the program code or instructions stored in the storage device to implement all of the aforementioned steps and functions.
In summary, the data processing method of the embodiment of the present invention can obtain the data table lineage relationship graph of each environment. When developing application services for the data center, a software engineer can instantly compare the differences between the data table lineage relationships of different environments, which effectively improves system correctness and timeliness when stored procedures are modified. It can also confirm whether the modified stored procedures have actually been synchronized to the formal environment, thereby effectively improving and optimizing the speed of the company's digital transformation. The above is only a preferred embodiment of the present invention. All equivalent changes and modifications made within the scope of the claims of the present invention shall fall within the scope of the present invention.
10: Process
202, 206, 208, 212, 214: Nodes
204, 210: Directed edges
302: Source directed subgraph
304, 404: Input menu
402: Target directed subgraph
S100, S102, S104, S106, S108, S110: Steps
FIG. 1 is a flowchart of the data processing method of the embodiment of the present invention.
FIG. 2 is a schematic diagram of the directed graph of a stored procedure of the embodiment of the present invention.
FIG. 3 is a schematic diagram of the source directed subgraph in the data table lineage relationship graph of the embodiment of the present invention.
FIG. 4 is a schematic diagram of the target directed subgraph in the data table lineage relationship graph of the embodiment of the present invention.
FIG. 5 is a schematic diagram comparing the data table lineage relationship graphs of the development environment and the data quality verification environment of the embodiment of the present invention.
FIG. 6 is a schematic diagram comparing the data table lineage relationship graphs of the data quality verification environment and the formal environment of the embodiment of the present invention.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW112133169A TWI860074B (en) | 2023-09-01 | 2023-09-01 | Data processing method and data processing device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW112133169A TWI860074B (en) | 2023-09-01 | 2023-09-01 | Data processing method and data processing device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TWI860074B (en) | 2024-10-21 |
| TW202511947A (en) | 2025-03-16 |
Family
ID=94084090
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW112133169A TWI860074B (en) | 2023-09-01 | 2023-09-01 | Data processing method and data processing device |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWI860074B (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TWI238329B (en) * | 2002-09-11 | 2005-08-21 | Ibm | Methods and apparatus for root cause identification and problem determination in distributed systems |
| US20170351991A1 (en) * | 2016-06-07 | 2017-12-07 | International Business Machines Corporation | Detecting potential root causes of data quality issues using data lineage graphs |
| TW201812607A (en) * | 2016-09-12 | 2018-04-01 | 美商伊洛米歐公司 | Representation of servers that effectively summarize information in a decentralized network information management system |
| TW202319943A (en) * | 2021-11-04 | 2023-05-16 | 美商萬國商業機器公司 | Compliance risk management for data in computing systems |
- 2023-09-01 TW TW112133169A patent/TWI860074B/en active
Also Published As
| Publication number | Publication date |
|---|---|
| TW202511947A (en) | 2025-03-16 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| EP4081903A1 (en) | Unit testing of components of dataflow graphs | |
| US20170228220A1 (en) | Self-healing automated script-testing tool | |
| CN109101410B (en) | Risk drive testing method and device and computer readable storage medium | |
| CN109977012B (en) | Joint debugging test method, device, equipment and computer readable storage medium of system | |
| CN116089258B (en) | Data migration test method, device, equipment, storage medium and program product | |
| US10558557B2 (en) | Computer system testing | |
| CN111581183A (en) | Data migration method and device based on data model | |
| CN106708897B (en) | Data warehouse quality guarantee method, device and system | |
| US8145988B2 (en) | Command line testing | |
| TWI860074B (en) | Data processing method and data processing device | |
| CN116866242A (en) | A switch regression testing method, equipment and media | |
| WO2021022702A1 (en) | Log insertion method and apparatus, computer apparatus and storage medium | |
| US20250307049A1 (en) | Systems and methods for determining errors during execution of multiple applications | |
| TWI862303B (en) | Data processing method and data processing device | |
| JP6169302B2 (en) | Specification configuration apparatus and method | |
| CN112068842A (en) | Dependency relationship establishing method, linkage compiling method and system | |
| US20250068149A1 (en) | Data processing method and data processing device | |
| CN117667884A (en) | Data migration method, device, equipment and storage medium | |
| CN114358889B (en) | Business order processing method, device, computer equipment and medium | |
| US12339831B2 (en) | Data processing method and data processing device | |
| WO2024016729A1 (en) | Visualization method and apparatus for call conflict | |
| US20160292210A1 (en) | System and method for automatically and efficiently validating database objects | |
| JP2023042058A (en) | Information processing apparatus and information processing method | |
| CN113626332A (en) | Debugging method, device, equipment, storage medium and computer program product | |
| CN111858329A (en) | A target public information model interface testing method and device |