[go: up one dir, main page]

TWI506456B - System and method for dispatching hadoop jobs in multi-cluster environment - Google Patents

System and method for dispatching hadoop jobs in multi-cluster environment Download PDF

Info

Publication number
TWI506456B
TWI506456B TW102118168A TW102118168A TWI506456B TW I506456 B TWI506456 B TW I506456B TW 102118168 A TW102118168 A TW 102118168A TW 102118168 A TW102118168 A TW 102118168A TW I506456 B TWI506456 B TW I506456B
Authority
TW
Taiwan
Prior art keywords
cluster
feature
work
matrix equation
module
Prior art date
Application number
TW102118168A
Other languages
Chinese (zh)
Other versions
TW201445332A (en
Inventor
Chun Hsiang Huang
Wei Ting Lin
Hsiu Min Lin
jing ying Huang
Ching Tang Tsai
Original Assignee
Chunghwa Telecom Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chunghwa Telecom Co Ltd filed Critical Chunghwa Telecom Co Ltd
Priority to TW102118168A priority Critical patent/TWI506456B/en
Publication of TW201445332A publication Critical patent/TW201445332A/en
Application granted granted Critical
Publication of TWI506456B publication Critical patent/TWI506456B/en

Links

Landscapes

  • Debugging And Monitoring (AREA)
  • Multi Processors (AREA)

Description

基於Hadoop多叢集環境的工作分派系統及方法Work distribution system and method based on Hadoop multi-cluster environment

本發明係關於一種基於Hadoop多叢集環境的工作分派系統及方法,特別係指一種結合叢集特徵分析、工作特徵分析,對叢集與運算任務進行特徵的分析判斷,並透過執行環境的選擇進行運算任務的派送,以達成簡便且效率高之服務運算資源之調配。The invention relates to a work assignment system and method based on Hadoop multi-cluster environment, in particular to a combination cluster feature analysis, work feature analysis, analysis and judgment of cluster and operation tasks, and computing tasks through execution environment selection. Delivery to achieve a simple and efficient deployment of service computing resources.

近年來因為大量的資訊化,使得一般企業與政府機構面對的是爆炸性成長的資料量,無論是在資料儲存、資料庫或資料檢索與資料探勘的領域中,都遭遇相同的問題,資料過濾與整理的龐大且耗時的工作,已無法由一台超級電腦負荷,轉而導向透過大量的群組電腦同時進行運算,進而獲得最大的效益。現今的資訊領域採用雲端服務的技術提供分散式運算來解決上述的問題,其中又以Apache Hadoop為主要的開放原始碼解決方案之一。In recent years, due to the large amount of informationization, the average enterprise and government agencies are facing an explosive growth in data volume. In the fields of data storage, database or data retrieval and data exploration, the same problem is encountered. The huge and time-consuming work that has been done is no longer burdened by a supercomputer, but instead is directed to computing through a large number of group computers, so as to obtain maximum benefits. Today's information field uses cloud-based services to provide decentralized computing to solve the above problems, including Apache Hadoop as one of the main open source solutions.

Hadoop實做出一個分散式運算的處理框架概念稱為MapReduce,透過將對數據進行的運算工作分發給網路上的每個節點處理,每個節點會周期性的把完成的工作和狀態的更新報告回來,進而達成大規模的數據運算分析。在此處理 框架之下,工作的排程與分派預設為FIFO(First In First Out)演算法,雖然架構上簡單,卻因此忽略運算工作本質上需求的差異,可能造成某項工作長期占用資源的情況。此外,系統參數的調校是否能與運算工作本質上的需求相符合,也是另一項在Hadoop系統當中相當重要的因素,但是若需要滿足此項條件,使用者往往需要針對不同的運算工作重新設定整體系統環境參數,以便讓整體系統的效能與運作可以配合運算工作的需求。由此可見,上述習用的方法仍有諸多缺失,實非一良善之設計者,而亟待加以改良。如何大量的群組電腦同時進行運算,卻又能針對運算工作進行最佳化,本案發明人鑑於上述習用方法所衍生的各項缺點,乃亟思加以改良創新,並經多年苦心孤詣潛心研究後,終於成功研發完成本件基於Hadoop多叢集環境的工作分派之系統。Hadoop actually makes a decentralized operation processing framework called MapReduce. By distributing the data processing work to each node on the network, each node periodically reports the completed work and status updates. Come back, and then achieve a large-scale data operation analysis. Processing here Under the framework, the scheduling and assignment of work is preset to the FIFO (First In First Out) algorithm. Although the architecture is simple, it ignores the difference in the essential requirements of the computing work, which may cause a certain task to occupy resources for a long time. In addition, whether the adjustment of the system parameters can be consistent with the essential requirements of the computing work is another important factor in the Hadoop system, but if this condition is met, the user often needs to re-work for different operations. Set the overall system environment parameters so that the overall system performance and operation can match the needs of computing work. It can be seen that there are still many shortcomings in the above-mentioned methods, which are not a good designer, but need to be improved. How can a large number of group computers perform calculations at the same time, but they can optimize the computing work. The inventors of the present invention have improved and innovated in view of the shortcomings derived from the above-mentioned conventional methods, and after years of painstaking research. Finally, I successfully developed this system based on Hadoop multi-cluster environment.

本發明之目的即在於提供一種裝置與系統,特別是應用在多個大量資料處理的分散式電腦叢集,能夠根據執行程式特徵,待處理資料特性,與電腦叢集的動態行為,選擇最佳的執行環境。可以降低不同運算特性的工作之排程等待時間,有效的加快運算分析的速度,並提升整體資源使用效率。The object of the present invention is to provide an apparatus and system, in particular, a distributed computer cluster applied to a plurality of large-scale data processing, which can select the best execution according to the characteristics of the execution program, the characteristics of the data to be processed, and the dynamic behavior of the computer cluster. surroundings. It can reduce the scheduling waiting time of work with different computing characteristics, effectively speed up the calculation and analysis, and improve the overall resource utilization efficiency.

可達成上述發明目的之基於Hadoop多叢集環境的工作分派系統及方法,係利用一組叢集特徵與監控模組、工作資料與程式分析模組以及執行環境選擇模組的結合,提供最佳化的Hadoop多叢集環境工作分派系統給使用者執行大資料運算服務。其方法係透過掌握叢集特徵、監控叢集運作情 形、分析運算資料特性與程式運算特性等影響參數,進而運算比對找出最合適的叢集,再透過執行環境選擇模組找到對應的叢集,並將用戶工作,包含用戶程式與輸入資料派送到對應的叢集中執行。A work distribution system and method based on the Hadoop multi-cluster environment that achieves the above objects, and utilizes a combination of a cluster feature and a monitoring module, a work data and a program analysis module, and an execution environment selection module to provide optimization. The Hadoop multi-cluster environment work dispatching system performs large data computing services for users. The method is to grasp the cluster characteristics and monitor the cluster operation. Shape, analysis of the operating data characteristics and program operation characteristics and other influencing parameters, and then the operation to find the most appropriate cluster, and then through the execution environment selection module to find the corresponding cluster, and the user work, including user program and input data is sent to The corresponding cluster is executed.

1‧‧‧工作分派系統1‧‧‧Work assignment system

11‧‧‧特徵資料庫模組11‧‧‧Characteristic database module

12‧‧‧叢集特徵模組12‧‧‧ Cluster Feature Module

13‧‧‧叢集監控模組13‧‧‧ Cluster Monitoring Module

14‧‧‧工作資料分析模組14‧‧‧Work data analysis module

15‧‧‧工作程式分析模組15‧‧‧Working program analysis module

16‧‧‧執行環境選擇模組16‧‧‧Execution Environment Selection Module

2‧‧‧用戶操作介面2‧‧‧User interface

3‧‧‧客戶程式3‧‧‧Client

4‧‧‧輸入資料4‧‧‧ Input data

5‧‧‧迷你叢集5‧‧‧Mini Cluster

6‧‧‧主機叢集6‧‧‧ Host cluster

第1圖為本發明之基於Hadoop多叢集環境的工作分派系統架構圖;第2圖為本發明基於Hadoop多叢集環境的工作分派系統之運作流程圖;第3圖為本發明基於Hadoop多叢集環境的工作分派系統之執行環境選擇流程圖。1 is a schematic diagram of a work distribution system based on a Hadoop multi-cluster environment of the present invention; FIG. 2 is a flowchart of operation of a work distribution system based on a Hadoop multi-cluster environment; FIG. 3 is a Hadoop multi-cluster environment according to the present invention; The execution environment selection flow chart of the work distribution system.

如第1圖所示,為本發明基於Hadoop多叢集環境的工作分派系統及方法之一種實施範例的架構示意圖,包括有:一特徵資料庫模組11,係用以儲存後述叢集特徵模組12、叢集監控模組13、工作資料分析模組14、工作程式分析模組15的矩陣方程式;一叢集特徵模組12,係用以收集叢集中不會隨著時間改變的靜態特徵,並以叢集靜態特徵矩陣方程式來描述其收集到的靜態特徵;一叢集監控模組13,用以定期收集每個叢集的動態特徵,並分析動態特徵曲線,以建立叢集動態特徵矩陣方程式來描述叢集特徵分析結果; 一工作資料分析模組14,係用以收集工作執行中不會隨著時間改變的靜態特徵,並以工作靜態特徵矩陣方程式來描述其收集到的靜態特徵;一工作程式分析模組15,係用以分析用戶程式在執行時使用資源的情形,主要用以建立工作動態特徵矩陣方程式來描述用戶程式行為特徵;一執行環境選擇模組16,係用以由工作程式分析模組15與叢集特徵模組12建立的矩陣方程式中選出最適合用戶工作之叢集,並將其送往對應的叢集;本發明基於Hadoop多叢集環境的工作分派系統運作流程如第2圖所示,客戶將其工作(包含客戶程式3與輸入資料4)透過用戶操作介面2送至Hadoop多叢集環境的工作分派系統1,工作分派系統1由客戶工作特性與各叢集6特性找出最適合之叢集在將其送往此叢集執行,工作分派系統1中各個模組之說明如下。FIG. 1 is a schematic structural diagram of an implementation example of a work distribution system and method based on a Hadoop multi-cluster environment according to the present invention, including: a feature database module 11 for storing a cluster feature module 12 described later. , the cluster monitoring module 13, the work data analysis module 14, the work program analysis module 15 matrix equation; a cluster feature module 12, is used to collect static features that the cluster does not change over time, and clustered The static feature matrix equation is used to describe the collected static features; a cluster monitoring module 13 is used to periodically collect the dynamic features of each cluster and analyze the dynamic characteristic curves to establish a cluster dynamic feature matrix equation to describe the cluster feature analysis results. ; A work data analysis module 14 is configured to collect static features that do not change over time during work execution, and describe the collected static features by working static feature matrix equations; a work program analysis module 15, The method for analyzing the use of resources by the user program during execution is mainly used to establish a working dynamic characteristic matrix equation to describe the user program behavior feature; an execution environment selection module 16 is configured to analyze the module 15 and the cluster feature by the working program. The cluster equation established by the module 12 selects the cluster most suitable for the user's work and sends it to the corresponding cluster; the working process of the work distribution system based on the Hadoop multi-cluster environment of the present invention is shown in FIG. 2, and the customer works ( The work distribution system 1 including the client program 3 and the input data 4) is sent to the Hadoop multi-cluster environment through the user operation interface 2, and the work distribution system 1 finds the most suitable cluster by the customer work characteristics and the cluster 6 characteristics, and sends it to the cluster. This cluster is executed, and the description of each module in the work distribution system 1 is as follows.

首先,叢集監控模組13定期收集每個叢集的動態特徵(例如CPU頻率(GHz)、Disk空間、Memory的使用量),並針對動態特徵曲線進行分析,將分析結果轉換成叢集動態特徵矩陣方程式,再儲存在特徵資料庫模組11。舉例來說,定期收集N個叢集(C_1…C_N)的n個動態特徵,如每秒CPU頻率(GHz)的使用量(%)、Disk空間的使用量(%)等,並以矩陣表示: First, the cluster monitoring module 13 periodically collects the dynamic characteristics of each cluster (such as CPU frequency (GHz), Disk space, Memory usage), and analyzes the dynamic characteristic curve to convert the analysis result into a cluster dynamic characteristic matrix equation. And stored in the feature database module 11. For example, periodically collect n dynamic features of N clusters (C_1...C_N), such as CPU usage (%) per second (%), usage (%) of Disk space, etc., and represent them in a matrix:

每個叢集各取時間間隔(t 1 ~t k ),其中k為間隔總數,計算出每個時間間隔的平均使用量,並以n×k 矩陣表示: Each cluster takes a time interval ( t 1 ~ t k ), where k is the total number of intervals, and the average usage of each time interval is calculated and expressed in an n × k matrix:

再將每個叢集在各時間間隔的1×n 矩陣儲存到特徵資料庫模組11。Each cluster is stored in the feature database module 11 at a 1×n matrix of each time interval.

而叢集特徵模組12主要負責收集叢集中不會隨著時間改變的靜態特徵,例如CPU核心數、CPU頻率(GHz)、Disk空間大小、Memory大小等規格,並將收集到的資料轉換成叢集靜態特徵矩陣方程式儲存在特徵資料庫模組11中,第i個叢集的矩陣方程式以1xn矩陣表示: The cluster feature module 12 is mainly responsible for collecting static features that are not changed over time in the cluster, such as CPU core number, CPU frequency (GHz), Disk space size, Memory size, etc., and converting the collected data into clusters. The static feature matrix equation is stored in the feature database module 11, and the matrix equation of the i-th cluster is represented by a 1xn matrix:

例如: E.g:

當有新叢集加入系統時,叢集特徵模組12會收集其靜態特徵,並同樣儲存在特徵資料庫模組11。When a new cluster is added to the system, the cluster feature module 12 collects its static features and also stores them in the feature database module 11.

工作資料分析模組14收集工作執行中不會隨著時間改變的靜態特徵,例如總資料量大小、總資料筆數、資料格式型態、是否壓縮等,描述工作的靜態特徵。當有新工作進入工作分派系統時,工作資料分析模組14會收集其靜態特徵,並將收集到的資料轉換成工作靜態特徵矩陣方程式儲存在特徵資料庫模組11中,工作靜態特徵矩陣方程式以1xn矩 陣表示:[Js 1 Js 2 Js 3 …Js n ] (6) The work profile analysis module 14 collects static features that do not change over time during the execution of the work, such as the total amount of data, the total number of data, the format of the data format, whether it is compressed, etc., describing the static characteristics of the work. When a new job enters the work distribution system, the work data analysis module 14 collects its static features, and converts the collected data into a working static feature matrix equation stored in the feature database module 11, working static characteristic matrix equation Expressed in a 1xn matrix: [Js 1 Js 2 Js 3 ... Js n ] (6)

例如:[總資料筆數] 1×n (7) For example: [Total data] 1×n (7)

工作程式分析模組15,係用以分析客戶程式3在執行時使用資源的情形,係工作特徵分析模組中的一子模組,主要用以建立矩陣方程式來描述客戶程式3行為特徵;用戶於用戶操作介面2提交客戶程式3與輸入資料4的存放路徑後,工作程式分析模組15從輸入資料4擷取固定筆數的資料當作樣本,將客戶程式3與輸入資料樣本上載至迷你叢集5,要求迷你叢集5啟動程式開始處理輸入資料樣本,記錄客戶程式3在處理固定筆數資料樣本時使用迷你叢集5資源的情形(例如中央處理器、記憶體、檔案讀取與寫入要求、網路封包讀取與寫入要求)與花費時間,並將收集到的資料轉換成工作動態特徵矩陣方程式儲存在特徵資料庫模組11中,工作動態特徵矩陣方程式以1xn矩陣表示,其中Jd n 為第n個工作動態特徵參數:[Jd 1 Jd 2 Jd 3 …Jd n ] (8) The work program analysis module 15 is used to analyze the situation in which the client 3 uses resources during execution, and is a sub-module in the work feature analysis module, which is mainly used to establish a matrix equation to describe the behavior characteristics of the client 3; After the user operation interface 2 submits the storage path of the client 3 and the input data 4, the work program analysis module 15 takes a fixed number of data from the input data 4 as a sample, and uploads the client 3 and the input data sample to the mini. Cluster 5, requires the Mini Cluster 5 startup program to start processing input data samples, and record the case where the client 3 uses the Mini Cluster 5 resource when processing a fixed number of data samples (eg, CPU, memory, file read and write requirements). , network packet read and write requirements) and time spent, and the collected data is converted into a working dynamic characteristic matrix equation stored in the feature database module 11, the working dynamic characteristic matrix equation is represented by a 1xn matrix, where Jd n is the nth working dynamic characteristic parameter: [Jd 1 Jd 2 Jd 3 ... Jd n ] (8)

例如: E.g:

執行環境選擇模組16之運作流程如第3圖所示,首先從特徵儲存庫模組11取得叢集監控模組13、叢集特徵模組12、工作資料分析模組14與工作程式分析模組15分析之 叢集靜態特徵矩陣方程式、叢集動態特徵矩陣方程式、工作靜態特徵矩陣方程式[Js 1 Js 2 Js 3 …Js n ] 與工作動態特徵矩陣方程式[Jd 1 Jd 2 Jd 3 …Jd n ] 並計算出用戶程式特徵矩陣方程式與對應各叢集的叢集特徵矩陣方程式如第(10)與(11)式所示: The operation flow of the execution environment selection module 16 is as shown in FIG. 3 . First, the cluster monitoring module 13 , the cluster feature module 12 , the work data analysis module 14 and the work program analysis module 15 are obtained from the feature repository module 11 . Analysis cluster static characteristic matrix equation Cluster dynamic characteristic matrix equation The working static characteristic matrix equation [Js 1 Js 2 Js 3 ... Js n ] and the working dynamic characteristic matrix equation [Jd 1 Jd 2 Jd 3 ... Jd n ] and calculate the user program characteristic matrix equation and the cluster feature matrix corresponding to each cluster The equation is as shown in equations (10) and (11):

其中F1 job 代表用戶程式特徵矩陣方程式中的第一個特徵,其值為工作靜態特徵矩陣方程式第一項Js 1 與工作動態特徵矩陣方程式第一項Jd 1 相乘之結果,後面依此類推,共有n個特徵值,而則是代表第i個叢集的叢集特徵矩陣方程式的第一個特徵,同樣也有n個特徵值,由於叢集動態特徵矩陣方程式的值為當時叢集的平均使用率,而我們分析需要的為叢集的剩餘使用率,所以以(1-)計算出叢集剩餘使用率,其值是由叢集靜態特徵矩陣方程式第一項 與第一項叢集動態特徵矩陣剩餘使用率(1-)相乘而來,這裡以叢集1為例,其叢集特徵矩陣方程式即為,為了避免混淆之後我們以J表示用戶程式特徵矩陣方程式而Ci表示第i個叢集的叢集特徵矩陣方程式,如第3圖說明第二步是針對Ci做分群的動作,首先將不適合的叢集過濾掉,由於部分用戶程式特徵有下限值,若叢集對應的特徵值低於下限值這些用戶程式就無法在叢集上執行,舉例來說用戶程式特徵中有disk使用量,若叢集特徵中的disk剩餘量低於用戶程式所需disk使用量時,此叢集就不適合執行此用戶程式。在判別不 適合的叢集可透過比較用戶程式特徵矩陣方程式與各叢集特徵矩陣方程式,若其中的元素屬於有下限值的特徵,且叢集特徵矩陣方程式元素小於用戶程式特徵矩陣方程式,表示此叢集特徵矩陣方程式對應的叢集不適合執行目前的用戶程式,於是不適合之叢集集合Clusterunsuitable 的表示如第(12)式所示: Where F 1 job represents the first feature in the user program characteristic matrix equation, and the value is the result of multiplying the first term Js 1 of the working static feature matrix equation with the first term Jd 1 of the working dynamic characteristic matrix equation, and so on. , there are n eigenvalues, and Then it is the first feature of the cluster feature matrix equation representing the i-th cluster, and there are also n eigenvalues. Since the value of the cluster dynamic feature matrix equation is the average usage of the cluster at the time, we need the remainder of the cluster. Usage rate, so take (1- Calculate the remaining usage of the cluster, the value of which is the first term of the cluster static characteristic matrix equation Residual usage rate with the first cluster dynamic feature matrix (1- Multiply, here is cluster 1 as an example, and the cluster characteristic matrix equation is In order to avoid confusion, we denote the user program characteristic matrix equation with J and Ci denote the cluster feature matrix equation of the i-th cluster. As shown in Figure 3, the second step is to perform clustering action on Ci. First, filter the clusters that are not suitable. Since some user program features have a lower limit value, if the feature value corresponding to the cluster is lower than the lower limit value, the user program cannot be executed on the cluster. For example, the user program feature has disk usage, if the disk in the cluster feature This cluster is not suitable for executing this user program when the remaining amount is lower than the disk usage required by the user program. In the discriminating unsuitable cluster, the user program characteristic matrix equation and the cluster feature matrix equation can be compared. If the element belongs to the feature with the lower limit value, and the cluster feature matrix equation element is smaller than the user program characteristic matrix equation, the cluster feature matrix is represented. the equation for the corresponding cluster is not currently executed user program, so it is not suitable for the set of cluster cluster unsuitable as a first representation (12) represented by the formula:

其中Cluster all 代表所有叢集特徵矩陣方程式集合,L代表有下限值的特徵集合,表示Ci 叢集的叢集特徵矩陣方程式的第j個元素,而Fj job 為用戶程式特徵矩陣方程式的第j個元素,過濾掉不適合的叢集後針對剩餘的叢集特徵方程式再將其分為最優先叢集特徵矩陣方程式集合與次優先叢集特徵矩陣方程式集合,首先最優先叢集特徵矩陣方程式集合在這裡定義為叢集特徵矩陣方程式的各特徵元素皆滿足用戶程式特徵方程式的所有元素,剩餘叢集特徵矩陣方程式則是次優先叢集特徵矩陣方程式集合,這兩個集合定義如下: Where Cluster all represents a set of all cluster feature matrix equations, and L represents a feature set with a lower limit value. The jth element of the cluster characteristic matrix equation representing the C i cluster, and F j job is the jth element of the user program characteristic matrix equation. After filtering out the unsuitable cluster, it is divided into the highest priority for the remaining cluster characteristic equations. Cluster feature matrix set and sub-priority cluster feature matrix set, first highest priority cluster feature matrix equation set here is defined as cluster feature matrix equations, each feature element satisfies all elements of the user program characteristic equation, and the remaining cluster feature matrix equation This is the set of priority cluster feature matrix equations, which are defined as follows:

ClusterCluster second priortySecond priorty =Cluster=Cluster allAll -(Cluster-(Cluster first priortyFirst priorty U ClusterU Cluster unsuitableUnsuitable ) (14)) (14)

其中Cluster first priorty 為最優先叢集特徵矩陣方程式集合而Cluster second priorty 為次優先叢集特徵矩陣方程式集合,將叢集特徵矩陣方程式分群後,下一步開始從中選擇目標叢集,選擇目標叢集可分為以下步驟: Cluster first priorty is the highest priority cluster feature matrix equation set and Cluster second priorty is the sub-priority cluster feature matrix equation set. After the cluster feature matrix equations are grouped, the next step is to select the target cluster, and the target cluster can be divided into the following steps:

A.檢查最優先叢集特徵矩陣方程式集合,若非空集合,則從集合中選擇最適合之叢集特徵矩陣方程式,在這裡用戶程式特徵矩陣方程式可視為一存在於n維空間的向量,同時最優先叢集特徵矩陣方程式集合也可視為多組存在於n為空間的向量集合,於是利用向量的距離做為選擇上的依據;一般而言距離越大代表叢集會有更充裕的資源供用戶程式執行,但本發明的目的在於降低工作等待(wait to run)時間,為了避免大量的用戶程式都配置到某個特定的叢集上降低了執行效率,於是在此選擇了向量距離最近,也就是最符合當時用戶程式執行的叢集,選擇如(15)所示: 其中dist(C i ,J) 為叢集特徵矩陣方程式與用戶程式特徵矩陣方程式的向量距離,算法如下式所示: A. Check the set of the most preferred cluster feature matrix equations. If it is not an empty set, select the most suitable cluster feature matrix equation from the set. Here, the user program feature matrix equation can be regarded as a vector existing in n-dimensional space, and the highest priority cluster. The set of characteristic matrix equations can also be regarded as multiple sets of vectors existing in n space, so the distance of the vector is used as the basis for selection; generally speaking, the larger the distance, the more resources the cluster will have for the user to execute, but The purpose of the present invention is to reduce the wait to run time. In order to avoid a large number of user programs being configured to a specific cluster, the execution efficiency is lowered, so that the vector distance is selected closest to the user at that time. The cluster of program execution, as shown in (15): Where dist(C i , J) is the vector distance between the cluster feature matrix equation and the user program feature matrix equation. The algorithm is as follows:

B.如最優先叢集特徵矩陣方程式集合不存在任何的叢集特徵方程式,則從次優先叢集特徵矩陣方程式集合中做選擇,在這集合中皆是無法完全滿足用戶程式的叢集集合但還是可以順利完成用戶工作的需求,這裡的選擇方法如第一步相同,將各矩陣方程式視為存在於n維空間的向量,為了避免因叢集特徵與用戶程式特徵差別過大導致用戶程式執行時間過長,這邊一樣是以選擇兩向量空間距離最少做為選擇的依據,選擇如(17)式所示: B. If there is no cluster characteristic equation in the set of the most preferential cluster feature matrix, then the selection is made from the set of sub-priority cluster feature matrix, in which the cluster set of the user program cannot be fully satisfied but can be successfully completed. The user's work needs, the selection method here is the same as the first step, and the matrix equations are regarded as the vectors existing in the n-dimensional space. In order to avoid the user program execution time is too long due to the difference between the cluster features and the user program features, this is too long. The same is to choose the minimum distance between the two vector spaces as the basis for selection, as shown in (17):

C.如最優先與次優先叢集特徵矩陣方程式集合皆不存在任何的叢集特徵方程式,則表示目前所有存在的叢集皆不適合執行用戶工作,此時執行環境選擇模組退回用戶工作要求,並通知使用者。C. If there is no cluster characteristic equation in the set of the highest priority and the second priority cluster feature matrix, it means that all the existing clusters are not suitable for performing user work. At this time, the execution environment selection module returns the user work request and notifies the use. By.

如有找出最適合的叢集特徵矩陣方程式,執行環境選擇模組16由選出的叢集特徵矩陣方程式找到對應的叢集,並將客戶工作,包含客戶程式3與輸入資料4派送到對應的叢集中執行。If the most suitable cluster feature matrix equation is found, the execution environment selection module 16 finds the corresponding cluster from the selected cluster feature matrix equation, and sends the client work, including the client 3 and the input data 4, to the corresponding cluster to execute. .

本發明所提供之基於Hadoop多叢集環境的工作分派之系統,與其他習用技術相互比較時,更具有下列之優點:The system of work assignment based on Hadoop multi-cluster environment provided by the invention has the following advantages when compared with other conventional technologies:

1.本發明可根據待處理資料特性、運算程式之特徵與電腦叢集的動態行為,提供最佳化的執行環境給使用者,有效降低工作等待時間,提供可行、可靠、高效率之運算服務。1. The invention can provide an optimized execution environment to the user according to the characteristics of the data to be processed, the characteristics of the computing program and the dynamic behavior of the computer cluster, effectively reducing the waiting time of the work, and providing a feasible, reliable and efficient computing service.

2.本發明的工作分派之系統與方法可根據待處理資料特性進而充分使用運算設備硬體資源,降低運算服務建置成本,確保服務的穩定性與可靠性,解決運算工作本質上需求的差異之問題,進而提昇整體服務速度與效率,其經濟效益非常明顯。2. The system and method for the work assignment of the present invention can fully utilize the hardware resources of the computing device according to the characteristics of the data to be processed, reduce the cost of computing service construction, ensure the stability and reliability of the service, and solve the difference in the essential requirements of the computing work. The problem, and thus the overall service speed and efficiency, its economic benefits are very obvious.

上列詳細說明係針對本發明之一可行實施例之具體說明,惟該實施例並非用以限制本發明之專利範圍,凡未脫離本發明技藝精神所為之等效實施或變更,均應包含於本案之專利範圍中。The detailed description of the preferred embodiments of the present invention is intended to be limited to the scope of the invention, and is not intended to limit the scope of the invention. The patent scope of this case.

綜上所述,本案不但在空間型態上確屬創新,並能較習用物品增進上述多項功效,應已充分符合新穎性及進 步性之法定發明專利要件,爰依法提出申請,懇請 貴局核准本件發明專利申請案,以勵發明,至感德便。In summary, this case is not only innovative in terms of space type, but also can enhance the above-mentioned multiple functions compared with the customary items, and should fully comply with the novelty and progress. The statutory invention patent requirements of the step, apply in accordance with the law, and ask your office to approve the invention patent application, in order to invent invention, to the sense of virtue.

1‧‧‧工作分派系統1‧‧‧Work assignment system

11‧‧‧特徵資料庫模組11‧‧‧Characteristic database module

12‧‧‧叢集特徵模組12‧‧‧ Cluster Feature Module

13‧‧‧叢集監控模組13‧‧‧ Cluster Monitoring Module

14‧‧‧工作資料分析模組14‧‧‧Work data analysis module

15‧‧‧工作程式分析模組15‧‧‧Working program analysis module

16‧‧‧執行環境選擇模組16‧‧‧Execution Environment Selection Module

2‧‧‧用戶操作介面2‧‧‧User interface

3‧‧‧客戶程式3‧‧‧Client

4‧‧‧輸入資料4‧‧‧ Input data

5‧‧‧迷你叢集5‧‧‧Mini Cluster

6‧‧‧主機叢集6‧‧‧ Host cluster

Claims (7)

一種基於Hadoop多叢集環境的工作分派之系統,係包括:一特徵資料庫模組,係用以儲存叢集的靜態、動態特徵矩陣方程式和工作的靜態、動態特徵矩陣方程式;一叢集特徵模組,主要負責分析各叢集的靜態特徵;一叢集監控模組,主要負責分析各叢集的動態特徵;一工作資料分析模組,主要負責分析計算工作的靜態特徵;一工作程式分析模組,係用以分析用戶程式在執行時使用資源的情形;以及一執行環境選擇模組,係用以選出最適合用戶工作之叢集,並將其送往對應的叢集執行。 A system for job assignment based on Hadoop multi-cluster environment includes: a feature database module for storing static and dynamic feature matrix equations of clusters and working static and dynamic feature matrix equations; a cluster feature module, It is mainly responsible for analyzing the static characteristics of each cluster; a cluster monitoring module is mainly responsible for analyzing the dynamic characteristics of each cluster; a working data analysis module is mainly responsible for analyzing the static characteristics of the computing work; a working program analysis module is used for Analyze the situation in which the user program uses resources during execution; and an execution environment selection module is used to select the cluster that is most suitable for the user's work and send it to the corresponding cluster for execution. 如請求項1所述之基於Hadoop多叢集環境的工作分派之系統,其叢集監控模組會定期收集每個叢集的動態特徵,並針對動態特徵曲線進行分析,將分析結果轉換成叢集動態特徵矩陣方程式,再儲存在特徵資料庫模組。 According to the Hadoop multi-cluster environment-based work dispatching system described in claim 1, the cluster monitoring module periodically collects dynamic features of each cluster, analyzes the dynamic characteristic curves, and converts the analysis results into a cluster dynamic feature matrix. The equation is stored in the feature database module. 如請求項1所述之基於Hadoop多叢集環境的工作分派之系統,其中該叢集特徵模組主要負責分析叢集中不會隨著時間改變的靜態特徵,並建立矩陣方程式來描述叢集的靜態特徵;當有新叢集加入系統時,叢集特徵模組會分析其靜態特徵,並將資料轉換成矩陣方程式儲存在特徵資料庫模組中。 The system of work assignment based on Hadoop multi-cluster environment according to claim 1, wherein the cluster feature module is mainly responsible for analyzing static features that do not change over time in the cluster, and establishing a matrix equation to describe static features of the cluster; When a new cluster is added to the system, the cluster feature module analyzes its static features and converts the data into a matrix equation stored in the feature database module. 如請求項1所述之基於Hadoop多叢集環境的工作分派之系統,其中該工作資料分析模組主要負責分析計算工作執行中的資料特性與靜態特徵,並建立矩陣方程式來描述工作 的靜態特徵;當有新工作進入工作分派系統時,資料分析模組會分析其靜態特徵,並將資料轉換成矩陣方程式儲存在特徵資料庫模組中。 The system of work assignment based on the Hadoop multi-cluster environment described in claim 1, wherein the work data analysis module is mainly responsible for analyzing data characteristics and static features in the execution of the calculation work, and establishing a matrix equation to describe the work. The static feature; when a new job enters the work distribution system, the data analysis module analyzes its static features and converts the data into a matrix equation stored in the feature database module. 如請求項1所述之基於Hadoop多叢集環境的工作分派之系統,其中該工作程式分析模組用於分析客戶程式在處理資料時使用資源的情形與花費時間,並將收集到的資料轉換成工作動態特徵矩陣方程式儲存在特徵資料庫模組。 The system for scheduling work based on a Hadoop multi-cluster environment according to claim 1, wherein the worker program analysis module is configured to analyze a situation in which a client uses resources when processing data, and spends time, and converts the collected data into The working dynamic characteristic matrix equation is stored in the feature database module. 如請求項1所述之基於Hadoop多叢集環境的工作分派之系統,其中該執行環境選擇模組從特徵資料庫模組取得叢集監控模組、叢集特徵模組、工作資料分析模組與工作程式分析模組分析之結果,並透過用戶程式特徵矩陣方程式將用戶工作,包含用戶程式與輸入資料派送到對應的叢集中執行。 The system for scheduling work based on a Hadoop multi-cluster environment according to claim 1, wherein the execution environment selection module obtains a cluster monitoring module, a cluster feature module, a work data analysis module, and a work program from a feature database module. Analyze the results of the module analysis and use the user program feature matrix equation to work with the user, including the user program and input data to the corresponding cluster. 一種基於Hadoop多叢集環境的工作分派方法,包括以下步驟:從特徵資料庫模組取得叢集監控模組、叢集特徵模組、工作資料分析模組與工作程式分析模組之結果;計算用戶程式特徵矩陣方程式與對應各叢集的叢集特徵矩陣方程式;透過用戶程式特徵矩陣方程式將對應各叢集的叢集特徵矩陣方程式分類為最優先叢集特徵矩陣方程式集合、次優先叢集特徵矩陣方程式集合與不適合之叢集特徵矩陣方程式集合;其中比較用戶程式特徵矩陣方程式與各叢集特徵矩陣方程式,若其中的元素屬於有下限值的特徵,且叢集特徵矩陣方程式元素小於用戶程式特徵矩陣方程式,就分類為不適合之叢集特徵矩陣方程式;過濾掉不適 合之叢集特徵矩陣方程式後,叢集特徵矩陣方程式的各特徵元素皆滿足用戶程式特徵方程式的所有元素分類為最優先叢集特徵矩陣方程式集合,剩餘叢集特徵矩陣方程式則分類次優先叢集特徵矩陣方程式集合;若最優先叢集特徵矩陣方程式集合並非空集合,則依據用戶程式特徵矩陣方程式從最優先叢集特徵矩陣方程式集合選出最適合的叢集特徵矩陣方程式;若最優先叢集特徵矩陣方程式集合為空集合,則檢查次優先叢集特徵矩陣方程式集合是否為空集合,若非空集合,則從中選出一個適合的叢集特徵矩陣方程式;透過選出的叢集特徵矩陣方程式計算找到對應的叢集,並將用戶工作,包含用戶程式與輸入資料派送到對應的叢集中執行;若最優先與次優先叢集特徵矩陣方程式集合皆為空集合,則表示目前所有存在的叢集皆不適合執行用戶工作,此時退回用戶工作要求,並通知使用者。 A work assignment method based on a Hadoop multi-cluster environment includes the following steps: obtaining a result of a cluster monitoring module, a cluster feature module, a work data analysis module, and a work program analysis module from a feature database module; calculating a user program feature The matrix equation and the cluster feature matrix equation corresponding to each cluster; the cluster feature matrix equations corresponding to each cluster are classified into the highest priority cluster feature matrix equation set, the second priority cluster feature matrix equation set and the unsuitable cluster feature matrix through the user program characteristic matrix equation A set of equations; wherein the user program characteristic matrix equation and each cluster feature matrix equation are compared, and if the element belongs to a feature having a lower limit value, and the cluster feature matrix equation element is smaller than the user program feature matrix equation, the cluster feature matrix is classified as unsuitable Equation; filter out discomfort After the cluster characteristic matrix equation is combined, all the elements of the cluster characteristic matrix equation satisfy the user program characteristic equation and all the elements are classified into the highest priority cluster feature matrix equation set, and the remaining cluster feature matrix equation classifies the sub-priority cluster feature matrix equation set; If the set of the most preferential cluster feature matrix is not an empty set, the most suitable cluster feature matrix equation is selected according to the user program feature matrix equation from the highest priority cluster feature matrix equation set; if the most preferential cluster feature matrix equation set is an empty set, then check Whether the set of sub-priority cluster feature matrix is an empty set, if not an empty set, then select a suitable cluster feature matrix equation; calculate the corresponding cluster by the selected cluster feature matrix equation, and work the user, including user program and input The data is sent to the corresponding cluster for execution; if the highest priority and the second priority cluster feature matrix are all empty sets, it means that all existing clusters are not suitable for performing user work. User job requirements, and notify the user.
TW102118168A 2013-05-23 2013-05-23 System and method for dispatching hadoop jobs in multi-cluster environment TWI506456B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW102118168A TWI506456B (en) 2013-05-23 2013-05-23 System and method for dispatching hadoop jobs in multi-cluster environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW102118168A TWI506456B (en) 2013-05-23 2013-05-23 System and method for dispatching hadoop jobs in multi-cluster environment

Publications (2)

Publication Number Publication Date
TW201445332A TW201445332A (en) 2014-12-01
TWI506456B true TWI506456B (en) 2015-11-01

Family

ID=52707054

Family Applications (1)

Application Number Title Priority Date Filing Date
TW102118168A TWI506456B (en) 2013-05-23 2013-05-23 System and method for dispatching hadoop jobs in multi-cluster environment

Country Status (1)

Country Link
TW (1) TWI506456B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201216073A (en) * 2010-10-01 2012-04-16 Kuan-Chang Fu System and method for sharing network storage and computing resource
TW201312467A (en) * 2011-07-28 2013-03-16 Yahoo Inc Method and system for distributed application stack deployment
TW201314463A (en) * 2011-05-20 2013-04-01 Soft Machines Inc Decentralized allocation of resources and interconnect structures to support the execution of instruction sequences by a plurality of engines
US20130124483A1 (en) * 2011-11-10 2013-05-16 Treasure Data, Inc. System and method for operating a big-data platform

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201216073A (en) * 2010-10-01 2012-04-16 Kuan-Chang Fu System and method for sharing network storage and computing resource
TW201314463A (en) * 2011-05-20 2013-04-01 Soft Machines Inc Decentralized allocation of resources and interconnect structures to support the execution of instruction sequences by a plurality of engines
TW201312467A (en) * 2011-07-28 2013-03-16 Yahoo Inc Method and system for distributed application stack deployment
US20130124483A1 (en) * 2011-11-10 2013-05-16 Treasure Data, Inc. System and method for operating a big-data platform

Also Published As

Publication number Publication date
TW201445332A (en) 2014-12-01

Similar Documents

Publication Publication Date Title
CN104298550B (en) A kind of dynamic dispatching method towards Hadoop
Chen et al. How does the workload look like in production cloud? analysis and clustering of workloads on alibaba cluster trace
WO2021179462A1 (en) Improved quantum ant colony algorithm-based spark platform task scheduling method
CN104516784B (en) A kind of method and system for predicting the task resource stand-by period
US20180198855A1 (en) Method and apparatus for scheduling calculation tasks among clusters
CN107357652B (en) A cloud computing task scheduling method based on segmentation sorting and standard deviation adjustment factor
Xu et al. Adaptive task scheduling strategy based on dynamic workload adjustment for heterogeneous Hadoop clusters
JP6953800B2 (en) Systems, controllers, methods, and programs for running simulation jobs
Canali et al. Improving scalability of cloud monitoring through PCA-based clustering of virtual machines
Saklani et al. Multicore Implementation of K-Means Clustering Algorithm
Vakilinia et al. Analysis and optimization of big-data stream processing
US20220050814A1 (en) Application performance data processing
Kumar et al. A comprehensive review of straggler handling algorithms for mapreduce framework
CN103019855A (en) Method for forecasting executive time of Map Reduce operation
CN112148475A (en) Task scheduling method and system of loongson big data all-in-one machine integrating load and power consumption
CN112000460A (en) Service capacity expansion method based on improved Bayesian algorithm and related equipment
CN104077398B (en) Work distribution system and method based on Hadoop multi-cluster environment
Suleiman et al. A framework for characterizing very large cloud workload traces with unsupervised learning
Ouyang et al. An approach for modeling and ranking node-level stragglers in cloud datacenters
US10817401B1 (en) System and method for job-to-queue performance ranking and resource matching
Piao et al. Computing resource prediction for mapreduce applications using decision tree
TWI506456B (en) System and method for dispatching hadoop jobs in multi-cluster environment
Leonenkov et al. Supercomputer efficiency: Complex approach inspired by Lomonosov-2 history evaluation
Qi et al. Data mining based root-cause analysis of performance bottleneck for big data workload
Lang et al. Implementation of load balancing algorithm based on flink cluster

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees