CN106446303A - System and method for deploying large-scale cluster file system - Google Patents

System and method for deploying large-scale cluster file system

Info

Publication number
CN106446303A
Authority
CN
China
Prior art keywords
node
control unit
parameters
central control
configuration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611169173.1A
Other languages
Chinese (zh)
Other versions
CN106446303B (en)
Inventor
郝向东
侯斌
任东旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201611169173.1A
Publication of CN106446303A
Application granted
Publication of CN106446303B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5011 Pool
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/508 Monitor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Stored Programmes (AREA)

Abstract

The invention provides a system and method for deploying a large-scale cluster file system. The configuration parameters required during cluster file system deployment are separated out into a JSON configuration file, which is parsed by a shell script so that the cluster can be deployed quickly. In addition, when the underlying disks are processed, a process pool and a thread pool adaptively adjust the number of processes and threads used to handle the disks of a large number of nodes. This optimizes the disk processing speed of a large-scale cluster and achieves rapid deployment of the cluster.

Description

System and method for deploying a large-scale cluster file system

Technical Field

The invention relates to the field of cluster file systems, and in particular to the rapid deployment of large-scale cluster file systems.

Background Art

Cluster technology is a relatively new technology that makes it possible to obtain comparatively high gains in performance, reliability, and flexibility at a comparatively low cost. A cluster is a group of mutually independent computers interconnected by a high-speed network; they form a group and are managed as a single system. When a client interacts with a cluster, the cluster appears to be a single independent server.

With the explosive growth and spread of big data, the Internet, and the Internet of Things, clusters have grown to large and very large scale. Figure 1 shows a flowchart of traditional cluster deployment, and Figure 2 shows a flowchart of disk processing in traditional cluster deployment. What enterprises criticize most about deploying cluster file systems, especially large-scale ones, is that the process is cumbersome and time-consuming. The most time-consuming step of cluster deployment is disk processing. Existing deployment methods process disks manually and sequentially: the first disk of the first node is processed, then the second disk of the first node, and so on until all disks on the first node are done; the second node is processed next, then the third, and other deployment operations begin only after all nodes have been processed. In addition, setting a large number of parameters by hand is error-prone; an error causes the deployment to fail, after which the traces must be cleaned up and the deployment repeated. These factors make cluster deployment cumbersome and slow.

Summary of the Invention

To solve the above technical problems, the technical solution of the present invention is as follows.

The present invention provides a system for deploying a large-scale cluster file system, comprising a configuration file generator, a central control unit, and a detector. The configuration file generator forms a configuration file by separating out the configuration parameters required in the cluster file system deployment process. The configuration file generator is connected to the control data input interface of the central control unit in order to transmit the configuration file to it. The control data output interface of the central control unit is connected to all nodes of the file system and transmits the configuration file, control instructions, and detection instructions to each node. Each node may be a master node or a slave node, and each node is configured with a parsing script for parsing the configuration file transmitted by the central control unit. The detector is connected to all nodes in order to detect the parameters of each node during configuration; it is also connected to the detection data input interface of the central control unit and feeds the configuration parameters of each node back to the central control unit.

Further, the configuration file generator is a JSON file generator, and the parsing script is a shell script.

Further, after each node to be deployed receives the JSON configuration file transmitted by the central control unit, it sends its node parameters to the detector. The detector designates the master node and slave nodes according to the node parameters, calculates the number of nodes n, and transmits n together with the parameters of each node to the central control unit. The central control unit uses n as the number of node processing processes and sends control instructions and detection instructions to each node. The control instructions cause the n nodes to perform parameter configuration simultaneously, realizing multi-process configuration.

Further, after receiving the detection instruction issued by the central control unit, each node sends its CPU core count and disk parameters to the detector. From the received data the detector obtains the CPU core count t_i and the disk count d_i of each node i and transmits them to the central control unit. The central control unit uses the smaller of 2*t_i and d_i as the number of threads on the i-th node, so that each node performs multi-threaded operations on its own disks within its corresponding process.
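
The thread-count rule above can be illustrated with a small worked example. This is only a sketch of the min(2*t_i, d_i) rule stated in the text; the function name `threads_per_node` and the sample core and disk counts are assumptions made for illustration and do not come from the patent.

```python
def threads_per_node(cpu_cores, disk_counts):
    """Apply the rule min(2 * t_i, d_i) to every node i.

    cpu_cores[i] is t_i and disk_counts[i] is d_i (hypothetical inputs).
    """
    return [min(2 * t, d) for t, d in zip(cpu_cores, disk_counts)]


# Three hypothetical nodes: 4, 8 and 2 cores with 12, 6 and 24 disks.
example = threads_per_node([4, 8, 2], [12, 6, 24])
print(example)  # [8, 6, 4]
```

With this rule a node never starts more threads than it has disks, and never more than twice its core count.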

Further, the central control unit also stores the node parameters and the per-node disk parameters for this cluster deployment as entered by the operator. The central control unit compares the stored node and disk parameters with those obtained from the detector. If a node specified by a parameter does not exist, or if a disk specified by a parameter does not exist or is already in use for another purpose, the central control unit sets a dynamic breakpoint and prompts the operator to modify the configuration file. Execution continues once the modified configuration passes the check.
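
The comparison of stored and detected parameters might look like the following sketch. The data shapes are assumptions: `expected` maps each configured node to its list of disks, and `detected` maps each node that was actually found to a dictionary of disk states, where the state strings `"free"` and `"in_use"` are hypothetical labels. The patent does not specify these formats.

```python
def find_mismatches(expected, detected):
    """Return human-readable problems that would trigger a breakpoint.

    expected: {node: [disk, ...]} as entered by the operator (assumed shape).
    detected: {node: {disk: "free" | "in_use"}} as reported (assumed shape).
    """
    problems = []
    for node, disks in expected.items():
        if node not in detected:
            problems.append(f"node {node} does not exist")
            continue
        for disk in disks:
            state = detected[node].get(disk)
            if state is None:
                problems.append(f"disk {disk} on {node} does not exist")
            elif state != "free":
                problems.append(f"disk {disk} on {node} is already in use")
    return problems


expected = {
    "node1": ["/dev/sdb", "/dev/sdc"],
    "node2": ["/dev/sdb"],
    "node3": ["/dev/sdb"],
}
detected = {
    "node1": {"/dev/sdb": "free", "/dev/sdc": "in_use"},
    "node2": {},
}
for problem in find_mismatches(expected, detected):
    print(problem)
```

An empty result would mean the check passed and deployment can continue; a non-empty result corresponds to the case where the operator is prompted to modify the configuration file.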

Further, while the storage service and the metadata service are being created, the detector concurrently detects the parameters of each node and each disk and sends them to the central control unit in real time. The central control unit determines from the configuration parameters whether the configuration process contains an error. If so, it sets a breakpoint and prompts for a correction; execution continues once the correction passes the check.

The present invention also provides a method for deploying a large-scale cluster file system, with the following steps:

Step 1: form a JSON configuration file by separating out the configuration parameters required in the cluster file system deployment process;

Step 2: create a monitoring service;

Step 3: automatically detect the nodes specified by the parameters, and run multiple processes, using the number of nodes n as the number of node processing processes;

Step 4: automatically detect the disk parameters, count the disks d_i on each node, obtain the CPU core count t_i of each node, and use the smaller of 2*t_i and d_i as the number of threads on the i-th node, performing multi-threaded operations within the corresponding process;

Step 5: create a storage service;

Step 6: create a metadata service;

Step 7: create the file system.

Further, step 3 also includes comparing the stored node parameters with the detected ones; if a node specified by a parameter does not exist, a dynamic breakpoint is set and the operator is prompted to modify the configuration file, and execution continues once the modified configuration passes the check.

Further, step 4 also includes comparing the stored disk parameters of each node with the detected ones; if a disk specified by a parameter does not exist or is already in use for another purpose, a dynamic breakpoint is set and the operator is prompted to modify the configuration file, and execution continues once the modified configuration passes the check.

Further, step 5 also includes the following: while the storage service is being created, the detector concurrently detects the parameters of each node and each disk and sends them to the central control unit in real time. The central control unit determines from the configuration parameters whether the configuration process contains an error. If so, it sets a breakpoint and prompts for a correction; execution continues once the correction passes the check.

Further, step 6 also includes the following: while the metadata service is being created, the detector concurrently detects the parameters of each node and each disk and sends them to the central control unit in real time. The central control unit determines from the configuration parameters whether the configuration process contains an error. If so, it sets a breakpoint and prompts for a correction; execution continues once the correction passes the check.

By separating out the configuration parameters required in the cluster file system deployment process into a JSON configuration file and parsing that file with a shell script, the cluster can be deployed quickly. In addition, when the underlying disks are processed, a process pool and a thread pool adaptively adjust the number of processes and threads used to handle the disks of a large number of nodes, which optimizes the disk processing speed of a large-scale cluster and achieves rapid deployment of the cluster.

Brief Description of the Drawings

Figure 1 shows a flowchart of traditional cluster deployment.

Figure 2 shows a flowchart of disk processing in traditional cluster deployment.

Figure 3 shows a schematic structural diagram of the cluster deployment system of the present invention.

Figure 4 shows a flowchart of disk processing in the cluster deployment of the present invention.

Figure 5 shows a flowchart of the cluster deployment of the present invention.

Detailed Description

The technical solution of the present invention is further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here serve only to explain the present invention and do not limit it.

First, the present invention provides a system for deploying a large-scale cluster file system.

As shown in Figure 3, the deployment system of the present invention comprises a configuration file generator, a central control unit, and a detector. The configuration file generator is connected to the control data input interface of the central control unit in order to transmit the configuration file to it. The control data output interface of the central control unit is connected to nodes 1 through n and transmits the configuration file, control instructions, and detection instructions to them. Each of nodes 1 through n may be a master node or a slave node, and each node is configured with a parsing script for parsing the configuration file transmitted by the central control unit. The detector is connected to nodes 1 through n in order to detect their parameters during configuration; it is also connected to the detection data input interface of the central control unit and feeds the configuration parameters of nodes 1 through n back to the central control unit.

The configuration file generator may be a JSON file generator or an XML configuration file generator. Only the JSON file generator is described in detail below.
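
As an illustration of what a generated JSON configuration file could contain, the sketch below serializes a small set of deployment parameters. The patent does not disclose the file's schema, so every key used here (`node_count`, `nodes`, `host`, `role`, `disks`) and the function name `build_config` are hypothetical.

```python
import json


def build_config(nodes):
    """Serialize separated deployment parameters into JSON text.

    `nodes` is a list of dicts with assumed keys: host, role, disks.
    """
    config = {
        "node_count": len(nodes),
        "nodes": nodes,
    }
    return json.dumps(config, indent=2, sort_keys=True)


example_nodes = [
    {"host": "node1", "role": "master", "disks": ["/dev/sdb", "/dev/sdc"]},
    {"host": "node2", "role": "slave", "disks": ["/dev/sdb"]},
]
config_text = build_config(example_nodes)
print(config_text)

# Reading the file back, as a parsing script on each node would need to do.
parsed = json.loads(config_text)
```

Keeping all parameters in one file of this kind means an operator corrects a mistake in a single place rather than re-entering values by hand at each deployment step.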

The parsing script may be a shell script.

When the system is deployed, the JSON file generator first forms a JSON configuration file by separating out the configuration parameters required in the cluster file system deployment process and transmits the file to the central control unit. The central control unit forwards the JSON configuration file to each node to be deployed and at the same time sends a start instruction to the detector. After receiving the JSON configuration file, each node to be deployed first sends its node parameters to the detector. The detector designates the master node and slave nodes according to the node parameters, calculates the number of nodes n, and transmits n together with the parameters of each node to the central control unit. The central control unit uses n as the number of node processing processes and sends control instructions and detection instructions to each node. The control instructions cause the n nodes to perform parameter configuration simultaneously, realizing multi-process configuration.
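
The idea of one processing process per node, all running at the same time, can be sketched as follows. This is an assumption-laden illustration rather than the patented implementation: the helper names `launch_node_processes` and `demo_command` are invented, and the per-node command simply prints a message, standing in for whatever a real deployment would run on each node.

```python
import subprocess
import sys


def launch_node_processes(nodes, command_for):
    """Start one OS process per node at once, then collect the results.

    Returns {node: (exit_code, captured_stdout)}.
    """
    procs = {
        node: subprocess.Popen(
            command_for(node), stdout=subprocess.PIPE, text=True
        )
        for node in nodes
    }
    results = {}
    for node, proc in procs.items():
        out, _ = proc.communicate()  # waits for this node's process
        results[node] = (proc.returncode, out.strip())
    return results


def demo_command(node):
    # Hypothetical stand-in for the real per-node configuration command.
    return [sys.executable, "-c", f"print('configuring {node}')"]


results = launch_node_processes(["node1", "node2", "node3"], demo_command)
print(results)
```

All n processes are started before any of them is waited on, so the nodes are handled concurrently rather than one after another.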

After receiving the detection instruction issued by the central control unit, each node sends its CPU core count and disk parameters to the detector. From the received data the detector obtains the CPU core count t_i and the disk count d_i of each node i and transmits them to the central control unit. The central control unit uses the smaller of 2*t_i and d_i as the number of threads on the i-th node, so that each node performs multi-threaded operations on its own disks within its corresponding process.
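
Within a single node's process, the multi-threaded disk handling could be sketched with a thread pool sized by the rule above. The function `process_disks` and the `handle_disk` callback are hypothetical names; the `max(1, ...)` guard for a node with no disks is an addition of this sketch, not something stated in the patent.

```python
from concurrent.futures import ThreadPoolExecutor


def process_disks(disks, cpu_cores, handle_disk):
    """Process one node's disks with min(2 * cores, disk count) threads."""
    # Guard (assumption): a thread pool needs at least one worker.
    workers = max(1, min(2 * cpu_cores, len(disks)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input disks.
        return list(pool.map(handle_disk, disks))


# Hypothetical disk handler that only builds a device path.
prepared = process_disks(["sdb", "sdc", "sdd"], 1, lambda d: "/dev/" + d)
print(prepared)  # ['/dev/sdb', '/dev/sdc', '/dev/sdd']
```

In this example a single-core node with three disks gets two worker threads, so two disks are handled at a time.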

The central control unit also stores the node parameters and the per-node disk parameters for this cluster deployment as entered by the operator. The central control unit compares the stored node parameters with those obtained from the detector. If a node specified by a parameter does not exist, it sets a dynamic breakpoint and prompts the operator to modify the configuration file; execution continues once the modified configuration passes the check.

The central control unit likewise compares the stored disk parameters of each node with those obtained from the detector. If a disk specified by a parameter does not exist or is already in use for another purpose, it sets a dynamic breakpoint and prompts the operator to modify the configuration file; execution continues once the modified configuration passes the check.

The shell script on each node parses the JSON configuration and performs multi-threaded operations on the node's own disks within the corresponding process. Creation of the storage service then begins. Once the storage service has been created, creation of the metadata service begins. Once the metadata service has been created, the file system is created, which completes the deployment of the cluster file system.

While the storage service and the metadata service are being created, the detector concurrently detects the parameters of each node and each disk and sends them to the central control unit in real time. The central control unit determines from the configuration parameters whether the configuration process contains an error. If so, it sets a breakpoint and prompts for a correction; execution continues once the correction passes the check.
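
The breakpoint behavior described above (pause at the failing step, let the operator correct the configuration, re-check, then resume from the same step) can be sketched as a simple loop. The names `run_with_breakpoints`, `check`, and `fix`, and the `max_attempts` limit, are assumptions of this sketch; the patent does not say how many corrections are allowed.

```python
def run_with_breakpoints(steps, check, fix, max_attempts=3):
    """Run steps in order, pausing at a step until its check passes.

    check(step) returns a list of errors (empty means no error).
    fix(step, errors) stands in for the operator editing the config.
    """
    completed = []
    for step in steps:
        for _ in range(max_attempts):
            errors = check(step)
            if not errors:
                break  # check passed, resume from this step
            fix(step, errors)
        else:
            raise RuntimeError(
                f"step {step!r} still failing after {max_attempts} attempts"
            )
        completed.append(step)
    return completed


# Hypothetical scenario: the storage step has one bad parameter.
bad_steps = {"create storage service"}
fix_log = []


def check(step):
    return ["bad parameter"] if step in bad_steps else []


def fix(step, errors):
    fix_log.append(step)
    bad_steps.discard(step)


done = run_with_breakpoints(
    ["create storage service", "create metadata service"], check, fix
)
print(done, fix_log)
```

Because earlier steps are not repeated after a correction, a parameter error does not force the whole deployment to be cleaned up and restarted.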

By separating out the configuration parameters required in the cluster file system deployment process into a JSON configuration file and parsing that file with a shell script, the cluster file system can be deployed quickly.

During deployment, the parameter monitoring mechanism continuously checks the configuration parameters. When it finds a parameter error, it automatically sets a dynamic breakpoint and prompts the operator to correct the parameter; once the corrected parameter passes the check, execution resumes from the breakpoint. This ensures that a large-scale cluster file system is deployed successfully in a single pass. In addition, when the underlying disks are processed, a process pool and a thread pool are used to handle the disks of a large number of nodes, with the number of processes and threads adjusted adaptively according to the number of nodes, the number of disks, and the number of CPU cores. This optimizes the disk processing speed of a large-scale cluster and maximizes disk processing efficiency.
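
A rough back-of-the-envelope comparison shows why the adaptive scheme can be faster than sequential processing. The model below is an assumption of this sketch and not part of the patent: every disk is taken to cost one time unit, nodes run fully in parallel, and each node handles min(2*t_i, d_i) disks at a time with no overhead. Real speedups would depend on hardware and on the actual disk operations.

```python
import math


def sequential_units(disk_counts):
    """Time units when every disk on every node is handled one by one."""
    return sum(disk_counts)


def parallel_units(cpu_cores, disk_counts):
    """Time units when nodes run concurrently with min(2*t_i, d_i) threads."""
    return max(
        (
            math.ceil(d / min(2 * t, d))
            for t, d in zip(cpu_cores, disk_counts)
            if d > 0
        ),
        default=0,
    )


# Hypothetical cluster: 4, 8 and 2 cores with 12, 6 and 24 disks.
print(sequential_units([12, 6, 24]))           # 42
print(parallel_units([4, 8, 2], [12, 6, 24]))  # 6
```

Under these idealized assumptions the slowest node determines the total time, which is why the node with few cores and many disks dominates the example.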

In addition, the present invention provides a method for deploying a large-scale cluster file system.

The cluster deployment steps are as follows:

Step 1: separate out the parameters to form a configuration file;

Step 2: create a monitoring service;

Step 3: automatically detect the nodes specified by the parameters, and run multiple processes, using the number of nodes n as the number of node processing processes;

Step 4: automatically detect the disk parameters, count the disks d_i on each node, obtain the CPU core count t_i of each node, and use the smaller of 2*t_i and d_i as the number of threads on the i-th node, performing multi-threaded operations within the corresponding process;

Step 5: create a storage service;

Step 6: create a metadata service;

Step 7: create the file system.

Specifically, step 1 forms a JSON configuration file by separating out the configuration parameters required in the cluster file system deployment process.

Step 3 also includes comparing the stored node parameters with the detected ones; if a node specified by a parameter does not exist, a dynamic breakpoint is set and the operator is prompted to modify the configuration file, and execution continues once the modified configuration passes the check.

Step 4 also includes comparing the stored disk parameters of each node with the detected ones; if a disk specified by a parameter does not exist or is already in use for another purpose, a dynamic breakpoint is set and the operator is prompted to modify the configuration file, and execution continues once the modified configuration passes the check.

Step 5 also includes the following: while the storage service is being created, the detector concurrently detects the parameters of each node and each disk and sends them to the central control unit in real time. The central control unit determines from the configuration parameters whether the configuration process contains an error. If so, it sets a breakpoint and prompts for a correction; execution continues once the correction passes the check.

Step 6 also includes the following: while the metadata service is being created, the detector concurrently detects the parameters of each node and each disk and sends them to the central control unit in real time. The central control unit determines from the configuration parameters whether the configuration process contains an error. If so, it sets a breakpoint and prompts for a correction; execution continues once the correction passes the check.

The above method achieves rapid deployment of the cluster and solves the problem that existing deployment methods are cumbersome and time-consuming.

Although some aspects have been described in the context of an apparatus, these aspects clearly also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Similarly, aspects described in the context of a method step also represent a description of a corresponding block, item, or feature of a corresponding apparatus. Some or all of the method steps may be performed by (or using) a hardware device such as a microprocessor, a programmable computer, or an electronic circuit. One or more of the most important method steps may be performed by such a device.

The implementation may be in hardware or in software, or may be carried out using a digital storage medium, such as a floppy disk, DVD, Blu-ray disc, CD, ROM, PROM, EPROM, EEPROM, or flash memory, on which electronically readable control signals are stored that cooperate (or are capable of cooperating) with a programmable computer system so that the corresponding method is performed. A data carrier may be provided that has electronically readable control signals capable of cooperating with a programmable computer system so that the method described herein is performed.

The implementation may also take the form of a computer program product with program code that operates to perform the method when the computer program product runs on a computer. The program code may be stored on a machine-readable carrier.

The above description is merely illustrative, and it is to be understood that modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art. The intention is therefore to be limited only by the scope of the appended claims and not by the specific details presented by way of the above description and explanation.

Claims (10)

1. A system for deploying a large-scale cluster file system, characterized in that it comprises a configuration file generator, a central control unit, and a detector; the configuration file generator forms a configuration file by separating out the configuration parameters required in the cluster file system deployment process; the configuration file generator is connected to the control data input interface of the central control unit in order to transmit the configuration file to the central control unit; the control data output interface of the central control unit is connected to all nodes of the file system and transmits the configuration file, control instructions, and detection instructions to each node; each node may be a master node or a slave node, and each node is configured with a parsing script for parsing the configuration file transmitted by the central control unit; the detector is connected to all nodes in order to detect the parameters of each node during configuration; the detector is also connected to the detection data input interface of the central control unit and feeds the configuration parameters of each node back to the central control unit; the configuration file generator is a JSON file generator, and the parsing script is a shell script.

2. The system according to claim 1, wherein after each node to be deployed receives the JSON configuration file transmitted by the central control unit, it sends its node parameters to the detector; the detector designates the master node and slave nodes according to the node parameters, calculates the number of nodes n, and transmits n together with the parameters of each node to the central control unit; the central control unit uses n as the number of node processing processes and sends control instructions and detection instructions to each node; the control instructions cause the n nodes to perform parameter configuration simultaneously, realizing multi-process configuration.

3. The system according to claim 2, wherein after receiving the detection instruction issued by the central control unit, each node sends its CPU core count and disk parameters to the detector; from the received data the detector obtains the CPU core count t_i and the disk count d_i of each node i and transmits them to the central control unit; the central control unit uses the smaller of 2*t_i and d_i as the number of threads on the i-th node, so that each node performs multi-threaded operations on its own disks within its corresponding process.

4. The system according to claim 3, wherein the central control unit also stores the node parameters and the per-node disk parameters for this cluster deployment as entered by the operator; the central control unit compares the stored node and disk parameters with those obtained from the detector; if a node specified by a parameter does not exist, or if a disk specified by a parameter does not exist or is already in use for another purpose, a dynamic breakpoint is set and the operator is prompted to modify the configuration file, and execution continues once the modified configuration passes the check.

5. The system according to claim 3, wherein while the storage service and the metadata service are being created, the detector concurrently detects the parameters of each node and each disk and sends them to the central control unit in real time; the central control unit determines from the configuration parameters whether the configuration process contains an error; if so, a breakpoint is set and a correction is prompted, and execution continues once the correction passes the check.

6. A method for deploying a large-scale cluster file system, with the following steps:

Step 1: form a JSON configuration file by separating out the configuration parameters required in the cluster file system deployment process;

Step 2: create a monitoring service;

Step 3: automatically detect the nodes specified by the parameters, and run multiple processes, using the number of nodes n as the number of node processing processes;

Step 4: automatically detect the disk parameters, count the disks d_i on each node, obtain the CPU core count t_i of each node, and use the smaller of 2*t_i and d_i as the number of threads on the i-th node, performing multi-threaded operations within the corresponding process;

Step 5: create a storage service;

Step 6: create a metadata service;

Step 7: create the file system.

7. The method according to claim 6, wherein step 3 also includes comparing the stored node parameters with the detected ones; if a node specified by a parameter does not exist, a dynamic breakpoint is set and the operator is prompted to modify the configuration file, and execution continues once the modified configuration passes the check.

8. The method according to claim 6, wherein step 4 also includes comparing the stored disk parameters of each node with the detected ones; if a disk specified by a parameter does not exist or is already in use for another purpose, a dynamic breakpoint is set and the operator is prompted to modify the configuration file, and execution continues once the modified configuration passes the check.
9.根据权利要求6所述的方法,其中步骤5还包括在创建存储服务过程中,检测器同步检测各节点以及各磁盘的参数,并实时发送至中控单元,中控单元根据配置参数判断配置过程是否存在错误,若存在错误则设置断点并提示修改,当修改后检测无误后则继续执行。9. The method according to claim 6, wherein step 5 further comprises that during the process of creating the storage service, the detector synchronously detects the parameters of each node and each disk, and sends them to the central control unit in real time, and the central control unit judges according to the configuration parameters Whether there is an error in the configuration process, if there is an error, set a breakpoint and prompt for modification, and continue to execute after the modification is detected to be correct. 10.根据权利要求6所述的方法,其中步骤6还包括在创建元数据服务过程中,检测器同步检测各节点以及各磁盘的参数,并实时发送至中控单元,中控单元根据配置参数判断配置过程是否存在错误,若存在错误则设置断点并提示修改,当修改后检测无误后则继续执行。10. The method according to claim 6, wherein step 6 further comprises that in the process of creating the metadata service, the detector synchronously detects the parameters of each node and each disk, and sends them to the central control unit in real time, and the central control unit according to the configuration parameters Determine whether there is an error in the configuration process, if there is an error, set a breakpoint and prompt for modification, and continue execution after the modification is detected to be correct.
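
The sizing and checking rules in claims 2 to 4 (and steps 3 and 4 of claim 6) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the JSON field names (`nodes`, `host`, `role`, `disks`), the `in_use` list, and the helper names `threads_for_node`, `plan_concurrency`, and `find_config_errors` are all hypothetical, and the claims themselves specify a shell parsing script rather than Python.

```python
import json

# Hypothetical JSON configuration; the field names are assumptions.
CONFIG_JSON = """
{
  "nodes": [
    {"host": "node1", "role": "master", "disks": ["/dev/sdb", "/dev/sdc"]},
    {"host": "node2", "role": "slave",  "disks": ["/dev/sdb"]}
  ]
}
"""


def threads_for_node(cpu_cores, disk_count):
    """Thread count for one node: the smaller of 2*t_i and d_i."""
    return min(2 * cpu_cores, disk_count)


def plan_concurrency(detected):
    """Return (n, threads) where n is the process count (one per node)
    and threads maps each host to its thread count.

    `detected` maps host -> {"cpu_cores": t_i, "disks": [...]}.
    """
    n = len(detected)
    threads = {
        host: threads_for_node(info["cpu_cores"], len(info["disks"]))
        for host, info in detected.items()
    }
    return n, threads


def find_config_errors(config, detected):
    """Compare configured nodes and disks with the detected ones.

    A non-empty result corresponds to the situation in which the claims
    set a breakpoint and ask the operator to fix the configuration file.
    """
    errors = []
    for node in config["nodes"]:
        host = node["host"]
        if host not in detected:
            errors.append(f"node {host} does not exist")
            continue
        info = detected[host]
        for disk in node["disks"]:
            if disk not in info["disks"]:
                errors.append(f"disk {disk} on {host} does not exist")
            elif disk in info.get("in_use", []):
                errors.append(f"disk {disk} on {host} is already in use")
    return errors


config = json.loads(CONFIG_JSON)
# Hypothetical detector output for two nodes.
detected = {
    "node1": {"cpu_cores": 4, "disks": ["/dev/sdb", "/dev/sdc"], "in_use": []},
    "node2": {"cpu_cores": 1,
              "disks": ["/dev/sdb", "/dev/sdc", "/dev/sdd"],
              "in_use": ["/dev/sdb"]},
}
n, threads = plan_concurrency(detected)
errors = find_config_errors(config, detected)
```

In this sketch a deployment driver would pause whenever `errors` is non-empty, let the operator edit the configuration, and rerun the check before continuing, which mirrors the breakpoint-and-retry behaviour the claims describe.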
CN201611169173.1A 2016-12-16 2016-12-16 System and method for deploying large-scale cluster file system Active CN106446303B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611169173.1A CN106446303B (en) 2016-12-16 2016-12-16 System and method for deploying large-scale cluster file system

Publications (2)

Publication Number Publication Date
CN106446303A true CN106446303A (en) 2017-02-22
CN106446303B CN106446303B (en) 2020-01-14

Family

ID=58217465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611169173.1A Active CN106446303B (en) 2016-12-16 2016-12-16 System and method for deploying large-scale cluster file system

Country Status (1)

Country Link
CN (1) CN106446303B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103064742A (en) * 2012-12-25 2013-04-24 中国科学院深圳先进技术研究院 Automatic deployment system and method of hadoop cluster
US20130339486A1 (en) * 2012-06-14 2013-12-19 Microsoft Corporation Scalable Storage with Programmable Networks
CN104461467A (en) * 2013-09-25 2015-03-25 广州中国科学院软件应用技术研究所 Method for increasing calculation speed of SMP cluster system through MPI and OpenMP in hybrid parallel mode
CN105703940A (en) * 2015-12-10 2016-06-22 中国电力科学研究院 Multistage dispatching distributed parallel computing-oriented monitoring system and monitoring method

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109062504A (en) * 2018-07-13 2018-12-21 郑州云海信息技术有限公司 Storage system dispositions method and device under a kind of virtual platform
CN111061503A (en) * 2018-10-16 2020-04-24 航天信息股份有限公司 Cluster system configuration method and cluster system
CN111061503B (en) * 2018-10-16 2023-08-18 航天信息股份有限公司 Cluster system configuration method and cluster system
CN111338580A (en) * 2020-02-29 2020-06-26 苏州浪潮智能科技有限公司 A method and device for optimizing disk performance
CN111338580B (en) * 2020-02-29 2021-12-21 苏州浪潮智能科技有限公司 Method and equipment for optimizing disk performance

Also Published As

Publication number Publication date
CN106446303B (en) 2020-01-14

Similar Documents

Publication Publication Date Title
US9053236B1 (en) Automated directory services test setup utility
WO2017161984A1 (en) Method, device and system for deploying data clusters, and computer storage medium
CN108897557B (en) Updating method and device of microservice architecture
CN108459951B (en) Test method and apparatus
CA3002807C (en) Techniques for determining client-side effects of server-side behavior using canary analysis
CN104216766B (en) The method and device that stream data is handled
CN106446303B (en) System and method for deploying large-scale cluster file system
CN102591658A (en) Method and device for processing message
CN107918558A (en) Business Process Control method, apparatus and equipment based on state machine
US20130013753A1 (en) Embedded Configuration Variance Detector
CN105933136B (en) A kind of resource regulating method and system
US9612935B2 (en) Enhanced resiliency testing by enabling state level control for request
CN112579247B (en) Method and device for determining task state
US20160335170A1 (en) Model checking device for distributed environment model, model checking method for distributed environment model, and medium
CN108241545A (en) System fault debugging method and device
CN104581200B (en) The method and apparatus of section transcoding
JP2018005768A (en) Job scheduler test program, job scheduler test method and parallel processor
CN104102583A (en) High-availability cluster software distributed automated testing framework
CN107623746B (en) A data processing method and system
US10417080B2 (en) Remote client screen shots monitoring system and method
CN109634769B (en) Fault-tolerant processing method, device, equipment and storage medium in data storage
CN101877874B (en) The transmission of performance data and output intent, system and equipment
CN105765908A (en) Method, client and system for multi-site automatic update
TW201821986A (en) Motherboard and its setting update method
KR101989222B1 (en) Method, apparatus and system for detecting structural variations

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20191204

Address after: 215100 No. 1 Guanpu Road, Guoxiang Street, Wuzhong Economic Development Zone, Suzhou City, Jiangsu Province

Applicant after: SUZHOU LANGCHAO INTELLIGENT TECHNOLOGY Co.,Ltd.

Address before: 450000 Henan province Zheng Dong New District of Zhengzhou City Xinyi Road No. 278 16 floor room 1601

Applicant before: ZHENGZHOU YUNHAI INFORMATION TECHNOLOGY Co.,Ltd.

GR01 Patent grant
CP03 Change of name, title or address

Address after: Building 9, No.1, guanpu Road, Guoxiang street, Wuzhong Economic Development Zone, Wuzhong District, Suzhou City, Jiangsu Province

Patentee after: Suzhou Yuannao Intelligent Technology Co.,Ltd.

Country or region after: China

Address before: Building 9, No.1, guanpu Road, Guoxiang street, Wuzhong Economic Development Zone, Wuzhong District, Suzhou City, Jiangsu Province

Patentee before: SUZHOU LANGCHAO INTELLIGENT TECHNOLOGY Co.,Ltd.

Country or region before: China

CP03 Change of name, title or address