CN111209107A - Multi-cluster operation method - Google Patents
- Publication number: CN111209107A
- Authority: CN (China)
- Prior art keywords: cluster, user, operates, administrator, operation method
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Storage Device Security (AREA)
Abstract
The invention discloses a multi-cluster operation method, comprising: adding an attribute to a user; and an administrator setting the value of the attribute to determine whether the user operates a single cluster, the currently logged-in cluster, or multiple clusters. This technical solution avoids exposing cluster information to users.
Description
Technical Field
The invention relates to the technical field of computer clusters, in particular to a multi-cluster operation method.
Background
SLURM is an open-source cluster job scheduling system with good fault tolerance and high scalability. Its key functions are: allocating computing resources to perform work tasks; providing a framework for starting, executing, and monitoring jobs on the assigned set of nodes; and arbitrating contention for resources. A cluster consists of all nodes managed by one slurmctld daemon.
SLURM can direct commands to other clusters instead of, or in addition to, the local cluster on which the command is invoked. With this behavior enabled, a user may submit jobs to one or more clusters and receive status from those remote clusters. Some client commands provide a "-M, --clusters=" option that accepts a comma-separated list of clusters to communicate with.
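As a rough illustration of the option described above, the following Python sketch builds an sbatch command line that names its target clusters explicitly. The helper name `build_sbatch_command`, the script name, and the cluster names are hypothetical; only the `--clusters=` option itself comes from SLURM.

```python
from typing import List, Sequence


def build_sbatch_command(script: str, clusters: Sequence[str] = ()) -> List[str]:
    """Return an sbatch argument list; cluster names are joined with commas."""
    argv = ["sbatch"]
    if clusters:
        # The user has to know and spell out every cluster name here.
        argv.append("--clusters=" + ",".join(clusters))
    argv.append(script)
    return argv


# Hypothetical script and cluster names, for illustration only.
print(" ".join(build_sbatch_command("job.sh", ["cluster_a", "cluster_b"])))
```

The sketch makes the limitation visible: the cluster names must appear in the command the user types.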
At present, a user must explicitly specify the cluster list with the "-M, --clusters" option, so the administrator must expose cluster information to the user, which fails to meet the administrator's need to control that information. SLURM currently provides no way to shield cluster information from the user.
Disclosure of Invention
In view of the above problems in the related art, the present invention provides a multi-cluster operation method, which can eliminate the need for explicitly specifying cluster names.
The technical scheme of the invention is realized as follows:
According to an aspect of the present invention, there is provided a multi-cluster operation method, including:
adding an attribute to the user;
an administrator setting the value of the attribute to determine whether the user operates a single cluster, the currently logged-in cluster, or multiple clusters.
According to an embodiment of the present invention, adding an attribute to a user comprises: adding a field to the user table in the database, the field being a list of cluster names operable by the user.
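A minimal sketch of such a schema change is shown below, using Python's built-in sqlite3 module as a stand-in for the database. The table name `user_table`, the column name `operable_clusters`, and the default value `all` are assumptions made for illustration; the patent does not give the actual schema.

```python
import sqlite3


def add_cluster_field(conn: sqlite3.Connection) -> None:
    """Add a column holding the comma-separated cluster names a user may operate."""
    # Hypothetical column name; 'all' is an assumed default meaning every cluster.
    conn.execute(
        "ALTER TABLE user_table ADD COLUMN operable_clusters TEXT NOT NULL DEFAULT 'all'"
    )


def set_operable_clusters(conn: sqlite3.Connection, user: str, value: str) -> None:
    """Administrator-side update of the attribute for one user."""
    conn.execute(
        "UPDATE user_table SET operable_clusters = ? WHERE name = ?", (value, user)
    )


conn = sqlite3.connect(":memory:")
# Hypothetical pre-existing user table.
conn.execute("CREATE TABLE user_table (name TEXT PRIMARY KEY, admin_level INTEGER)")
conn.executemany("INSERT INTO user_table VALUES (?, ?)", [("alice", 1), ("bob", 1)])

add_cluster_field(conn)
set_operable_clusters(conn, "alice", "A")
print(conn.execute("SELECT name, operable_clusters FROM user_table ORDER BY name").fetchall())
```

Existing rows pick up the default, so only users the administrator wants to restrict need an explicit value.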
According to an embodiment of the present invention, the administrator setting the value of the attribute includes: for a user having authority over only a first cluster, when the administrator sets the value of the field to the name of the first cluster, the user operates the first cluster whether the user logs in to the first cluster, for which the user has authority, or to a second cluster, for which the user has no authority.
According to an embodiment of the present invention, the administrator setting the value of the attribute includes: for a user having authority over both the first cluster and the second cluster, when the administrator sets the value of the field to the name of the first cluster, the user operates the first cluster whether the user logs in to the first cluster or to the second cluster.
According to an embodiment of the present invention, the administrator setting the value of the attribute includes: for a user having authority over both the first cluster and the second cluster, when the administrator sets the value of the field to the current cluster name, the user operates only the first cluster after logging in to the first cluster, and only the second cluster after logging in to the second cluster.
According to an embodiment of the present invention, the administrator setting the value of the attribute includes: for a user having authority over both the first cluster and the second cluster, when the administrator sets the value of the field to the first cluster name and the second cluster name, or to all cluster names, the user operates both the first cluster and the second cluster whether the user logs in to the first cluster or to the second cluster.
According to an embodiment of the present invention, the user operating a single cluster, the currently logged-in cluster, or multiple clusters comprises: submitting jobs to a single cluster, the currently logged-in cluster, or multiple clusters.
According to an embodiment of the present invention, when a job submission is executed and a multi-cluster operation is required, the user information and the cluster information in the database are queried in sequence and a cluster list is returned, so that a cluster can be selected from all available clusters for submitting the job.
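The two-step lookup described above might look like the following sketch, again with sqlite3 standing in for the database. The table and column names (`user_table`, `operable_clusters`, `cluster_table`, `control_host`, `control_port`) and the sample rows are hypothetical, and the special value for the currently logged-in cluster is left out for brevity.

```python
import sqlite3
from typing import List, Tuple

ClusterRow = Tuple[str, str, int]


def lookup_cluster_list(conn: sqlite3.Connection, user: str) -> List[ClusterRow]:
    """Query the user record first, then the cluster records it names."""
    # Step 1: read the user's list of operable cluster names.
    row = conn.execute(
        "SELECT operable_clusters FROM user_table WHERE name = ?", (user,)
    ).fetchone()
    if row is None:
        return []
    names = [n.strip() for n in (row[0] or "").split(",") if n.strip()]
    if not names:
        return []
    # Step 2: fetch the connection details of those clusters.
    if "all" in names:
        query = "SELECT name, control_host, control_port FROM cluster_table ORDER BY name"
        return conn.execute(query).fetchall()
    marks = ",".join("?" for _ in names)
    query = (
        "SELECT name, control_host, control_port FROM cluster_table "
        f"WHERE name IN ({marks}) ORDER BY name"
    )
    return conn.execute(query, names).fetchall()


def demo_database() -> sqlite3.Connection:
    """Build a small in-memory database with made-up users and clusters."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user_table (name TEXT PRIMARY KEY, operable_clusters TEXT)")
    conn.execute(
        "CREATE TABLE cluster_table "
        "(name TEXT PRIMARY KEY, control_host TEXT, control_port INTEGER)"
    )
    conn.executemany(
        "INSERT INTO user_table VALUES (?, ?)",
        [("alice", "A"), ("bob", "A,B"), ("carol", "all")],
    )
    conn.executemany(
        "INSERT INTO cluster_table VALUES (?, ?, ?)",
        [("A", "ctl-a", 6817), ("B", "ctl-b", 6817), ("C", "ctl-c", 6817)],
    )
    return conn


print(lookup_cluster_list(demo_database(), "bob"))
```

Because the lookup runs on the submission side, the user never has to type or even see the cluster names.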
The technical scheme of the invention provides a dynamically configurable multi-cluster operation method for SLURM. An attribute is added to the user, and the administrator sets its value to determine whether the user submits jobs to a single cluster, the currently logged-in cluster, or multiple clusters. The user therefore does not need to care about cluster information, which solves the prior-art problem of cluster information being exposed to the user. By default, the user may submit jobs to all clusters. In addition, the cluster administrator's control is strengthened, meeting the administrator's requirement to protect cluster information.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method of multi-cluster operation according to an embodiment of the invention;
FIG. 2 is a flow diagram of a batch submit job command according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present invention.
FIG. 1 is a flow chart of a method of multi-cluster operation according to an embodiment of the invention. As shown in fig. 1, the multi-cluster operation method of the embodiment of the present invention may include the following steps:
S11, adding an attribute to the user;
S12, the administrator setting the value of the attribute to determine whether the user operates a single cluster, the currently logged-in cluster, or multiple clusters.
According to this technical scheme, by adding the attribute and setting its value for a user, the administrator determines whether the user operates a single cluster, the currently logged-in cluster, or multiple clusters. In contrast to the prior art, the user therefore does not need to specify cluster names explicitly.
Specifically, a feandmulticluster field may be added to the user table in the database. The field holds the list of cluster names that the user can operate, and the specific settings include:
(1) For a user having authority over only one cluster A (the first cluster), the administrator sets feandmulticluster=A.
A user who logs in to cluster A, for which the user has authority, can operate cluster A;
a user who logs in to cluster B (the second cluster), for which the user has no authority, can operate cluster A.
(2) For a user having authority over both clusters A and B, the administrator sets feandmulticluster=A.
A user who logs in to cluster A, for which the user has authority, can operate cluster A;
a user who logs in to cluster B, for which the user has authority, can operate cluster A.
(3) For a user having authority over both clusters A and B, the administrator sets feandmulticluster=current.
A user who logs in to cluster A, for which the user has authority, can operate only cluster A;
a user who logs in to cluster B, for which the user has authority, can operate only cluster B.
(4) For a user having authority over both clusters A and B, the administrator sets either feandmulticluster=A,B or feandmulticluster=all.
A user who logs in to cluster A, for which the user has authority, can operate clusters A and B;
a user who logs in to cluster B, for which the user has authority, can operate clusters A and B.
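The four settings above can be summarized as a small resolution function. This is only a sketch: the function name, the literal spellings `current` and `all`, and the treatment of an unset value as "all clusters" (following the statement that a user may submit jobs to all clusters by default) are assumptions.

```python
from typing import List, Optional, Sequence


def resolve_operable_clusters(
    field_value: Optional[str],
    login_cluster: str,
    all_clusters: Sequence[str],
) -> List[str]:
    """Map the administrator-set field value to the clusters a user operates."""
    value = (field_value or "").strip()
    # Unset or "all": every cluster (assumed default behaviour).
    if value == "" or value.lower() == "all":
        return list(all_clusters)
    # "current": only the cluster the user is logged in to (case 3).
    if value.lower() == "current":
        return [login_cluster]
    # Otherwise an explicit comma-separated list, independent of the login cluster
    # (cases 1, 2 and 4).
    return [name.strip() for name in value.split(",") if name.strip()]


clusters = ["A", "B"]
print(resolve_operable_clusters("A", "B", clusters))        # cases (1) and (2)
print(resolve_operable_clusters("current", "B", clusters))  # case (3)
print(resolve_operable_clusters("A,B", "A", clusters))      # case (4)
```

Only the `current` setting depends on where the user logged in; every other setting yields the same cluster list from any login cluster.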
In one embodiment, since SLURM has many multi-cluster operation commands, the principle of the dynamic configuration is described using the batch job submission command sbatch as an example. The sbatch processing flow is shown in FIG. 2 and includes:
1) the sbatch command starts executing; the configuration file (slurm.conf) is parsed first and some key parameters are stored;
2) parameters passed in from sources such as the job script, environment variables, and the command line are parsed and stored, and the "-M, --clusters" option is handled;
3) a job structure is filled in according to the parameters obtained in the two steps above; the structure contains all information necessary for executing one job;
4) whether to execute a multi-cluster operation is determined according to opt. If there are multiple clusters, slurmdb_get_first_avail_cluster is executed. This function interacts with the database daemon slurmdbd and executes job_will_run, slurm_job_will_run2, and job_will_run_cluster in turn in order to select one suitable cluster from all available clusters for submitting the job. Inside the function, slurmdb_get_info_cluster is called, which queries the user and cluster information in the MySQL database in sequence and returns the appropriate cluster list;
5) if there are not multiple clusters, slurm_submit_batch_jobs is executed;
6) both step 4) and step 5) call slurm_send_recv_controller_msg, which packs the job information and sends it to slurmctld, the management-node daemon of the cluster configured by the user, where the job waits for scheduling and execution.
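The branch in steps 4) to 6) can be sketched in Python as follows. This is not SLURM's C implementation: the function names, the callback-based interface, and the use of the earliest estimated start time as the selection criterion are assumptions, since the text above says only that one suitable cluster is selected.

```python
from typing import Callable, Dict, List, Optional


def choose_cluster(
    candidates: List[str],
    estimate_start: Callable[[str], Optional[float]],
) -> Optional[str]:
    """Pick the candidate with the earliest estimated start; None if none can run the job."""
    best: Optional[str] = None
    best_start: Optional[float] = None
    for name in candidates:
        start = estimate_start(name)  # None means the job cannot run on this cluster
        if start is None:
            continue
        if best_start is None or start < best_start:
            best, best_start = name, start
    return best


def submit_batch(
    job: Dict[str, str],
    clusters: List[str],
    local_cluster: str,
    estimate_start: Callable[[str], Optional[float]],
    send: Callable[[str, Dict[str, str]], None],
) -> str:
    """Choose a target cluster (multi-cluster or local) and hand the job to its controller."""
    if clusters:
        # Step 4: multi-cluster path, select one suitable cluster.
        target = choose_cluster(clusters, estimate_start)
        if target is None:
            raise RuntimeError("no available cluster can run the job")
    else:
        # Step 5: ordinary submission to the local cluster.
        target = local_cluster
    # Step 6: both paths send the packed job to the chosen cluster's controller.
    send(target, job)
    return target


# Made-up estimates: seconds until the job could start, None if it cannot run.
estimates = {"A": 120.0, "B": 30.0, "C": None}
sent = []
chosen = submit_batch(
    {"script": "job.sh"}, ["A", "B", "C"], "A",
    estimates.get, lambda cluster, job: sent.append((cluster, job)),
)
print(chosen, sent)
```

Passing the estimator and the sender as callbacks keeps the sketch runnable without a scheduler; in a real deployment those roles are played by the will-run queries and the controller message described in the steps above.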
It should be understood that other commands related to multi-cluster operations may be processed similarly to fig. 2.
In summary, the technical solution of the present invention provides a dynamically configurable multi-cluster operation method for SLURM. An attribute is added to the user, and the administrator sets its value to determine whether the user submits jobs to a single cluster, the currently logged-in cluster, or multiple clusters. The user therefore does not need to care about cluster information, which solves the prior-art problem of cluster information being exposed to the user. By default, the user may submit jobs to all clusters. In addition, the cluster administrator's control is strengthened, meeting the administrator's requirement to protect cluster information.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911362939.1A CN111209107A (en) | 2019-12-26 | 2019-12-26 | Multi-cluster operation method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911362939.1A CN111209107A (en) | 2019-12-26 | 2019-12-26 | Multi-cluster operation method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN111209107A (en) | 2020-05-29 |
Family
ID=70782533
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201911362939.1A Pending CN111209107A (en) | 2019-12-26 | 2019-12-26 | Multi-cluster operation method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111209107A (en) |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101645022A (en) * | 2009-08-28 | 2010-02-10 | 曙光信息产业(北京)有限公司 | Work scheduling management system and method for a plurality of colonies |
| CN103294485A (en) * | 2013-06-27 | 2013-09-11 | 曙光信息产业(北京)有限公司 | Web service packaging method and Web service packaging system both used for ABINIT parallel computing system |
| CN105183820A (en) * | 2015-08-28 | 2015-12-23 | 广东创我科技发展有限公司 | Multi-tenant supported large data platform and tenant access method |
| CN106165367A (en) * | 2014-12-31 | 2016-11-23 | 华为技术有限公司 | A kind of access control method, storage device and control system storing device |
| CN107895113A (en) * | 2017-12-06 | 2018-04-10 | 北京搜狐新媒体信息技术有限公司 | A kind of fine-grained data authority control method and system for supporting the more clusters of hadoop |
| US20190089812A1 (en) * | 2016-03-31 | 2019-03-21 | Alibaba Group Holding Limited | Routing method and device |
| CN109740373A (en) * | 2018-12-19 | 2019-05-10 | 福建新大陆软件工程有限公司 | A kind of Hadoop cluster management method, system and platform |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200529 |