CN111817870B

CN111817870B - Method, controller device, and storage medium for managing multiple network devices

Info

Publication number: CN111817870B
Application number: CN201910909817.3A
Authority: CN
Inventors: 钱德拉塞克哈尔·A; 贾扬斯·R; 哈维尔·安蒂克
Original assignee: Juniper Networks Inc
Current assignee: Juniper Networks Inc
Priority date: 2019-04-10
Filing date: 2019-09-25
Publication date: 2021-12-21
Anticipated expiration: 2039-09-25
Also published as: US11640291B2; CN114095331B; US10884728B2; CN111817870A; US20230214208A1; CN114095331A; US11922162B2; EP3722944A1; US20200326924A1; US20210124570A1

Abstract

The present disclosure relates to a method, a controller device, and a storage medium for managing a plurality of network devices. A controller device manages a plurality of network devices. The controller device includes one or more processing units configured to: receive an upgrade request; determine an upgrade graph having nodes, each node representing a network device or a network service provided by the network, the upgrade graph also having one or more edges, each edge connecting two nodes and representing a network redundancy or a service dependency; calculate and assign an edge weight for each edge by combining the results of at least one objective function, each of the at least one objective function having a minimum objective or a maximum objective for the network; partition the upgrade graph into a plurality of subgroups based on the edge weights; determine an upgrade schedule; and upgrade the software of each of the plurality of network devices according to the upgrade schedule.

Description

Method for managing a plurality of network devices, controller device and storage medium

Cross-Reference to Related Applications

The present application claims the benefit of EP application No. 19382267.3, filed on April 10, 2019, the entire contents of which are incorporated herein by reference.

Technical Field

The present disclosure relates to computer networks and, more particularly, to management of network devices.

Background

Network devices typically include mechanisms (such as management interfaces) for configuring the device locally or remotely. By interacting with the management interface, the customer is able to perform configuration tasks and execute operational commands to collect and view operational data of the managed device. For example, a customer may configure an interface card of a device, adjust parameters of supported network protocols, specify physical components within the device, modify routing information maintained by a router, access software modules and other resources hosted on the device, and perform other configuration tasks. In addition, the client may allow the user to view current operating parameters, system logs, information related to network connections, network activity, or other status information from the device, and view and react to event information received from the device.

The network configuration service may be performed by a number of distinct devices, such as routers and/or dedicated service devices with service cards. The services include connectivity services such as layer three virtual private network (L3VPN), virtual private local area network service (VPLS), and peer-to-peer (P2P) services. Other services include network configuration services such as Dot1q VLAN services. Network Management Systems (NMSs) and NMS devices, also referred to as controllers or controller devices, may support these services so that administrators can easily create and manage these high-level network configuration services.

In particular, user configuration of devices may be referred to as "intents." An intent-based networking system lets administrators describe the intended network/compute/storage state. User intents can be categorized as business policies or stateless intents. Business policies, or stateful intents, may be resolved based on the current state of a network. Stateless intents may be fully declarative ways of describing an intended network/compute/storage state, without regard to the current network state.

Intents may be represented as intent data models, which may be modeled using unified graphs. Intent data models may be represented as connected graphs, so that business policies can be implemented across intent data models. For example, data models may be represented using connected graphs having vertices connected with has-edges and reference (ref) edges. Controller devices may model intent data models as unified graphs, so that the intent models can be represented as connected. In this manner, business policies can be implemented across intent data models. When intents are modeled using a unified graph model, extending support for a new intent requires extending the graph model and the compilation logic.

To configure devices to perform the intents, a user (such as an administrator) may write translation programs that translate high-level configuration instructions (e.g., instructions according to an intent data model, which may be expressed as a unified graph model) into low-level configuration instructions (e.g., instructions according to a device configuration model). As part of configuration service support, the user/administrator may provide the intent data model and a mapping between the intent data model and the device configuration model.

To simplify the mapping definition for the user, controller devices may be designed to provide the capability to define the mappings in a simple way. For example, some controller devices provide the use of Velocity templates and/or Extensible Stylesheet Language Transformations (XSLT). Such translators contain the translation or mapping logic from the intent data model to the low-level device configuration model. Typically, a relatively small number of changes in the intent data model affect a relatively large number of properties across device configurations. Different translators may be used when services are created, updated, and deleted from the intent data model.

Disclosure of Invention

In general, this disclosure describes techniques for upgrading network device software according to an intent-based upgrade framework. A network management system (NMS) device, also referred to herein as a controller device, may configure network devices using low-level (i.e., device-level) configuration data expressed, for example, in the Yet Another Next Generation (YANG) data modeling language. The controller device may also manage the network devices based on the configuration data of the network devices. According to the techniques of this disclosure, the controller device is configured to receive an upgrade request for upgrading the software of the network devices. In response to the upgrade request, the controller device may determine the objectives and constraints, at least some of which may be specified in the upgrade request, and then generate a device upgrade schedule for upgrading the network devices that attempts to satisfy the objectives and constraints.

For example, the controller device may determine a set of relationships among the network devices and generate a unified graph model representing the relationships. Based on the objectives and constraints, the controller device may obtain from the unified graph model a set of devices to model in a multi-objective upgrade graph. The controller device then computes an optimization algorithm on the multi-objective upgrade graph to produce a device upgrade schedule that attempts to satisfy the objectives and constraints, at least some of which are specified in the upgrade request.

In some aspects, to specify path and device redundancy in support of constraints on service and device availability, the techniques include an enhanced device model for specifying one or more redundant devices for a modeled device and an enhanced service model for specifying one or more redundant paths for a modeled service. When computing the optimization algorithm on the multi-objective upgrade graph, the controller device may use the redundancy information in the enhanced device model and the enhanced service model to schedule, at different times, upgrades for devices that have a redundant device or service relationship.

The techniques of this disclosure may provide one or more technical advantages that provide at least one practical application. For example, an intent-based upgrade request may permit an administrator to express an intent that a selected set of devices be upgraded without requiring the administrator to schedule the device upgrades, while also giving the administrator the ability to express objectives and constraints on the upgrade process for the devices. Further, the techniques may provide an extensible, programmable infrastructure for defining new criteria for the upgrade process.

In one embodiment, a method comprises: receiving, by a controller device that manages a plurality of network devices of a network that provides one or more services, an upgrade request; determining, by the controller device based on the upgrade request, an upgrade graph having nodes, each node representing one of the network devices or a network service provided by the network, the upgrade graph further having one or more edges, each edge connecting two nodes and representing a network redundancy or a service dependency; calculating and assigning, by the controller device, an edge weight for each of the edges by combining results of at least one objective function, each of the at least one objective function having a minimum objective or a maximum objective for the network; partitioning, by the controller device, the upgrade graph into a plurality of subgroups based on the edge weights; determining, by the controller device, an upgrade schedule based on the subgroups; and upgrading, by the controller device, the software of each of the plurality of network devices according to the upgrade schedule.

In another embodiment, a controller device that manages a plurality of network devices includes one or more processing units implemented in circuitry and configured to: receive an upgrade request; determine, based on the upgrade request, an upgrade graph having nodes, each node representing one of the network devices or a network service provided by the network, the upgrade graph further having one or more edges, each edge connecting two nodes and representing a network redundancy or a service dependency; calculate and assign an edge weight for each of the edges by combining the results of at least one objective function, each of the at least one objective function having a minimum objective or a maximum objective for the network; partition the upgrade graph into a plurality of subgroups based on the edge weights; determine an upgrade schedule based on the subgroups; and upgrade the software of each of the plurality of network devices according to the upgrade schedule.

In another embodiment, a computer-readable storage medium has instructions stored thereon that, when executed, cause a processor of a controller device that manages a plurality of network devices to: receive an upgrade request; determine, based on the upgrade request, an upgrade graph having nodes, each node representing one of the network devices or a network service provided by the network, the upgrade graph further having one or more edges, each edge connecting two nodes and representing a network redundancy or a service dependency; calculate and assign an edge weight for each of the edges by combining the results of at least one objective function, each of the at least one objective function having a minimum objective or a maximum objective for the network; partition the upgrade graph into a plurality of subgroups based on the edge weights; determine an upgrade schedule based on the subgroups; and upgrade the software of each of the plurality of network devices according to the upgrade schedule.
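The steps recited in the embodiments above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the objective functions, the use of a plain sum to combine their results, the threshold, and the greedy partitioning are all assumptions chosen for brevity, and every name below is hypothetical.

```python
from collections import defaultdict

# Hypothetical objective functions: each scores an edge (u, v, kind).
def redundancy_objective(u, v, kind):
    # Assumed availability objective: redundant peers should be upgraded apart.
    return 1.0 if kind == "redundancy" else 0.0

def dependency_objective(u, v, kind):
    # Assumed lower score for a plain service dependency.
    return 0.5 if kind == "dependency" else 0.0

def edge_weight(edge, objectives):
    """Combine objective results for one edge (a plain sum is an assumption)."""
    return sum(f(*edge) for f in objectives)

def partition(nodes, edges, objectives, threshold=1.0):
    """Greedily place endpoints of heavy edges in different subgroups."""
    conflicts = defaultdict(set)
    for edge in edges:
        u, v, _ = edge
        if edge_weight(edge, objectives) >= threshold:
            conflicts[u].add(v)
            conflicts[v].add(u)
    group_of = {}
    for n in nodes:
        used = {group_of[m] for m in conflicts[n] if m in group_of}
        g = 0
        while g in used:  # first subgroup index not used by a conflicting node
            g += 1
        group_of[n] = g
    groups = defaultdict(list)
    for n in nodes:
        groups[group_of[n]].append(n)
    return [groups[g] for g in sorted(groups)]

def upgrade_schedule(subgroups):
    """Assign each subgroup to a consecutive upgrade slot."""
    return list(enumerate(subgroups))

nodes = ["r1", "r2", "sw1"]
edges = [("r1", "r2", "redundancy"), ("sw1", "r1", "dependency")]
objectives = [redundancy_objective, dependency_objective]
subgroups = partition(nodes, edges, objectives)
schedule = upgrade_schedule(subgroups)
```

In this sketch the redundant pair r1 and r2 lands in different subgroups, so the two devices are upgraded in different slots.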

The details of one or more embodiments are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.

Drawings

Fig. 1 is a block diagram illustrating an embodiment including elements of an enterprise network that are upgraded using a management device.

Fig. 2 is a block diagram illustrating an exemplary set of components of the management device in fig. 1.

Fig. 3 is a conceptual diagram illustrating an exemplary unified graph model for an intent data model.

Fig. 4 is a conceptual diagram illustrating an example model of components of a controller device, such as the controller devices in fig. 1 and 2, in accordance with the techniques of this disclosure.

Fig. 5 is a flow chart illustrating an exemplary method performed by a controller device for upgrading network device software.

Fig. 6 is a flow diagram illustrating an example method for upgrading network device software in accordance with the techniques of this disclosure.

Like reference symbols refer to like elements throughout the drawings and description.

Detailed Description

Fig. 1 is a block diagram illustrating an embodiment including elements of an enterprise network 2 that are managed using a controller device 10. Managed elements 14A-14G (collectively, "elements 14") of the enterprise network 2 include network devices interconnected via communication links to form a communication topology for exchanging resources and information. For example, elements 14 (also commonly referred to as network devices or remote network devices) may include routers, switches, gateways, bridges, hubs, servers, firewalls or other intrusion detection systems (IDS) or intrusion prevention systems (IDP), computing devices, computing terminals, printers, other network devices, or a combination of such devices. Although described in this disclosure as sending, transmitting, or otherwise supporting packets, enterprise network 2 may send data according to any other discrete data unit defined by any other protocol, such as a cell defined by the Asynchronous Transfer Mode (ATM) protocol or a datagram defined by the User Datagram Protocol (UDP). The communication links interconnecting elements 14 may be physical links (e.g., optical, copper, etc.), wireless links, or any combination thereof.

An enterprise network 2 is shown coupled to a public network 18 (e.g., the internet) via a communication link. For example, public network 18 may include one or more client computing devices. Public network 18 may provide access to network servers, application servers, public databases, media servers, end-user devices, and other types of network resource devices and content.

The controller device 10 is communicatively coupled to the elements 14 via the enterprise network 2. In some embodiments, the controller device 10 forms part of a device management system, although only one device of the device management system is shown in fig. 1 for purposes of the embodiment. The controller device 10 may be directly or indirectly coupled to the various elements 14. Once the elements 14 are deployed and activated, the administrator 12 uses the controller device 10 (or a plurality of such management devices) to manage the network devices using a device management protocol. One exemplary device management protocol is the Simple Network Management Protocol (SNMP), which allows the controller device 10 to traverse and modify management information bases (MIBs) that store configuration data within each of the managed elements 14. Further details of the SNMP protocol can be found in Harrington et al., RFC 3411, "An Architecture for Describing Simple Network Management Protocol (SNMP) Management Frameworks," Network Working Group, Internet Engineering Task Force, December 2002, available at http://tools.ietf.org/html/rfc3411, which is incorporated herein by reference in its entirety.

In general practice, the controller device 10, also referred to as a Network Management System (NMS) or NMS device, and the elements 14 are centrally maintained by an IT group of an enterprise. An administrator 12 interacts with the controller device 10 to remotely monitor and configure the elements 14. For example, administrator 12 may receive alerts from controller device 10 regarding any of elements 14, view configuration data for elements 14, modify configuration data for elements 14, add new network devices to enterprise network 2, remove existing network devices from enterprise network 2, or otherwise operate enterprise network 2 and the network devices therein. Although described with reference to enterprise networks, the techniques of this disclosure are applicable to other network types, public and private, including LANs, VLANs, VPNs, and the like.

In some embodiments, administrator 12 uses controller device 10 or a local workstation to interact directly with elements 14, e.g., through telnet, Secure Shell (SSH), or other such communication sessions. That is, elements 14 generally provide interfaces for direct interaction, such as command line interfaces (CLIs), web-based interfaces, graphical user interfaces (GUIs), or the like, through which a user can interact with the devices to issue text-based commands directly. For example, these interfaces typically allow a user to interact directly with the device, e.g., through a telnet, Secure Shell (SSH), Hypertext Transfer Protocol (HTTP), or other network session, to enter text according to a defined syntax in order to submit commands to the managed element. In some embodiments, a user initiates an SSH session 15 with one of the elements 14 (e.g., element 14F) using the controller device 10 to directly configure element 14F. In this manner, the user is able to provide commands to the elements 14 in a format for direct execution.

Further, the administrator 12 can also create scripts that are submitted by the controller device 10 to any or all of the elements 14. For example, in addition to a CLI interface, the elements 14 provide interfaces for receiving scripts that specify commands according to a scripting language. In a specific sense, the scripts may be output by the controller device 10 to automatically invoke corresponding remote procedure calls (RPCs) on the managed elements 14. The scripts may conform to, for example, Extensible Markup Language (XML) or another data description language.

The administrator 12 uses the controller device 10 to configure the elements 14 to specify particular operating characteristics that further the goals of the administrator 12. For example, administrator 12 may specify particular operating policies for an element 14 with respect to security, device accessibility, traffic engineering, quality of service (QoS), network address translation (NAT), packet filtering, packet forwarding, rate limiting, or other policies. The controller device 10 performs the configuration using one or more network management protocols designed for management of configuration data within the managed network elements 14, such as the SNMP protocol or the Network Configuration Protocol (NETCONF) or a derivative thereof, such as the Juniper device management interface. In general, NETCONF provides mechanisms for configuring network devices and uses Extensible Markup Language (XML)-based data encoding for configuration data, which may include policy data. NETCONF is described in Enns, "NETCONF Configuration Protocol," Network Working Group, RFC 4741, December 2006, available at tools.ietf.org/html/rfc4741. The controller device 10 may establish NETCONF sessions with one or more of the elements 14.

The controller device 10 may be configured to compare a new intent data model to an existing (or old) intent data model, determine the differences between the new and existing intent data models, and apply reactive mappers to those differences. In particular, the controller device 10 determines whether the new set of configuration data includes any additional configuration parameters relative to the old intent data model, and whether the new set of configuration data modifies or omits any configuration parameters that were included in the old intent data model.
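The comparison described above can be sketched in Python as follows. This is only an illustration under a strong simplifying assumption: each intent data model is flattened into a dictionary of parameter names and values, whereas the disclosure describes graph-based models. The function name and the sample keys are hypothetical.

```python
def diff_intent(old, new):
    """Return parameters added, modified, and omitted in `new` relative to `old`."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    omitted = {k: old[k] for k in old.keys() - new.keys()}
    modified = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return added, modified, omitted

# Hypothetical flattened intent models.
old_model = {"vpn.name": "a", "vpn.mtu": 1500, "vpn.peer": "r1"}
new_model = {"vpn.name": "a", "vpn.mtu": 9000, "vpn.qos": "gold"}
added, modified, omitted = diff_intent(old_model, new_model)
```

Only the three resulting sets of differences would then need to be handed to the reactive mappers, rather than the whole model.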

The intent data model may be a unified graph model, while the low-level configuration data may be expressed in YANG, which is described in Bjorklund, "YANG - A Data Modeling Language for the Network Configuration Protocol (NETCONF)," Internet Engineering Task Force, RFC 6020, October 2010, available at tools.ietf.org. In some embodiments, the intent data model may be expressed in YAML Ain't Markup Language (YAML). The controller device 10 may include various reactive mappers for translating the intent data model differences. These functions are configured to accept the intent data model (which may be expressed as structured input parameters, e.g., according to YANG or YAML). The functions are also configured to output respective sets of low-level device configuration data changes (e.g., device configuration additions and deletions). That is, y₁ = f₁(x), y₂ = f₂(x), ..., y_N = f_N(x).
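The relationship between the mapper functions and their outputs can be sketched as follows. The two mappers, the keys of the input x, and the path-like output strings are hypothetical placeholders, not real device configuration syntax; the sketch only shows N functions applied to the same input, each yielding its own set of additions and deletions.

```python
def map_vlans(x):
    """f_1: hypothetical mapper for VLAN-related intent changes."""
    return {
        "add": [f"vlans/{v}" for v in x.get("vlans_added", [])],
        "delete": [f"vlans/{v}" for v in x.get("vlans_removed", [])],
    }

def map_interfaces(x):
    """f_2: hypothetical mapper for interface-related intent changes."""
    return {
        "add": [f"interfaces/{i}" for i in x.get("ifaces_added", [])],
        "delete": [f"interfaces/{i}" for i in x.get("ifaces_removed", [])],
    }

MAPPERS = [map_vlans, map_interfaces]

def apply_mappers(x, mappers=MAPPERS):
    """Compute y_i = f_i(x) for every mapper f_i."""
    return [f(x) for f in mappers]

x = {"vlans_added": [10], "ifaces_removed": ["ge-0/0/1"]}
ys = apply_mappers(x)
```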

The controller device 10 may use YANG modeling for the intent data model and the low-level device configuration model. This data may contain relationships across YANG entities, such as list items and containers. Conventionally, controller devices do not support configuration management functions in real time. As discussed in more detail below, the controller device 10 can convert a YANG data model into a database model and convert YANG validations into data validations. Techniques for managing network devices using a graph model of high-level configuration data are described in U.S. patent application No. 15/462,465, "CONFIGURING AND MANAGING NETWORK DEVICES USING PROGRAM OVERLAY ON YANG-BASED GRAPH DATABASE," filed on March 17, 2017, which is incorporated herein by reference in its entirety.

Controller device 10 may receive data from an administrator 12 representing any or all of create, update, and/or delete actions with respect to the unified intent data model. The controller device 10 may be configured to use the same compilation logic for each of the create, update, and delete actions as applied to the graph model.

Generally, controllers like the controller device 10 use a hierarchical data model for intents, low-level data models, and resources. The hierarchical data model may be based on YANG or YAML. As discussed above, the hierarchical data model can be represented as a graph. Modern systems support intents to ease the management of networks. Intents are declarative. To realize an intent, the controller device 10 attempts to select optimal resources.

Typically, the customer environment is configured to allow the customer (e.g., administrator 12) to control intent realization and to assure programmed intents. The techniques of this disclosure support customer requirements for service level agreements (SLAs) in near real time. In this manner, customer business will not be negatively affected by intent realization. If the resources for a stateful intent become degraded (e.g., unreachable, highly utilized, or otherwise problematic on the corresponding device), the controller device 10 may select appropriate resources to generate the desired configuration in near real time.

The controller device 10 can support SLAs in near real time. For example, the controller device 10 may support concurrent intent provisioning. The controller device 10 may use an enhanced resource matching filter to include and exclude certain system resources. The controller device 10 may further maintain the network in a consistent state while managing business SLAs. The controller device 10 may also support concurrent stateless intent provisioning. That is, the controller device 10 may support concurrent intent updates without invalidating other intent changes. The controller device 10 may also support the current version of the intent graph until pending changes have been deployed.

U.S. application No. 16/125,245, entitled "DYNAMIC INTENT ASSURANCE AND PROGRAMMABILITY IN COMPUTER NETWORKS," filed on September 7, 2018 (and incorporated herein by reference in its entirety), describes resource filtering query semantics as follows:

From this, the controller device 10 can derive decision variables, objectives, and constraints. The controller device 10 may enhance the query to support excluded resources and included resources, for example, as shown below:

The excluded list may become a constraint for the resource selection optimization algorithm, and the included list may likewise become a constraint for the resource selection optimization algorithm. In the above embodiment, the constraints include restrictions on resources defined as "not in {d1, d2}" and "in {d3}". See also U.S. patent application No. 16/370,189, filed on March 29, 2019, the entire contents of which are incorporated herein by reference.
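The way an excluded list and an included list act as constraints on resource selection can be sketched in a few lines of Python. This is an assumption-laden illustration, not the query semantics of the referenced applications; the function name and parameters are hypothetical.

```python
def filter_resources(candidates, excluded=frozenset(), included=None):
    """Keep candidates that are not excluded and, if an included list is
    given, that also appear in it."""
    return [
        r for r in candidates
        if r not in excluded and (included is None or r in included)
    ]

candidates = ["d1", "d2", "d3", "d4"]
# "not in {d1, d2}" and "in {d3}"
both = filter_resources(candidates, excluded={"d1", "d2"}, included={"d3"})
# "not in {d1, d2}" only
only_excluded = filter_resources(candidates, excluded={"d1", "d2"})
```

An optimization step would then choose among the surviving candidates only.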

In some embodiments according to the techniques of this disclosure, the controller device 10 may receive an upgrade request 11 from, for example, an administrator or other user. For example, the upgrade request 11 may include a command indicating an intent to upgrade the software of the elements 14 from software release "A" to a new or updated software release "B". The upgrade request 11 may also include a schedule upgrade query describing one or more objectives or constraints to which the upgrade process must adhere. The software of the elements 14 may include an image.

In response to the upgrade request 11, the controller device 10 may identify the objectives and constraints, at least some of which may be specified in the upgrade request 11, and then generate a device upgrade schedule for upgrading the elements 14 that attempts to satisfy the objectives and constraints. The controller device 10 may store objectives and/or constraints that apply to select elements 14.

For example, the controller device 10 may determine a set of relationships among the elements 14 and generate a multi-objective upgrade graph to represent the relationships. The controller device 10 may compute an optimization algorithm on the multi-objective upgrade graph to generate a device upgrade schedule that attempts to satisfy the objectives and constraints, at least some of which are specified in the upgrade request 11.

Fig. 2 is a block diagram illustrating an exemplary set of components of the controller device 10 in fig. 1. In this embodiment, the controller device 10 includes a control unit 22, a network interface 34, and a user interface 36. Network interface 34 represents an exemplary interface capable of communicatively coupling controller device 10 to an external device (e.g., one of elements 14 in fig. 1). Network interface 34 may represent a wireless and/or wired interface, e.g., an ethernet interface or radio configured to communicate in accordance with a wireless standard such as one or more of IEEE 802.11 wireless networking protocols (such as 802.11a/b/g/n or other such wireless protocols). Although only one network interface is shown for purposes of the embodiments, in various embodiments, the controller device 10 may include multiple network interfaces.

The control unit 22 represents any combination of hardware, software, and/or firmware for implementing the functions attributed to the control unit 22 and its constituent modules and elements. When the control unit 22 includes software or firmware, the control unit 22 also includes any necessary hardware for storing and executing the software or firmware, such as one or more processors or processing units. In general, the processing unit may include one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), or any other equivalent integrated or discrete logic circuitry, as well as any combinations of such components. Further, the processing units are typically implemented using fixed and/or programmable logic circuits.

User interface 36 represents one or more interfaces through which a user, such as administrator 12 (fig. 1), interacts with controller device 10, for example, to provide input and receive output. For example, the user interface 36 may represent one or more of a monitor, keyboard, mouse, touch screen, touch pad, track pad, speaker, camera, microphone, and the like. Further, although the controller device 10 includes a user interface in this embodiment, the administrator 12 need not interact directly with the controller device 10, but may instead access the controller device 10 remotely, for example, via the network interface 34.

In this embodiment, the control unit 22 includes a user interface module 38, a network interface module 32, and a management module 24. The control unit 22 executes a user interface module 38 to receive input from the user interface 36 and/or to provide output to the user interface 36. The control unit 22 also executes a network interface module 32 to transmit and receive data (e.g., packets) via a network interface 34. The user interface module 38, the network interface module 32, and the management module 24 may again be implemented as respective hardware units, or software or firmware, or a combination thereof.

The functions of the control unit 22 may be implemented as one or more processing units in fixed or programmable digital logic circuits. The digital logic circuitry may comprise one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), or any other equivalent integrated or discrete logic circuitry, as well as any combinations of such components. When implemented as programmable logic circuits, the control unit 22 may also include one or more computer-readable storage media that store hardware or firmware instructions for execution by the processing unit of the control unit 22.

Control unit 22 executes management module 24 to manage various network devices, such as element 14 in fig. 1. For example, management of network devices includes configuring network devices according to instructions received from a user (e.g., administrator 12 in fig. 1) and providing the user with the ability to submit instructions to configure the network devices. Management of the network device also includes upgrading the network device with updated software, such as an updated software image. In this embodiment, management module 24 also includes a configuration module 26 and a translation module 28.

Management module 24 is configured to receive intent unified-graph-modeled configuration data for a set of managed network devices from a user, such as administrator 12. Such intent unified-graph-modeled configuration data may be referred to as an "intent data model." Over time, the user may update the configuration data, for example, to add new services, remove existing services, or modify existing services performed by the managed devices. The unified intent data model can be structured according to, for example, YANG or YAML. The graph model may include a plurality of vertices connected by edges in a hierarchical manner. In YANG, the edges of the graph model are represented by "leafref" elements. In the case of YAML, such edges can be represented by "ref" edges. Likewise, parent-child vertex relationships can be represented by "has" edges. For example, a vertex for element A that refers to a vertex for element B using a has-edge can be understood to mean "element A has element B." In some embodiments, management module 24 also provides the user with the ability to submit reactive mappers that translation module 28 executes to convert the intent data model into device-specific, low-level configuration instructions.
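The vertex-and-edge structure described above, with parent-child edges and reference edges, can be sketched as a small Python data structure. The class, its method names, and the edge labels "has" and "ref" are illustrative assumptions; the disclosure does not prescribe a particular in-memory representation.

```python
from dataclasses import dataclass, field

@dataclass
class IntentGraph:
    """Toy intent graph: vertices plus labeled, directed edges."""
    vertices: set = field(default_factory=set)
    edges: set = field(default_factory=set)  # (source, kind, target)

    def add_edge(self, source, kind, target):
        # Only parent-child ("has") and reference ("ref") edges are modeled here.
        if kind not in ("has", "ref"):
            raise ValueError(f"unknown edge kind: {kind}")
        self.vertices.update((source, target))
        self.edges.add((source, kind, target))

    def children(self, vertex):
        """Vertices owned by `vertex` through parent-child edges."""
        return sorted(t for s, k, t in self.edges if s == vertex and k == "has")

    def references(self, vertex):
        """Vertices that `vertex` points at through reference edges."""
        return sorted(t for s, k, t in self.edges if s == vertex and k == "ref")

g = IntentGraph()
g.add_edge("A", "has", "B")  # "element A has element B"
g.add_edge("A", "ref", "C")  # element A refers to element C
```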

The controller device 10 further comprises a configuration database 40. Configuration database 40 typically includes information describing managed network devices, such as elements 14. The configuration database 40 may serve as an intent data store that may be used to hold and manage a collection of intent data models. For example, configuration database 40 may include information indicating device identification (such as MAC and/or IP address), device type, device vendor, device class (e.g., router, switch, bridge, hub, etc.), and so forth. Configuration database 40 also stores current configuration information (e.g., an intent data model, or in some cases, an intent data model with low-level configuration information) for the managed device (e.g., element 14). In accordance with techniques of this disclosure, configuration database 40 may include a unified intent data model.

Although user interface 36 is described for purposes of the embodiment as allowing administrator 12 (FIG. 1) to interact with controller device 10, in other embodiments, other interfaces may be used. For example, the controller device 10 may include a representational state transfer (REST) client (not shown) that serves as an interface to another device through which the administrator 12 may configure the controller device 10. Likewise, the administrator 12 may interact with the controller device 10 through the REST client to configure the elements 14.

Management module 24 may model configuration database 40 as a graph database representing YANG configuration data elements. YANG specifies various types of data structures, including lists, leaf-lists, containers, presence containers, and features. Management module 24 may model each of the lists, presence containers, and features, as well as the top-level container, as vertices in the graph database. Alternatively, configuration database 40 may represent YAML configuration data elements.

After the graph database is built, management module 24 may perform operations on the data in the graph database. For example, management module 24 may map NETCONF-based operations, such as get-config, get-config with filters, and edit-config, to queries in a graph query language, such as Gremlin queries. Gremlin is described in GremlinDocs at gremlindocs.spmallette.documentup.com and at github.com/tinkerpop/gremlin/wiki. If a condition attribute changes, management module 24 may execute the conditions mapped to the vertices and edges of the graph database. As with the processing of functions discussed in more detail below, management module 24 may process additional changes in response to the conditions. Management module 24 may further update all changes with transaction semantics.

In accordance with the techniques of this disclosure, controller device 10 may be configured to determine and execute an upgrade schedule to upgrade the software of a plurality of network devices 14 (FIG. 1) according to user-defined criteria. A user planning a software upgrade for multiple devices 14 may have one or more intent objectives, policies, constraints, or parameters regarding the order of device upgrades, or the date and/or time of device upgrades. Some non-limiting examples of user constraints include: (1) a maximum device "down time", such as five minutes; (2) a maximum number of device connections that may be "down" at any given time, such as five percent of connections; (3) a latest permitted date by which all devices 14 must be upgraded to the new release or version; (4) a maximum number or percentage of devices that can be upgraded simultaneously, such as two percent; (5) a maximum "upgrade window" duration, such as two hours per day; (6) a priority indicating that devices with more users should be upgraded first; or (7) a requirement that redundant devices not be upgraded at the same time.

The user may specify one or more upgrade constraints within upgrade request 11. Controller device 10 receives upgrade request 11 from the user. Upgrade request 11 may include a schedule upgrade query that can be queried against the model. The schedule upgrade query may include various data inputs from the user, such as a device selector, scheduling parameters, and policies. One example of a schedule upgrade query is as follows:

Upgrade request 11 may include a device selector. The device selector is a set of data indicating the particular device or set of devices 14 to which the upgrade constraints specified in the schedule upgrade query apply. For example, in the sample upgrade query listed above, the device selector indicates devices that meet the following two criteria: (1) devices located at a site called "Bangalore"; and (2) devices that have the "role" of provider-edge (PE) device. The device selector may follow a filter syntax, such as that described in U.S. patent application No. 16/144,313, which is incorporated herein by reference in its entirety.
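The selector criteria described above (site and role) can be illustrated with a minimal Python sketch. The `Device` record, its field names, the `select_devices` helper, and the second site name are assumptions made for illustration only; they are not the filter syntax of the referenced application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    # Hypothetical device record; field names are illustrative assumptions.
    name: str
    site: str
    role: str


def select_devices(devices, **criteria):
    """Return the devices whose attributes match every given criterion."""
    return [
        d for d in devices
        if all(getattr(d, key) == value for key, value in criteria.items())
    ]


inventory = [
    Device("pe1", "Bangalore", "PE"),
    Device("p1", "Bangalore", "P"),
    Device("pe2", "Sunnyvale", "PE"),  # hypothetical second site
]

# Select provider-edge devices located at the Bangalore site.
selected = select_devices(inventory, site="Bangalore", role="PE")
```

Only devices that satisfy both criteria are kept, mirroring the two-criteria selector described in the text.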

Upgrade request 11 may include one or more scheduling parameters indicating a preferred date and/or time for controller device 10 to perform the upgrade on devices 14. For example, the sample upgrade query listed above includes four scheduling parameters: (1) a start date of the upgrade schedule; (2) a time of day for upgrade deployment; (3) an interval duration; and (4) an end date of the upgrade schedule.

Upgrade request 11 may include one or more policies. As described above, a policy may include user preferences indicating the order in which controller device 10 upgrades devices 14. A policy may be formatted to include a policy name, a policy priority level (such as an integer on a scale from one to ten), and one or more variables indicating the policy. In the sample upgrade query provided above, the policy named "min_sites_impact" has a priority of "1" and an integer variable with a value of "200", which specifies the maximum number of sites that may be affected, in order to help achieve the objective of minimizing the number of affected sites.

In some embodiments, a policy may include objectives and constraints. An "objective" expresses a business goal as a mathematical value. An objective may include an indication of an intent to maximize, minimize, or average a particular value. A "constraint" denotes a specific mathematical relationship with a variable, such as an equality or inequality (e.g., >, <, ≥, ≤, =, or ≠). An example of a policy that explicitly defines objectives and constraints is as follows:

In addition to user-specified upgrade criteria, controller device 10 may also receive information indicating device redundancy and/or service redundancy among devices 14. Device redundancy provides additional or alternative instances of particular network devices within the network infrastructure. Network devices with redundancy may facilitate high availability in the event of a network device failure. In some embodiments, service redundancy may be provided by the network using different paths for a particular service of one or more network devices 14, rather than by the devices themselves. To reduce the risk of network failure, redundant devices and redundant connections should not be upgraded simultaneously. For example, if devices (p1, p2, p3, p4) are stand-alone devices that define a first path between p1 and p2 and a redundant path between p3 and p4, then network traffic is affected while these devices are being upgraded. To avoid impacting customer traffic, p1, p2, p3, and p4 should not all be upgraded at the same time. Specifically, p1 and p2 should be upgraded simultaneously, and p3 and p4 should be upgraded simultaneously, but not at the same time as p1 and p2.

The controller device 10 may determine device redundancy information for network devices in the network in accordance with the network device model enhanced by the techniques of this disclosure. An exemplary network device model is as follows:

In the above example, the enhanced network device model includes an alternative-device property for specifying redundant, highly available, or otherwise alternative network devices capable of handling traffic forwarding and services for the modeled network device. The asterisk indicates a list of elements. Thus, there may be zero or more alternative devices.

The service model may include link information indicating devices and communication paths, thereby indicating alternative or redundant link information. Service endpoint redundancy may be indicated by "alternative nodes". In one example, an L3VPN service includes a set of PE devices as endpoints. In multi-homing, provider-edge device PE1 may be redundant to PE2. This is represented as an alternative node, as in the following exemplary text model:

In the above example, the enhanced service model includes an alternative-link property for specifying redundant, highly available, or otherwise alternative paths capable of transporting or otherwise handling traffic for the modeled service. As in the above example, the enhanced service model may further specify alternative nodes for a service node. In another example, a label switched path (LSP) may include a primary path and a secondary path. Both the primary path and the secondary path may be modeled as links. The primary path link may include the secondary path as an "alternative" link. Likewise, the secondary path link may include the primary path as an alternative link. The asterisk indicates a list of elements. Thus, there may be zero or more alternative links.

In some embodiments, controller device 10 may also consider all services that the devices provide to customers in order to determine additional forms of redundancy. In the example provided below, redundancy may be a built-in property of a YAML dictionary:

Controller device 10 may include an upgrade module 27. Upgrade module 27 may be configured to receive data inputs, such as device selectors, scheduling parameters, user policies, and redundancy information, and to calculate a device upgrade schedule that satisfies all of these constraints, or as many of them as possible. For example, upgrade module 27 may compute a multi-objective graph, such as upgrade graph 29, and then run an optimization algorithm on the graph.

Upgrade graph 29 includes a plurality of nodes (or vertices) and a plurality of edges, each edge connecting two nodes. Each node within upgrade graph 29 represents a single network device. Each node within upgrade graph 29 may include device-level information, or upgrade constraints related to the respective network device, such as the device upgrade time, the number of customers or users associated with the device, or the device location. At least some of the edges, each connecting two nodes, indicate device redundancy or service-based dependencies, such as upgrade policies that relate the two network devices to each other.

Once upgrade module 27 computes the nodes and edges of upgrade graph 29, upgrade module 27 may determine and assign a "weight" to each node and each edge. For example, upgrade module 27 may determine and assign node weights based on the required upgrade time of the respective network device or the criticality of the device (e.g., the number of users of the respective device). Based on these properties, upgrade module 27 may facilitate an even distribution of devices among the subgroups, as described in further detail below.

For an edge indicating a service-based dependency, such as an upgrade objective or policy specified in upgrade request 11, the edge weight may indicate the user-specified "priority" of the policy (such as an integer on a scale from 1 to 10), also as specified in upgrade request 11. In this embodiment, lower priority values indicate higher priority, and higher priority values indicate lower priority. In other embodiments, the priority level may be indicated in other ways.

For an edge between two nodes that indicates multiple objectives, upgrade module 27 may assign a single scalar edge weight based on a multi-objective optimization function. Multi-objective optimization provides a set of solutions that balance competing upgrade objectives. A multi-objective function may have both maximization and minimization objective functions. Mathematically, this can be expressed as follows:

$\min/\max\ f_m(x), \quad m = 1, 2, \ldots, N$, where, for example:

$f_1(x)$: minimize traffic loss

$f_2(x)$: minimize the number of affected sites

$f_3(x)$: maximize VPN service uptime for customer 1

Upgrade module 27 may combine the multiple objectives into a single objective based on a combining equation. Upgrade module 27 may first normalize the range of the objective function outputs so that the outputs are scaled between 0 and 1. Mathematically, this can be expressed as follows:

$f'(x) = \dfrac{f(x) - f_{\min}}{f_{\text{range}}}$, where

$f'(x)$ is the normalized value of $x$ in the range $[0, 1]$,

$f(x)$ is the old value of $x$,

$f_{\min}$ is the minimum of the old values, and

$f_{\text{range}}$ is the old range.

Upgrade module 27 may then convert all minimization functions to maximization functions, or vice versa. Since all function ranges are normalized to $[0, 1]$, upgrade module 27 may convert a minimization function to a maximization function as follows:

$\max f(x) = 1 - \min f(x)$
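The normalization and the minimum-to-maximum conversion can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and the sample values are hypothetical, and returning zeros for a constant input is a choice made here to avoid division by zero, not something the disclosure specifies.

```python
def normalize(values):
    """Scale raw objective outputs into [0, 1] using (f(x) - f_min) / f_range."""
    f_min = min(values)
    f_range = max(values) - f_min
    if f_range == 0:
        # Assumption: a constant objective carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - f_min) / f_range for v in values]


def min_to_max(normalized):
    """Convert a normalized minimization objective into a maximization one."""
    return [1.0 - v for v in normalized]


# Hypothetical outputs of f2(x), the number of affected sites, for three options.
sites_affected = [2, 4, 10]
normalized = normalize(sites_affected)
as_maximization = min_to_max(normalized)
```

After the conversion, the option that affects the fewest sites has the largest score, so all objectives can be treated uniformly as maximization objectives.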

Based on the normalized objective functions, upgrade module 27 may calculate an initial objective weight $o_i$ for each objective and arrange the objective weights of each edge $e$ as an objective vector $o^e = (o^e_1, o^e_2, \ldots, o^e_m)$.

Each objective may have a priority, as specified in upgrade request 11. Upgrade module 27 may arrange the respective priorities of the objectives into a priority vector. Based on the priority vector, upgrade module 27 may normalize the edge weights into a single scalar as follows:

$w^e = \sum_{i=1}^{m} o^e_i \, P^e_i$

where:

$w^e$ is the scalar edge weight of edge $e$,

$o^e_i$ is the objective weight of the $i$-th objective of edge $e$,

$P^e_i$ is the priority of the $i$-th objective of edge $e$, and

$m$ is the number of objectives associated with the two nodes connected by edge $e$.

For example, an edge with three objectives having objective weights 2, 2, and 1 and priorities 1, 5, and 1, respectively, would have an objective weight vector (2, 2, 1) and a priority vector (1, 5, 1). The upgrade module would assign the edge a scalar edge weight of (2 × 1) + (2 × 5) + (1 × 1) = 2 + 10 + 1 = 13.
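The worked example above is a dot product of the objective weight vector and the priority vector. A minimal Python sketch follows; the function name is a hypothetical choice for illustration.

```python
def scalar_edge_weight(objective_weights, priorities):
    """Combine per-objective weights and priorities into one scalar edge weight."""
    if len(objective_weights) != len(priorities):
        raise ValueError("each objective needs exactly one priority")
    return sum(o * p for o, p in zip(objective_weights, priorities))


# Objective weights (2, 2, 1) with priorities (1, 5, 1), as in the example.
weight = scalar_edge_weight([2, 2, 1], [1, 5, 1])
```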

Once upgrade module 27 assigns a node weight to each node and an edge weight to each edge, upgrade module 27 may then execute a graph partitioning algorithm to "partition" upgrade graph 29 into a plurality of subgroups. For example, each subgroup may indicate a set of network devices that, to the extent feasible, are capable of or should be upgraded simultaneously based on redundancy relationships and/or user policies. Upgrade module 27 may execute a graph partitioning algorithm selected to achieve at least one of the following two properties: (1) the sum of all node weights in each subgroup is approximately equal; and (2) the sum of the weights of all edges connecting two different subgroups is minimal. Mathematically:

Given a graph $G = (N, E, W_N, W_E)$, where:

$N$ = nodes (or vertices)

$E$ = edges

$W_N$ = node weights

$W_E$ = edge weights

To partition the graph $G$, choose a partition $N = N_1 \cup N_2 \cup \ldots \cup N_K$ such that:

(1) The sum of the node weights in each $N_j$ is evenly distributed (load balancing).

(2) The sum of the weights of all edges connecting different partitions is minimal (minimum edge cut).
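The two partitioning criteria can be made concrete with a small Python sketch. It exhaustively tries every assignment of nodes to K parts, keeps only assignments whose node-weight totals are balanced within a tolerance, and returns the one with the smallest edge cut. The exhaustive search, the function names, and the sample weights are assumptions chosen for clarity; the search is exponential in the number of nodes and is not the partitioning algorithm of the disclosure.

```python
from itertools import product


def edge_cut(edges, part):
    """Sum the weights of edges whose endpoints fall in different parts."""
    return sum(w for (u, v), w in edges.items() if part[u] != part[v])


def part_loads(node_weights, part, k):
    """Sum the node weights assigned to each of the k parts."""
    totals = [0] * k
    for node, weight in node_weights.items():
        totals[part[node]] += weight
    return totals


def best_partition(node_weights, edges, k, max_imbalance=0):
    """Return (cut, part) with minimum edge cut among balanced assignments."""
    nodes = sorted(node_weights)
    best = None
    for labels in product(range(k), repeat=len(nodes)):
        part = dict(zip(nodes, labels))
        totals = part_loads(node_weights, part, k)
        if max(totals) - min(totals) > max_imbalance:
            continue  # violates the load-balancing criterion
        cut = edge_cut(edges, part)
        if best is None or cut < best[0]:
            best = (cut, part)
    return best


# Hypothetical graph: two heavily weighted edges joined by one light edge.
node_weights = {"p1": 1, "p2": 1, "p3": 1, "p4": 1}
edges = {("p1", "p2"): 13, ("p3", "p4"): 13, ("p2", "p3"): 1}
cut, part = best_partition(node_weights, edges, k=2)
```

In this example the balanced assignment with the smallest cut places p1 with p2 and p3 with p4, cutting only the light edge between p2 and p3. Practical partitioners rely on heuristics rather than enumeration.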

G may be any upgrade graph described in this disclosure. In the context of multi-objective graph partitioning, the operation is as follows:

In some embodiments, upgrade module 27 applies a K-way partitioning method to partition the graph into k groups. Upgrade module 27 obtains the value of k, i.e., the number of time slots, which is calculated from the user's schedule indicated in upgrade request 11.

Once upgrade module 27 partitions upgrade graph 29 into subgroups, upgrade module 27 may "subdivide" the individual subgroups. Upgrade module 27 may subdivide each subgroup to arrange or reorder all devices within the subgroup based on device-level criteria, such as device criticality time, upgrade priority based on the number of users per device, device geographic location, device upgrade time, or the maximum number of nodes allowed to be upgraded in parallel within a maintenance window. Upgrade module 27 may also subdivide the order of devices within a subgroup according to criteria such as the maximum number of parallel upgrades allowed or the maximum number of upgrades per maintenance window, user policies, or time, cost, and/or processing constraints. Upgrade module 27 may avoid upgrading devices in different subgroups simultaneously. For example, if a user policy "maximum number of allowed parallel upgrades" is defined, the number of devices selected from each subgroup will be less than or equal to the value of "maximum number of allowed parallel upgrades". The total number of devices selected for each time slot will be less than or equal to the maximum number of upgrades in the maintenance window.

Once upgrade module 27 subdivides each subgroup, upgrade module 27 may calculate an upgrade schedule based on the subgroups. For example, upgrade module 27 may select multiple devices from each subgroup for parallel upgrade and assign upgrade time slots to the devices. Finally, upgrade module 27 may perform the device software upgrades according to the schedule. In particular, upgrade module 27 may use component configuration service 116 to communicate with each device to be upgraded and perform the device upgrades based on the determined upgrade schedule. As one example, upgrade module 27 may invoke component configuration service 116 to direct the managed devices to a repository containing the software updates, along with instructions to upgrade the devices with the software updates.
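The slot assignment can be sketched as follows. The sketch assumes that each subgroup is already ordered by the device-level criteria, that a slot only ever contains devices from a single subgroup, and that a single hypothetical `max_parallel` limit applies; the function name and the sample device names are illustrative.

```python
def build_schedule(subgroups, max_parallel):
    """Assign devices to upgrade slots, at most max_parallel per slot.

    Each slot draws from one subgroup only, so devices in different
    subgroups are never upgraded in the same slot.
    """
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    schedule = []
    for group in subgroups:
        for start in range(0, len(group), max_parallel):
            schedule.append(group[start:start + max_parallel])
    return schedule


# Hypothetical subgroups, each already ordered by device-level criteria.
slots = build_schedule([["p1", "p2"], ["p3", "p4", "p5"]], max_parallel=2)
```

Each entry in `slots` is the set of devices upgraded in parallel during one time slot, in schedule order.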

FIG. 3 is a conceptual diagram illustrating an exemplary unified graph model 60 of an intent data model. In this example, unified graph model 60 includes nodes A 62, B 64, and C 66. Initially, the unified graph model may not include VPN 68, VPN 72, and VPN 78, nor optical1 80, lambda 82, optical1 84, and lambda 86. As a result of modifications made through intent data model updates, node A 62 is coupled to node B 64 via VPN 68 and LSP1 70, node B 64 is coupled to node C 66 via VPN 72 and LSP2 74, and node C 66 is coupled to node A 62 via VPN 78 and LSP3 76. Further, as a result of additional capacity required by an optical intent, additional nodes optical1 80, lambda 82, optical1 84, and lambda 86 are added between node B 64 and node C 66.

Stateful business policies can be written on top of the stateless intent layer. For example, a user may state the intent "provide high-bandwidth VPN connectivity between sites A, B, and C, with bandwidth between A-B, B-C, C-A, ...". This can produce various stateless intents. The stateful intent may be compiled into an L3VPN (overlay tunnel) and a transport mechanism between A-B, B-C, and C-A that provides the required bandwidth. For example, the transport mechanism may include an RSVP LSP between A-B with 30 Mbps, an RSVP LSP between B-C with 50 Mbps, and an RSVP LSP between C-A with 80 Mbps. In this example, it may be necessary to create the RSVP LSP between C-A with 80 Mbps. There may be situations where more capacity is needed, and therefore there may also be another intent, "optical intent: increase the capacity between C-A". If C-A already has a connection of 70 Mbps, the stateless intent may provision a new 10G lambda between C-A on the optical network.

When implementing a stateful intent, a controller device such as controller device 10 may need to consider existing stateless intents between endpoints, as well as the current state. In the above example, to carry out the various intents, controller device 10 may query the connectivity graph (including the stateless intents) and create or modify stateless intents as needed. Techniques related to the use of unified graph models and intents are described in U.S. patent application No. 15/462,465, filed March 17, 2017, which is hereby incorporated by reference in its entirety. Thus, the intent data model can be represented using a unified graph model. The intent data model (i.e., the unified graph model) can be extended as more use cases are added. Furthermore, the use of a unified graph model allows intents to be retrieved based on endpoints (e.g., by querying the graph).

Fig. 4 is a conceptual diagram illustrating an example model 100 of components of a controller device, such as the controller device 10, according to the techniques of this disclosure. In the present embodiment, model 100 includes a management unit 102, an intent infrastructure 110, and an analysis node 130. The management unit 102 includes an upgrade module 27 and policies 105. The intent infrastructure 110 includes an intent layer 112, an intent compiler 114, a component configuration service 116, an intent database 118, and a configuration (config.) database 120. Analysis node 130 includes a telemetry aggregation unit 132, an element telemetry set 134, and a telemetry database 136. Management module 24 in FIG. 2 may include components that perform the functions attributed to the various components of model 100. For example, configuration module 26 in fig. 2 may correspond to intent infrastructure 110, translation module 28 may correspond to intent compiler 114, configuration database 120 may correspond to configuration database 40, and so on. The particular components shown in FIG. 4 may be implemented by management module 24 in FIG. 2.

Upgrade module 27 may communicate with component configuration service 116 to communicate with various devices to be upgraded and perform device upgrades based on the determined upgrade schedule. As one embodiment, upgrade module 27 may invoke component configuration service 116 to direct managed devices to a repository with software updates according to instructions to upgrade the devices with the software updates. In some cases, the element configuration service 116 is an Element Management Service (EMS) that is part of an EMS layer.

Management unit 102 invokes intent layer 112 to provision stateless intents. The techniques of this disclosure may be used to ensure that business policies in the form of intents are translated to the network in near real time, to prevent negative impact on SLAs. Intent compiler 114 may be configured to compile intents concurrently. Additional details regarding the parallel, concurrent compilation of intents are described in U.S. application No. 16/282,160, "SUPPORTING COMPILATION AND EXTENSIBILITY ON UNIFIED GRAPH BASED INTENT MODELS," filed February 21, 2019, the entire contents of which are incorporated herein by reference.

When a business policy (i.e., a stateful intent) degrades, management unit 102 may determine the appropriate resources to address the degraded intent and invoke intent infrastructure 110 to provision the intent. When realization of an intent fails, management unit 102 may determine a resolution for the failure. For example, if the failure is related to new resources, management unit 102 may update the set of excluded resources, obtain new resources, and provision the network. If the failure is not related to new resources but is due to existing network elements, and the existing network elements are not reachable, management unit 102 may determine to keep the old resources and provision the network. If the failure is due to a semantic failure, management unit 102 may submit a negated intent, provision the network, and raise an alarm indicating the semantic failure.

Thus, in general, the management unit 102 ensures that there are no conflicting changes in the intent data model changes. After ensuring that there are no conflicting changes, the management unit 102 submits the intent data model changes to the intent infrastructure 110. When a stateless intent change is submitted, the intent infrastructure 110 can create a change set that maintains an updated (e.g., created, updated, or deleted) set of vertices and an identification of the corresponding version. The intent infrastructure 110 also maintains deployed intent data models and undeployed intent data models in an intent database 118. The intent infrastructure 110 triggers the intent compiler 114 to execute the translator of the affected vertex in the change set.

The translation may be asynchronous, and thus, the intent infrastructure 110, by using the global intent version, may ensure that intent changes do not override other intent changes. The global intent version represents a version of the intent graph that generated the low-level model resource. The intent infrastructure 110 saves the global intent version of the deployed graphical model along with the low-level model resources, for example, in a configuration database 120. The intent infrastructure 110 may rerun the intent compiler 114 if the newly generated low-level model global intent version is lower than the global version of the low-level model.

To support updates, intent infrastructure 110 supports multiple versions of an intent data model, such as deployed and undeployed intent data models. Saving one version of the complete graph per change would serialize the intent changes. Thus, intent infrastructure 110 saves the deployed and undeployed intent data models, including the deployed vertices and undeployed vertices, respectively, within the same graph. Each vertex contains a state and a version id. Intent infrastructure 110 may set the state values of the vertices corresponding to intent changes to represent the "created", "updated", or "deleted" state. As discussed below, once a vertex is deployed, intent infrastructure 110 may also set the state value to represent "deployed".

Intent infrastructure 110 may save updated vertices within the same graph. As described above, intent infrastructure 110 may maintain a snapshot table that contains a list of universally unique identifiers (UUIDs) and the old versions of the corresponding updated vertices.

When a vertex is created, the intent infrastructure 110 sets the state value of the created vertex to a value representing the "created" state. After deploying the vertices to the network, the intent infrastructure 110 updates the state values to represent the "deployed" state.

When a vertex is updated, the intent infrastructure 110 sets the state value of the updated vertex to a value representing the "updated" state. After deploying the vertices to the network, the intent infrastructure 110 updates the state values to represent the "deployed" state.

When a vertex is deleted, intent infrastructure 110 sets the state value of the deleted vertex to a value representing the "deleted" state. After deploying the update to the network, intent infrastructure 110 removes the deleted vertex from the graph.
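The vertex life cycle described in the preceding paragraphs can be sketched as a small state machine. The `VertexState` enum, the dictionary used as a stand-in for the graph, and the function names are assumptions made for illustration; they are not the data structures of intent infrastructure 110.

```python
from enum import Enum


class VertexState(Enum):
    CREATED = "created"
    UPDATED = "updated"
    DELETED = "deleted"
    DEPLOYED = "deployed"


def mark_change(graph, vertex_id, state):
    """Record an undeployed intent change on a vertex."""
    if state is VertexState.DEPLOYED:
        raise ValueError("use on_deployed() to mark a vertex as deployed")
    graph[vertex_id] = state


def on_deployed(graph, vertex_id):
    """Apply the transition that follows a successful network deployment."""
    if graph[vertex_id] is VertexState.DELETED:
        del graph[vertex_id]  # deleted vertices are removed from the graph
    else:
        graph[vertex_id] = VertexState.DEPLOYED


# Hypothetical graph holding only vertex states, keyed by vertex id.
graph = {}
mark_change(graph, "vpn-68", VertexState.CREATED)
mark_change(graph, "lsp-old", VertexState.DELETED)
on_deployed(graph, "vpn-68")
on_deployed(graph, "lsp-old")
```

Created and updated vertices end in the "deployed" state, while deleted vertices disappear from the graph once the deletion has been deployed.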

The following table represents exemplary state transitions:

Thus, intent database 118 includes a representation of the business policies of the network. The representation may take the form of a unified graph model, such as unified graph model 60 in FIG. 3.

A customer may interact with management unit 102 to define new criteria for the upgrade process performed by upgrade module 27, such as new service or device properties, or constraints, that are predicated on properties of unified graph model 60 and can be queried by upgrade request 11. Thus, the customer may add a new role or label for a network device, or add a new policy 105, by following the objective syntax.

FIG. 5 is a flow chart illustrating an exemplary method that a controller device may use to upgrade network device software for a plurality of devices. The method is described with reference to exemplary controller device 10. Controller device 10 receives upgrade request 11 with a schedule upgrade query from, for example, a network administrator or other user (250). The schedule upgrade query may specify one or more user criteria or preferences regarding the time or order in which the controller device performs the software upgrade.

Based on the schedule upgrade query, and in some cases based on device redundancy and/or service redundancy data provided according to the corresponding enhanced model, the controller device may query unified graph model 60 using a graph query language, such as GraphQL, and use the results to obtain an upgrade graph describing the network (252). For example, the graph may include nodes representing the various devices within the network that satisfy the schedule upgrade query, and edges between the nodes indicating redundancy relationships and upgrade policy relationships between the devices. Next, controller device 10 calculates and assigns an edge weight for each edge derived from a policy, based on the upgrade objective or objectives (254).

Controller device 10 partitions the graph into a plurality of subgroups (256). Controller device 10 may determine the subgroups according to at least one of the following two criteria: the sum of the weights of all nodes within a subgroup is approximately equal across subgroups; and the sum of the weights of all edges connecting two different subgroups is minimal. Each subgroup may indicate a set of network devices that must not be updated together.

Based on the subgroups, controller device 10 determines an upgrade schedule (258). For example, controller device 10 may select a group of devices from a subgroup to upgrade in parallel (i.e., simultaneously) and assign an upgrade slot to the group. Finally, controller device 10 may upgrade the network devices according to the upgrade schedule (260).

FIG. 6 is a flow diagram illustrating an example method for upgrading network device software in accordance with the techniques of this disclosure. The method is described with reference to exemplary controller device 10. Controller device 10 receives upgrade request 11 with a schedule upgrade query from, for example, a network administrator or other user (221). The schedule upgrade query may specify one or more user criteria or preferences regarding the time or order in which the controller device performs the software upgrade. Based on the schedule upgrade query, controller device 10 may determine one or more upgrade schedule objectives or constraints that the controller device follows when calculating an upgrade schedule (222).

Controller device 10 integrates the objectives and constraints determined from the schedule upgrade query and, in some cases, based on device redundancy and/or service redundancy data provided according to the corresponding enhanced model, the controller device may query unified graph model 60 using a graph query language, such as GraphQL, and use the results to obtain an upgrade graph describing the network (224). For example, the upgrade graph may include nodes representing the various devices within the network, and edges between the nodes indicating redundancy relationships and upgrade policy relationships between the devices. Next, controller device 10 calculates and assigns an edge weight for each edge derived from an upgrade policy, based on the plurality of objectives (226).

Controller device 10 partitions the graph into a plurality of subgroups (228). Controller device 10 may determine the subgroups according to at least one of the following two criteria: the sum of the weights of all nodes within a subgroup is approximately equal across subgroups; and the sum of the weights of all edges connecting two different subgroups is minimal. Each subgroup may indicate a set of network devices that must not be updated together.

Once controller device 10 determines the plurality of subgroups, the controller device may subdivide the individual subgroups by arranging the network devices based on device-level criteria, the maximum number of parallel upgrades allowed, and/or the maximum number of upgrades per maintenance window (230).

Based on the subdivided subgroups, controller device 10 determines an upgrade schedule (232). For example, controller device 10 may select a group of network devices from a single subgroup to upgrade in parallel and assign an upgrade slot to the group. Finally, the controller device may upgrade the network devices according to the upgrade schedule (234).

The techniques described in this disclosure may be implemented, at least in part, in hardware, software, firmware, or any combination thereof. For example, various aspects of the described techniques may be implemented within one or more processors, including one or more microprocessors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs), or any other equivalent integrated or discrete logic circuitry, as well as any combinations of such components. The term "processor" or "processing circuitry" may generally refer to any of the above logic circuitry alone or in combination with other logic circuitry, or any other equivalent circuitry. A control unit comprising hardware may also perform one or more of the techniques of this disclosure.

The hardware, software, and firmware may be implemented within the same device or separate devices to support the various operations and functions described in this disclosure. Furthermore, any of the described units, modules, or components may be implemented together or separately as discrete, but interoperable, logic devices. The description of different features as modules or units is intended to highlight different functional aspects and does not necessarily imply that these modules or units must be realized by separate hardware or software components. Rather, functionality associated with one or more modules or units may be performed by separate hardware or software components, or integrated within common or separate hardware or software components.

The techniques described in this disclosure may also be embodied or encoded in a computer-readable medium, such as a computer-readable storage medium, that contains instructions. For example, instructions embedded or encoded in a computer-readable medium may cause a programmable processor or other processor to perform a method when the instructions are executed. Computer-readable media may include non-transitory computer-readable storage media and transitory communication media. Computer-readable storage media, which are tangible and non-transitory, may include random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory, a hard disk, a CD-ROM, a floppy disk, a cassette, magnetic media, optical media, or other computer-readable storage media. The term "computer-readable storage medium" refers to physical storage media rather than signals, carrier waves, or other transitory media.

Claims

1. A method of managing a plurality of network devices, comprising:

receiving, by a controller device that manages the plurality of network devices of a network providing one or more services, an upgrade request;

determining, by the controller device and based on the upgrade request, an upgrade graph having nodes, each node representing one of the network devices or a network service provided by the network, the upgrade graph further having one or more vector lines, each vector line connecting two of the nodes and representing device redundancy or service dependency;

calculating and assigning, by the controller device, a vector line weight for each of the vector lines by combining results of at least one objective function, each of the at least one objective function having a minimum target or a maximum target for the network;

dividing, by the controller device, the upgrade graph into a plurality of subgroups based on the vector line weights, wherein each subgroup of the subgroups includes a node representing a corresponding network device;

subdividing, by the controller device, at least one of the subgroups based on at least one of device-level criteria, a maximum number of concurrent upgrades allowed, and a number of upgrades in a maintenance window;

determining, by the controller device, an upgrade schedule in which, for each of the subgroups, the controller device simultaneously performs a software upgrade on all network devices represented by the nodes in the subgroup; and

upgrading, by the controller device, the software of each of the plurality of network devices according to the upgrade schedule.

2. The method of claim 1, wherein determining the upgrade schedule based on the subgroups comprises determining, by the controller device, a plurality of maintenance slots, wherein each maintenance slot includes a set of devices from the same subgroup.

3. The method of claim 1, wherein the device-level criteria comprise at least one of the following:

device-critical timing;

upgrade priority based on the number of subscribers served by a specific device;

the geographic location of the device;

device upgrade duration; and

the maximum number of nodes allowed to be upgraded concurrently within a maintenance window.

4. The method of claim 1,

wherein a first subgroup of the subgroups includes a first node representing a first network device and a second subgroup of the subgroups includes a second node representing a second network device;

wherein determining the upgrade schedule based on the subgroups comprises determining the upgrade schedule such that the first network device is not scheduled to be upgraded simultaneously with the second network device.

5. The method of claim 1, wherein the upgrade request includes a device selector that selects the network device based on a label or role of the network device.

6. The method of claim 1, wherein the upgrade request includes scheduling parameters including at least one of the following:

the upgrade start date;

the start time of the upgrade window;

the upgrade window duration; and

the upgrade end date; and

wherein determining the upgrade schedule based on the subgroups comprises determining the upgrade schedule based on the scheduling parameters.

7. The method of claim 1, wherein the upgrade request includes at least one policy for specifying one of the at least one objective function.

8. The method of claim 7, wherein the at least one policy specifies at least one of the following:

the maximum downtime for any network device;

the maximum number of connections that may be disconnected at any one time;

the latest permissible upgrade completion date;

the maximum percentage of devices being upgraded at any one time;

the maximum upgrade window duration; and

upgrade priority based on the devices having the most subscribers.

9. The method of claim 1, wherein dividing the upgrade graph into the subgroups comprises reducing the sum of the vector line weights of all vector lines connecting any two subgroups.

10. The method of claim 1, further comprising:

calculating and assigning, by the controller device, a node weight for each of the nodes, wherein dividing the upgrade graph into the subgroups comprises distributing the sum of all node weights evenly among the subgroups.

11. The method of claim 1, further comprising:

receiving, by the controller device, a criterion definition that defines a new criterion for the upgrade process,

wherein the upgrade request specifies the new criterion.

12. A controller device for managing a plurality of network devices, the controller device comprising one or more processing units implemented in circuitry and configured to:

receive an upgrade request;

determine, based on the upgrade request, an upgrade graph having nodes, each node representing one of the network devices or a network service provided by the network, the upgrade graph further having one or more vector lines, each vector line connecting two of the nodes and representing device redundancy or service dependency;

calculate and assign a vector line weight for each of the vector lines by combining results of at least one objective function, each of the at least one objective function having a minimum target or a maximum target;

divide the upgrade graph into a plurality of subgroups based on the vector line weights, wherein each of the subgroups includes a node representing a corresponding network device;

subdivide at least one of the subgroups based on at least one of device-level criteria, a maximum number of concurrent upgrades allowed, and a number of upgrades in a maintenance window;

determine an upgrade schedule in which, for each of the subgroups, the controller device simultaneously performs software upgrades on all network devices represented by nodes in the subgroup; and

upgrade the software of each network device in the plurality of network devices according to the upgrade schedule.

13. The controller device of claim 12,

wherein the controller device is configured to determine the upgrade schedule based on the subgroups by determining a plurality of maintenance slots, wherein each maintenance slot includes a set of devices from the same subgroup.

14. The controller device of claim 12,

wherein the controller device is configured to determine the upgrade schedule based on the subgroups such that the first network device is not scheduled to be upgraded at the same time as the second network device.

15. The controller device of claim 12, wherein dividing the upgrade graph into subgroups comprises reducing the sum of the vector line weights of all vector lines connecting any two subgroups.

16. A computer-readable storage medium having stored thereon instructions that, when executed, cause a processor of a controller device that manages a plurality of network devices to:

receive an upgrade request;