WO2024019791A1 - Sharing network manager between multiple tenants - Google Patents
Sharing network manager between multiple tenants
- Publication number
- WO2024019791A1 (PCT/US2023/022191)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- tenant
- configuration
- service
- logical network
- services
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/08—Configuration management of networks or network elements
- H04L41/0893—Assignment of logical groups to network elements
- H04L41/0894—Policy-based network configuration management
- H04L41/0895—Configuration of virtualised networks or elements, e.g. virtualised network function or OpenFlow elements
- H04L41/34—Signalling channels for network management communication
- H04L41/342—Signalling channels for network management communication between virtual entities, e.g. orchestrators, SDN or NFV entities
- H04L41/40—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using virtualisation of network functions or resources, e.g. SDN or NFV entities
Definitions
- Container orchestration systems include Docker Swarm®, Apache Mesos®, and Kubernetes®, the last of which has become a de facto choice for container orchestration.
- Kubernetes clusters can be run in an on-premises datacenter or in any public cloud (e.g., as a managed service or by bringing up your own cluster on compute instances).
- an application could be offered by an application provider as a multi-tenant application. In such a scenario, the application provider needs to ensure fairness in distribution of resources to the different tenants.
- Some embodiments provide a multi-tenant network policy manager implemented in a container cluster (e.g., a Kubernetes cluster).
- the network policy manager manages the logical networks for multiple tenants, each of which may have a logical network that is implemented across a respective set of one or more datacenters.
- the network policy manager cluster is responsible for receiving logical network configuration policy from administrators for the tenants, storing the logical network configuration, and distributing this configuration to the different tenant datacenters so that the logical networks can be correctly implemented across the tenant datacenters.
- different functions of the network policy manager are implemented as different services (e.g., micro-services) on their own containers (e.g., on separate Kubernetes Pods).
- the cluster includes multiple nodes, with the various services assigned to different nodes.
- the services include both (i) shared services that are shared by all of the tenants and (ii) per-tenant services that are instantiated on a per-tenant basis (i.e., each container implementing the service is dedicated to a single tenant).
- the network policy manager of some embodiments includes (i) application programming interface (API) processing services, (ii) a database service or services, (iii) queue management services, (iv) span determination services, and (v) channel management services.
- the API processing services and the database service are shared between the various tenants, while the queue management services, span determination services, and channel management services are implemented on a per-tenant basis.
- the API processing services receive configuration requests from tenants (e.g., specifying modifications to the tenant’s logical network).
- the ingress path for the container cluster is handled by a gateway that performs load balancing across the API processing services. That is, each configuration request is distributed by the gateway to one of the API processing services in the container cluster according to a load balancing algorithm.
- Each of these API processing services can receive configuration requests from multiple different tenants, as the API processing services of some embodiments are not instantiated on a per-tenant basis.
- the gateway / load balancer also ensures that a single tenant cannot overload the system and prevent other tenants from being able to access the API processing services.
- the service parses the request to identify (i) the tenant making the request and (ii) the logical network modification requested.
- the request may specify to add, remove, or change a logical network element (e.g., a logical forwarding element, logical port, policy rule, etc.) in some embodiments.
- the API processing service then posts the logical network configuration change to a configuration queue for the identified tenant as well as to the logical network configuration stored for the tenant in the shared database.
- the shared database, in some embodiments, is managed by one or more database services in the cluster.
- the database is a distributed database (e.g., distributed across nodes in the cluster or stored outside of the container cluster) that stores the logical network configuration for each tenant.
- the database may be organized so as to store separate sets of tables for each tenant, or in any other way, so long as the tenant logical network configurations are accessible.
- the logical network configuration data in the database tables for a particular tenant is expressed as a hierarchical tree of tenant intent.
- the other services that make up the multi-tenant policy manager are separated per tenant in some embodiments, rather than shared. These services include the queue management services, span determination services, and channel management services.
- a particular enterprise may have more than one separate logical network spanning different sets of (potentially overlapping) datacenters, which are treated as different tenants by the policy manager in some embodiments (and thus have separate corresponding sets of services).
- the queue management service for a tenant stores the logical network configuration changes in a persistent configuration queue for the tenant.
- this queue is created for the tenant logical network when the tenant is first defined within the policy manager.
- the span determination service for a given tenant determines, for each logical network configuration change, which of the datacenters spanned by the tenant logical network needs to receive the update. In some embodiments, certain logical network elements only span a subset of the datacenters spanned by the logical network (e.g., logical switches or routers defined only within a specific datacenter, policy rules defined only for a subset of datacenters, etc.).
- the span determination service is responsible for making this determination for each update and providing one or more copies of the update to its corresponding channel management service (for the same tenant).
- the span determination service also identifies, for each item of logical network configuration stored in the database, which tenant datacenters require that configuration item.
- the span determination services can either be on-demand or dedicated services. If a tenant has specified a dedicated span determination service (e.g., via its subscription to the policy manager), then this service will be instantiated either at the same time as the corresponding tenant queue or when a first logical network configuration change is added to that queue, and will not be removed even when inactive. On the other hand, if a tenant specifies an on-demand span determination service, then this service is instantiated when the first logical network configuration change is added to the corresponding tenant queue but can be removed (and its resources recovered for use in the cluster) after a predetermined period of inactivity.
- the channel management service for a particular tenant maintains dedicated channels (e.g., asynchronous channels) with each of the datacenters spanned by the tenant’s logical network. Specifically, in some embodiments, the channel management service maintains channels with local network managers at each of these datacenters. In some embodiments, the channel management service includes queues for each of the tenant datacenters and logical network configuration changes are stored to each of the queues corresponding to the datacenters that require the changes. In addition, in some such embodiments, the channel management service guarantees various connection parameters required for dissemination of data.
- each per-tenant service has its own allocated resources (which may be dependent on the tenant’s subscription with the policy manager) and thus can continue processing any logical network configuration changes for its respective tenants even if another tenant’s services are overloaded.
- the number of services can be scaled as additional tenants are added to the policy manager.
- a datacenter spanned by one of the tenant logical networks will require a complete synchronization of the logical network for that tenant (i.e., at least the portion of the logical network that spans to that datacenter). This can be a fairly resource-intensive process, as the entire logical network configuration relating to that particular datacenter needs to be streamed from the database to the local network manager for that datacenter.
- rather than burdening all of the existing services that handle the provision of updates for that tenant, the policy manager instantiates a separate on-demand configuration streaming service to handle the synchronization process. In different embodiments, this on-demand service may be instantiated within the container cluster or in a different container cluster (so as to avoid overloading resources of the primary container cluster housing the network policy manager).
- the full synchronization of the configuration can be required for various reasons. For instance, if connectivity is lost between a tenant’s channel management service and the local network manager for a particular datacenter, upon restoration of that connectivity the channel management service will notify a management service (e.g., a shared service) that executes in the container cluster. This management service is responsible for instantiating the on-demand configuration streaming service in some embodiments. In other cases, the synchronization might be required due to the receipt of an API command (i.e., via the shared API policy processing services) to synchronize the configuration for a particular datacenter or when a new datacenter is added to the span of a tenant logical network (as a certain portion of the tenant logical network may automatically span to this new datacenter).
- the instantiated on-demand configuration streaming service reads the data from the shared database and provides the data (as a stream of updates) to the channel management service for the tenant (identified for being sent to the particular datacenter so that it is enqueued correctly by the channel management service).
- each item of network configuration data has a span that is marked within the database and this span is used by the on-demand configuration streaming service to retrieve the correct configuration data.
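The span-based retrieval described above can be sketched as a generator that filters stored configuration items by tenant and span. This is a minimal illustration, not the patent's implementation: the row layout (`tenant`, `path`, `span`, `config`) and the function name `stream_configuration` are assumptions made for the example.

```python
def stream_configuration(rows, tenant, datacenter):
    """Yield, as a stream of updates, every stored item whose span covers `datacenter`.

    `rows` stands in for the shared database; each row is a dict with the
    hypothetical keys "tenant", "path", "span" (a set of datacenter names),
    and "config".
    """
    for row in rows:
        # Only items of this tenant whose marked span includes the target
        # datacenter are part of that datacenter's full synchronization.
        if row["tenant"] == tenant and datacenter in row["span"]:
            yield {
                # Tagging each update with the target datacenter lets the
                # channel management service enqueue it correctly.
                "datacenter": datacenter,
                "path": row["path"],
                "config": row["config"],
            }
```

Using a generator keeps the sketch close to the "stream of updates" wording: items are produced one at a time rather than loaded in full.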
- the network policy manager of some embodiments limits the number of concurrently instantiated configuration streaming services (therefore limiting the number of datacenters that can be synchronized at the same time) in order to limit the strain on the shared database (e.g., to 5, 10, 20, etc. concurrent synchronizations). Some embodiments also limit the number of concurrent synchronizations for a particular tenant to a smaller number or otherwise ensure that this maximum number of concurrently instantiated configuration streaming services is spread fairly among all of the tenants (e.g., based on tenant subscriptions).
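One way to picture the two limits described above (a global cap on concurrent synchronizations and a smaller per-tenant cap) is a simple admission check. The class name `SyncAdmission` and the default limits are assumptions for illustration; the source gives only example global values such as 5, 10, or 20.

```python
class SyncAdmission:
    """Toy admission control for on-demand configuration streaming services."""

    def __init__(self, global_limit=10, per_tenant_limit=2):
        self._global_limit = global_limit
        self._per_tenant_limit = per_tenant_limit
        self._active = {}  # tenant -> set of datacenters currently synchronizing

    def try_start(self, tenant, datacenter):
        """Return True if a new synchronization may be started now."""
        total = sum(len(dcs) for dcs in self._active.values())
        tenant_active = self._active.setdefault(tenant, set())
        if datacenter in tenant_active:
            return False  # this datacenter is already being synchronized
        if total >= self._global_limit:
            return False  # protects the shared database from overload
        if len(tenant_active) >= self._per_tenant_limit:
            return False  # keeps one tenant from using every slot
        tenant_active.add(datacenter)
        return True

    def finish(self, tenant, datacenter):
        """Release the slot once the streaming service has been deleted."""
        self._active.get(tenant, set()).discard(datacenter)
```

A fixed per-tenant cap is only one possible fairness policy; the text also allows weighting the shares by tenant subscription.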
- Figure 1 conceptually illustrates a network policy manager of some embodiments implemented in a container cluster.
- Figure 2 conceptually illustrates a process of some embodiments for parsing a configuration change request and storing a configuration update at the network policy manager.
- Figure 3 conceptually illustrates an example of the processing of a change request received by the network policy manager.
- Figure 4 conceptually illustrates a process of some embodiments for determining the span of a configuration change and enqueuing that configuration change to be sent to each datacenter spanned by the configuration change.
- Figure 5 conceptually illustrates an example of the processing of a configuration change by a span determination service.
- Figure 6 conceptually illustrates a process of some embodiments for managing a complete synchronization of the logical network configuration for a particular datacenter.
- Figure 7 conceptually illustrates the instantiation and operation of an on-demand streaming service to perform a complete synchronization of a logical network configuration for a tenant datacenter.
- Figure 8 conceptually illustrates the deletion of the on-demand streaming service once the synchronization is complete.
- Figure 9 conceptually illustrates an electronic system with which some embodiments of the invention are implemented.
- Some embodiments provide a multi-tenant network policy manager implemented in a container cluster (e g., a Kubernetes cluster).
- the network policy manager manages the logical networks for multiple tenants, each of which may have a logical network that is implemented across a respective set of one or more datacenters.
- the network policy manager cluster is responsible for receiving logical network configuration policy from administrators for the tenants, storing the logical network configuration, and distributing this configuration to the different tenant datacenters so that the logical networks can be correctly implemented across the tenant datacenters.
- Figure 1 conceptually illustrates such a network policy manager of some embodiments implemented in a container cluster 100.
- the container cluster 100 may be implemented in a public cloud datacenter or a private datacenter (e.g., a private datacenter of the enterprise providing the network policy manager to various tenants).
- different functions of the network policy manager are implemented as different services (e.g., micro-services) on their own containers (e.g., on separate Kubernetes Pods).
- the cluster includes multiple nodes, with the various services assigned to different nodes. It should be noted that Figure 1 shows the logical architecture of the network policy manager within the container cluster 100.
- the container cluster is a Kubernetes cluster that includes one or more nodes (e.g., virtual machines or physical host servers), each of which hosts multiple Pods, with each of the illustrated services executing within its own Pod.
- a Kubernetes cluster includes various Kubernetes control elements (e.g., a Kube-API server, etc.) for configuring the cluster and each of the nodes executes various Kubernetes entities (e.g., a kube-proxy, container network interface, etc.).
- the network policy manager, in some embodiments, is responsible for acting as a global network manager for multiple tenant logical networks.
- Each tenant logical network spans one or more physical sites (e.g., datacenters), and a separate local network manager operates at each of these physical sites to communicate with the network policy manager.
- a single tenant may administer multiple separate logical networks, but for the purposes of this description these are treated as separate tenants (e.g., in some embodiments each logical network has its own subscription with the network policy manager provider that specifies its own separate requirements and is thus treated separately by the network policy manager).
- the network policy manager receives global logical network configuration data (e.g., from a network administrator through a user interface provided to the administrator).
- the primary purpose of the network policy manager (with respect to a particular tenant) is to receive and store the global configuration for a logical network that spans multiple datacenters, determine the span for each logical network element in the global logical network configuration (i.e., the datacenters at which the logical network element is implemented) based on the specified configuration, and distribute the configuration data for each element to the local network managers at the datacenters that require the configuration data for that element.
- the operation of a global network manager for a single tenant logical network is described in greater detail in U.S. Patent 11,381,456 and U.S. Patent 11,088,919, both of which are incorporated herein by reference.
- the network policy manager of some embodiments includes a management service 105, shared application programming interface (API) processing services 110, a shared database service or services 115, per-tenant queue management services 120-130, per-tenant span determination services 135-145, and per-tenant channel management services 150-160.
- the services include both (i) shared services that are shared by all of the tenants and (ii) per-tenant services that are instantiated on a per-tenant basis (i.e., each container implementing the service is dedicated to a single tenant).
- the API processing services 110 and the database service 115 are shared between the various tenants, while the queue management services 120-130, span determination services 135-145, and channel management services 150-160 are implemented on a per-tenant basis.
- the management service 105 handles the management of the other services in the cluster 100. For instance, the management service 105 is responsible for instantiating and configuring additional API processing services 110 as needed, as well as starting up new sets of per-tenant queue management services, span determination services, and channel management services as new tenant logical networks are defined. The management service 105 is also responsible for stopping these services as needed (e.g., removing a set of per-tenant services when its corresponding tenant logical network is deleted). It should be noted that although the management service 105 is only shown communicating with the per-tenant services 120, 135, and 150 of tenant 1 (via the dashed lines), it actually communicates similarly with all of the other per-tenant services as well.
- the API processing services 110 receive configuration requests from tenants (e.g., specifying modifications to the tenant’s logical network).
- the ingress path for the container cluster 100 is handled by a gateway 165 that performs load balancing across the API processing services 110. That is, each configuration request is distributed by the gateway (which may be a single gateway or a gateway cluster) to one of the API processing services 110 in the container cluster 100 according to a load balancing algorithm.
- the gateway 165 executes within the same datacenter as the container cluster 100 (e.g., as a gateway managed at least partly by the cloud provider that owns the public cloud in which the container cluster 100 is implemented).
- Configuration requests are received at the gateway 165 from tenants via a network (e.g., the public Internet, a virtual private network, etc.).
- the tenant administrators use a network management client (e.g., provided by the network policy manager provider) that enables easy configuration of logical networks.
- the API processing services 110 are shared between tenants (i.e., are not instantiated on a per-tenant basis) and thus each of these API processing services 110 can receive configuration requests from multiple different tenants.
- the gateway / load balancer 165 also ensures that a single tenant cannot overload the system and prevent other tenants from being able to access the API processing services 110.
- the gateway 165 can be configured to detect when a single source is sending a large number of configuration requests and either throttle these requests or send all of them to the same API processing service 110 so that they back up at that service while the other services are able to process configuration requests from other tenants.
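The second option described above (sending all requests from a noisy source to the same API processing service) can be sketched as follows. This is a minimal illustration under stated assumptions: the patent does not name a load balancing algorithm or a detection rule, so round-robin selection, the sliding-window threshold, and the names `Gateway`, `max_requests`, and `window_seconds` are all hypothetical.

```python
import itertools
import time
from collections import defaultdict, deque


class Gateway:
    """Toy ingress gateway: round-robin, but pins a noisy source to one service."""

    def __init__(self, services, max_requests=3, window_seconds=1.0,
                 clock=time.monotonic):
        self._round_robin = itertools.cycle(list(services))
        self._max_requests = max_requests
        self._window = window_seconds
        self._clock = clock
        self._recent = defaultdict(deque)  # source -> timestamps of recent requests
        self._pinned = {}                  # source -> service it is pinned to

    def route(self, source):
        """Return the API processing service that should handle this request."""
        now = self._clock()
        recent = self._recent[source]
        recent.append(now)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and now - recent[0] > self._window:
            recent.popleft()
        noisy = len(recent) > self._max_requests
        if source in self._pinned:
            if noisy:
                # Requests back up at one service; the others stay free.
                return self._pinned[source]
            del self._pinned[source]  # the source has calmed down
        service = next(self._round_robin)
        if noisy:
            self._pinned[source] = service
        return service
```

The source is identified here by an opaque key (for example a source address); the gateway in the text does not inspect the configuration payload, and this sketch does not either.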
- the service 110 parses the request to identify (i) the tenant making the request and (ii) the logical network modification requested.
- the request may specify to add, remove, or change a logical network element (e.g., a logical forwarding element, logical port, policy rule, etc.) in some embodiments.
- the API processing service 110 posts the logical network configuration change to a configuration queue for the identified tenant as well as to the logical network configuration stored for the tenant in a shared database 170.
- the shared database 170, in some embodiments, is managed by one or more shared database services 115.
- the database 170 is a distributed database (e.g., Amazon DynamoDB® or a similar distributed database) that stores the logical network configuration for each tenant.
- the database 170 may be distributed across nodes in the container cluster 100 or stored outside of the cluster (but accessed by the database service 115).
- the database 170 may be organized so as to store separate sets of tables for each tenant, or in any other way, so long as the tenant logical network configurations are accessible.
- the logical network configuration data in the database tables for a particular tenant is expressed as a hierarchical tree of tenant intent.
- the hierarchical policy tree used to store a logical network configuration in some embodiments is described in U.S. Patent 11,381,456 and U.S. Patent 11,088,919, both of which are incorporated by reference above.
- the other services 120-160 that make up the multi-tenant policy manager are separated per tenant in some embodiments, rather than shared between tenants. These services include the queue management services 120-130, span determination services 135-145, and channel management services 150-160. As mentioned, a particular enterprise may have more than one separate logical network spanning different sets of (potentially overlapping) datacenters, which are treated as different tenants by the policy manager in some embodiments (and thus have separate corresponding sets of services).
- the queue management service 120-130 for a given tenant stores the logical network configuration changes requested for that tenant (as parsed by the API processing services 110) in a persistent configuration queue for the tenant. Because the API processing services 110 are shared, multiple different API processing services 110 may post configuration requests to the same queue. In some embodiments, the queue for a particular tenant logical network is created (and the queue management service instantiated to manage access to that queue) when the tenant logical network is initially defined within the policy manager (even if no configuration has yet been received). It should be noted that, in some embodiments, one or more shared queue management services are used in the network policy manager (instead of per-tenant services) to manage per-tenant configuration queues.
- the span determination services 135-145 each determine, for each logical network configuration change in the corresponding tenant’s queue, which of the datacenters spanned by the tenant logical network need to receive the update.
- certain logical network elements only span a subset of the datacenters spanned by the logical network (e.g., logical switches or routers defined only within a specific datacenter, policy rules defined only for a subset of datacenters, etc.).
- the span determination service 135-145 is responsible for making this determination for each update and providing the update to its corresponding channel management service 150-160 (i.e., the channel management service for the same tenant).
- the span determination service 135-145 also identifies, for each item of logical network configuration stored in the shared database 170, which tenant datacenters require that configuration item so that this span information can be stored in the database 170.
- the span calculations of some embodiments are described in U.S. Patent 11,381,456 and U.S. Patent 11,088,919, both of which are incorporated by reference above.
- the span determination services 135-145 can either be on-demand or dedicated services.
- the network policy manager restricts the span determination services to one or the other.
- the tenant can specify (e.g., via their subscription to the network policy manager) whether their span determination service should be on-demand (e.g., a function as a service) or dedicated.
- If a tenant specifies a dedicated span determination service (e.g., tenant 1 in the example shown in the figure), then this service 135 is instantiated (e.g., by the management service 105) either at the same time as the corresponding tenant queue or when a first logical network configuration change is added to that queue.
- Even if the queue has not received any configuration change requests over a period of time, the span determination service 135 remains operational.
- If a tenant instead specifies an on-demand span determination service, this service 140 is instantiated (e.g., by the management service 105) when the first logical network configuration change is added to the corresponding tenant queue but can be removed (e.g., by the management service 105) and its resources recovered for use in the cluster after a period of inactivity lasting a predetermined time.
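The difference between dedicated and on-demand span determination services can be sketched as a small lifecycle tracker. This is a hedged illustration only: the class name `SpanServiceLifecycle`, the `idle_timeout` value, and the boolean `running` flag (standing in for an actual service instance) are assumptions, and for simplicity both modes are instantiated on the first configuration change, which is only one of the two options the text gives for dedicated services.

```python
import time


class SpanServiceLifecycle:
    """Tracks whether a tenant's span determination service should be running."""

    def __init__(self, mode, idle_timeout=300.0, clock=time.monotonic):
        if mode not in ("dedicated", "on-demand"):
            raise ValueError(f"unknown mode: {mode!r}")
        self.mode = mode
        self.idle_timeout = idle_timeout
        self._clock = clock
        self.running = False
        self._last_activity = None

    def on_change_enqueued(self):
        """Called whenever a configuration change is added to the tenant queue."""
        if not self.running:
            self.running = True  # stands in for instantiating the service
        self._last_activity = self._clock()

    def reap_if_idle(self):
        """Remove an idle on-demand service; dedicated services are never removed."""
        idle = (
            self.running
            and self._clock() - self._last_activity >= self.idle_timeout
        )
        if self.mode == "on-demand" and idle:
            self.running = False  # resources are recovered for the cluster
        return self.running
```

In the architecture described, a component such as the management service would be the one calling something like `reap_if_idle` periodically; that wiring is omitted here.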
- the channel management services 150-160 each maintain dedicated channels with each of the datacenters spanned by their respective tenant’s logical network. Specifically, in some embodiments, the channel management service for each tenant maintains asynchronous channels with local network managers at each of the datacenters spanned by the tenant’s logical network. In some embodiments, each channel management service 150-160 includes queues for each of the tenant datacenters. When the corresponding span determination service 135-145 identifies the datacenters spanned by a particular logical network configuration change, that change is stored to each of the queues corresponding to those identified datacenters. The channel management service 150-160 then manages the transmission of these configuration changes from the respective queues to the respective local network managers.
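The hand-off from span determination to the per-datacenter queues can be sketched as below. The span rule used here is a placeholder assumption (a change either lists its span explicitly or spans every tenant datacenter); the actual span calculations are described in the patents incorporated by reference. The names `ChannelManager`, `determine_span`, and `dispatch` are hypothetical.

```python
from collections import deque


class ChannelManager:
    """Holds one outgoing queue per tenant datacenter."""

    def __init__(self, datacenters):
        self.queues = {dc: deque() for dc in datacenters}

    def enqueue(self, datacenter, change):
        self.queues[datacenter].append(change)


def determine_span(change, tenant_datacenters):
    """Placeholder rule: use the change's explicit span, else all datacenters."""
    span = change.get("span")
    if span is None:
        return set(tenant_datacenters)
    # Ignore datacenters that are not part of this tenant's logical network.
    return set(span) & set(tenant_datacenters)


def dispatch(change, tenant_datacenters, channel_manager):
    """Store the change in the queue of every datacenter it spans."""
    span = determine_span(change, tenant_datacenters)
    for datacenter in sorted(span):
        channel_manager.enqueue(datacenter, change)
    return span
```

Transmission from each queue to the corresponding local network manager (the asynchronous channel itself) is left out of the sketch.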
- the channel management service guarantees various connection parameters required for dissemination of this configuration data and receives notifications and/or certain configuration updates from the local network managers.
- the channel management service for a single tenant is also discussed in further detail in U.S. Patent 11,381,456 and U.S. Patent 11,088,919, both of which are incorporated by reference above.
- each per-tenant service has its own allocated resources (which may be dependent on the tenant’s subscription with the policy manager) and thus can continue processing any logical network configuration changes for its respective tenants even if another tenant’s services are overloaded.
- the number of services can be easily scaled as additional tenants are added to the policy manager.
- Figure 2 conceptually illustrates a process 200 of some embodiments for parsing a configuration change request and storing a configuration update at the network policy manager.
- the process 200 is performed by an API processing service of the network policy manager that is shared between the tenants of the policy manager.
- the process 200 will be described in part by reference to Figure 3, which conceptually illustrates an example of the processing of a change request received by the network policy manager.
- the process 200 begins by receiving (at 205) a configuration change request.
- this request has previously been received by the ingress gateway / load balancer for the network policy manager, which selects one of the shared API processing services of the network policy manager to process the request.
- Figure 3 shows an API gateway / load balancer 305 that initially receives a change request 310.
- the gateway 305 does not modify or process the change request 310, except to forward the data message(s) that carry the change request 310 to the shared API processing service 300 (i.e., the gateway 305 may modify L2-L4 headers of these data messages but does not perform processing on the payload of the data message relating to the logical network configuration).
- the gateway 305 does not assign the change request 310 to a specific tenant. Instead, the gateway 305 selects the API processing service 300 (from among the multiple shared API processing services of the network policy manager) and forwards the change request 310 to the selected API processing service 300.
- the process 200 parses (at 210) the received change request to identify (i) the tenant and (ii) the requested logical network configuration change.
- the tenant, in some embodiments, is specified using a unique identifier in the change request. Other embodiments identify the tenant based on the source of the change request (e.g., a source network address) or other identifying information.
- Logical network configuration changes can add, delete, or modify any aspect of the logical network and/or certain aspects of how that logical network is implemented across the physical datacenters.
- the logical network aspects can include logical forwarding elements (e.g., logical switches and/or routers), logical services (e.g., distributed firewall, network address translation, etc.), and/or security policy (e.g., security groups and/or security rules), as well as changes to the span of these elements.
- the logical network configuration changes can specify groups of physical devices that are eligible at specific datacenters for implementation of the logical network at those datacenters.
- the process 200 posts (at 215) the logical network configuration change, marked with an identifier of the tenant, to the shared database service for storage in the shared network policy manager database.
- this database stores the logical network configuration for all of the tenants, either within the network policy manager container cluster or separate from the cluster.
- the shared database service is responsible, in some embodiments, for accessing (i.e., storing data to and retrieving data from) this database.
- Figure 3 shows that the API processing service 300 provides configuration change data 315 to the shared database service 320. This configuration change data identifies the requested logical network configuration change as well as the tenant logical network to which it pertains, so that the database service 320 can differentiate the change from those of other tenants.
- the process 200 also posts (at 220) the logical network configuration change to the persistent queue of the identified tenant, then ends.
- this logical network configuration change need not specify the tenant.
- Figure 3 shows that, in addition to providing the configuration change 315 (with the tenant identifier for tenant N) to the shared database service 320, the API processing service 300 separately provides a configuration change data item 325 to the queue management service 330 for tenant N. The configuration change data item 325 is added to the queue for tenant N that is managed by this service 330 so that the data item 325 can be distributed to the required datacenters.
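The flow of process 200 (parse the request, post the tenant-tagged change to the shared database service, then post the change to that tenant's queue) can be illustrated with a minimal sketch. Everything named here (`SharedDatabaseService`, `TenantQueues`, `handle_change_request`, and the dictionary layout of a request) is a hypothetical stand-in chosen for illustration, not an interface described in the source.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class SharedDatabaseService:
    """Hypothetical stand-in for the shared database service (all tenants)."""
    rows: list = field(default_factory=list)

    def post(self, tenant_id, change):
        # Each stored change is tagged with its tenant so that one tenant's
        # configuration can be told apart from another's in the shared store.
        self.rows.append({"tenant": tenant_id, "change": change})


class TenantQueues:
    """Hypothetical stand-in for the per-tenant queue management services."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def post(self, tenant_id, change):
        # The queue is already tenant-specific, so the item carries no tenant tag.
        self._queues[tenant_id].append(change)

    def pending(self, tenant_id):
        return list(self._queues[tenant_id])


def handle_change_request(request, db, queues):
    """Sketch of operations 210-220: parse, post to database, post to queue."""
    tenant_id = request["tenant_id"]  # (i) identify the tenant
    change = request["change"]        # (ii) identify the requested change
    db.post(tenant_id, change)        # operation 215
    queues.post(tenant_id, change)    # operation 220
    return tenant_id


db = SharedDatabaseService()
queues = TenantQueues()
request = {
    "tenant_id": "tenant-N",
    "change": {"op": "add", "type": "logical_switch", "id": "ls-1"},
}
handle_change_request(request, db, queues)
```

In this sketch the database row carries the tenant identifier while the queued item does not, mirroring the distinction drawn between data 315 and data item 325.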
- Figure 4 conceptually illustrates a process 400 of some embodiments for determining the span of a configuration change and enqueuing that configuration change to be sent to each datacenter spanned by the configuration change.
- the process 400 is performed by a span management service for a specific tenant within the multi-tenant network policy manager, such as that described above by reference to Figure 1.
- the process 400 will be described in part by reference to Figure 5, which conceptually illustrates an example of the processing of a configuration change by a span determination service.
- the process 400 begins by, at the span determination service for a particular tenant, identifying (at 405) a new configuration change in the corresponding queue for the particular tenant.
- the queue management service notifies the corresponding span determination service each time a new configuration change is added to the queue.
- alternatively, a notification is passed through the management service for the network policy manager, or the span determination service regularly polls the corresponding queue management service to determine whether any configuration changes are present in the queue.
- the process 400 then retrieves (at 410) this configuration change from the queue.
- when the span determination service has finished processing any previous configuration changes and has been notified that there is at least one configuration change pending in its queue, it sends a retrieval request to the corresponding queue management service to retrieve the next configuration change in the queue for that tenant.
- Figure 5 shows that the queue management service 505 for tenant N provides a configuration change 510 to the span determination service 500 for tenant N (e.g., based on a retrieval request from the span determination service 500 for the next configuration change in the queue).
- the queue management service 505 also removes this configuration change 510 from the queue, such that the next request from the span determination service 500 will retrieve a different configuration change.
- the process 400 determines (at 415) the one or more datacenters spanned by the configuration change.
- the span determination service interacts with the shared database to make this determination, as the span for a particular logical network element may depend on other logical network elements.
- the span determination service stores a mapping of existing logical network elements to groups of datacenters and can use this mapping for changes that do not affect the span of a logical network element (e.g., connecting a logical switch to an existing logical router with a known span).
- for other changes, assessment of the hierarchical policy tree for the logical network is required, and thus the span determination service needs to interact with the shared database service to retrieve at least a portion of the existing logical network configuration.
- the process 400 provides a copy of the configuration change to the channel management service for the particular tenant for each datacenter spanned by the change, so that the channel management service can distribute the configuration change to the local network managers at these datacenters.
- the process 400 then ends.
- Different embodiments provide this configuration change differently when multiple datacenters require the change.
- in some embodiments, the span determination service provides a separate copy of the configuration change to the channel management service for each datacenter, with each copy marked for a specific datacenter.
- in other embodiments, the span determination service provides a single copy of the change along with the list of datacenters to the channel management service, which replicates the change for each datacenter.
- Figure 5 shows this latter option, as the span determination service 500 provides the configuration change 515 with a list of identified datacenters to the channel management service 520 for tenant N.
- This channel management service 520 stores separate queues for each of the four datacenters spanned by tenant N’s logical network, and the configuration change is added to the queues for each of the datacenters identified by the span determination service 500. These changes will then be distributed by the channel management service 520 to the local network managers at the identified datacenters via the respective asynchronous communication channels.
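Process 400 and the Figure 5 hand-off can be sketched as follows. This is a simplified illustration under stated assumptions: the `KNOWN_SPANS` mapping, the `attach_to` field, and the class and function names are all hypothetical, and only the simple case is modeled in which the change attaches to an element whose span is already known (the case that requires consulting the policy tree in the shared database is omitted).

```python
from collections import deque

# Hypothetical span mapping: logical network element -> datacenters it spans.
KNOWN_SPANS = {
    "router-1": {"dc-east", "dc-west"},
    "router-2": {"dc-north"},
}


def determine_span(change, known_spans):
    """Sketch of operation 415 for the known-span case only."""
    return set(known_spans[change["attach_to"]])


class ChannelManagementService:
    """Keeps one outgoing queue per datacenter for a single tenant."""

    def __init__(self, datacenters):
        self.queues = {dc: deque() for dc in datacenters}

    def enqueue(self, change, datacenters):
        # Single copy plus a datacenter list (the option shown in Figure 5):
        # this service replicates the change into each datacenter's queue.
        for dc in sorted(datacenters):
            self.queues[dc].append(dict(change))


def process_next_change(tenant_queue, known_spans, channel_service):
    """Sketch of operations 405-415 and the hand-off to the channel service."""
    change = tenant_queue.popleft()  # retrieval also removes it from the queue
    span = determine_span(change, known_spans)
    channel_service.enqueue(change, span)
    return span


tenant_queue = deque([{"op": "add", "id": "ls-7", "attach_to": "router-1"}])
channels = ChannelManagementService(["dc-east", "dc-west", "dc-north", "dc-south"])
span = process_next_change(tenant_queue, KNOWN_SPANS, channels)
```

The alternative described above, in which the span determination service itself emits one marked copy per datacenter, would move the loop in `enqueue` into `process_next_change`.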
- a datacenter spanned by one of the tenant logical networks will require a complete synchronization of the logical network for that tenant (i.e., at least the portion of the logical network that spans to that datacenter). This can be a fairly resource-intensive process, as the entire logical network configuration relating to that particular datacenter needs to be streamed from the database to the local network manager for that datacenter.
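- the policy manager (e.g., the management service of the policy manager) instantiates an on-demand configuration streaming service to handle this complete synchronization.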
- this on-demand service may be instantiated within the container cluster or in a different container cluster (so as to avoid overloading resources of the primary container cluster housing the network policy manager).
- Figure 6 conceptually illustrates a process 600 of some embodiments for managing a complete synchronization of the logical network configuration for a particular datacenter.
- the process 600 is performed by the management service of a network policy manager in response to a trigger that indicates the need for the complete configuration synchronization.
- the process 600 will be described in part by reference to Figures 7 and 8, which conceptually illustrate the instantiation, operation, and deletion of an on-demand configuration streaming service for the network policy manager.
- Figure 7 conceptually illustrates the instantiation and operation of an on-demand streaming service to perform a complete synchronization of a logical network configuration for a tenant datacenter over three stages 705-715.
- Figure 8 conceptually illustrates the deletion of that on-demand streaming service over two stages 805-810 once the synchronization is complete.
- the process 600 begins by determining (at 605) that a particular datacenter for a particular tenant (i.e., spanned by the particular tenant’s logical network) requires a complete synchronization of its logical network configuration from the global network policy manager.
- the logical network configuration synchronization may be required for various different reasons. For instance, if connectivity is lost between a tenant’s channel management service and the local network manager for a particular datacenter, upon restoration of that connectivity, the channel management service notifies the management service that is responsible for instantiating the on-demand configuration streaming service in some embodiments.
- the synchronization might be required due to the receipt of an API command (i.e., via the shared API policy processing services) to synchronize the configuration for a particular datacenter or when a new datacenter is added to the span of a tenant logical network (as a certain portion of the tenant logical network may automatically span to this new datacenter).
- the first stage 705 of Figure 7 shows that the channel management service 720 for tenant K sends a reconnection notification 725 to the shared manager service 700 of the network policy manager.
- This reconnection notification 725 specifies the datacenter (e.g., using a unique datacenter identifier) for which the configuration synchronization is required.
- the reconnection notification 725 also includes a unique tenant identifier.
- the process 600 determines (at 610) whether an on-demand service can currently be instantiated for the particular tenant.
- Some embodiments limit the number of concurrently instantiated on-demand configuration streaming services, therefore limiting the number of datacenters for which the configurations can be synchronized at the same time (e.g., to 5, 10, 20, etc. concurrent synchronizations). This limits the strain on the shared database. While ideally the need for complete synchronizations would be rare, if the network policy manager were to lose external connectivity or a tenant simultaneously added a large number of new datacenters to its logical network, overloads could occur.
- some embodiments limit the number of concurrent synchronizations for a particular tenant to a smaller number or otherwise ensure that this maximum number of concurrently instantiated configuration streaming services is spread fairly among all of the tenants. For instance, a single tenant might be limited to a maximum that is a particular percentage (50%, 67%, etc.) of the total allowed number of concurrent synchronizations, with this per-tenant maximum variable based on the tenant subscription.
- if an on-demand service cannot currently be instantiated for the particular tenant, the process 600 holds off (at 615) on synchronizing the configuration for the particular datacenter. While the process 600 is shown as returning to 610 to continuously check whether the service can be instantiated for the tenant, it should be understood that this is a conceptual process, and the management service of the network policy manager may handle this situation differently in different embodiments.
- the management service stores a queue of required synchronizations and is configured to instantiate an on-demand service for the next synchronization in the queue once one of the ongoing synchronizations is complete (although if the first synchronization in the queue is for a tenant at their per-tenant maximum and the completed synchronization is for a different tenant, then the first synchronization for an eligible tenant is begun instead).
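The admission logic described above (a global cap, a smaller per-tenant cap, and a waiting queue from which the first eligible synchronization is started when a slot frees up) can be sketched as below. The class name `SyncScheduler`, its methods, and the specific cap values are assumptions for illustration only; the source does not prescribe a data structure.

```python
from collections import Counter, deque


class SyncScheduler:
    """Admits full synchronizations under a global and a per-tenant cap."""

    def __init__(self, max_total, max_per_tenant):
        self.max_total = max_total
        self.max_per_tenant = max_per_tenant
        self.active = Counter()  # tenant -> number of running synchronizations
        self.waiting = deque()   # (tenant, datacenter) in arrival order
        self.started = []        # log of synchronizations allowed to start

    def _can_start(self, tenant):
        return (sum(self.active.values()) < self.max_total
                and self.active[tenant] < self.max_per_tenant)

    def _start(self, tenant, datacenter):
        self.active[tenant] += 1
        self.started.append((tenant, datacenter))

    def request(self, tenant, datacenter):
        """Sketch of operation 610: start now if allowed, else hold off (615)."""
        if self._can_start(tenant):
            self._start(tenant, datacenter)
            return True
        self.waiting.append((tenant, datacenter))
        return False

    def complete(self, tenant):
        """Free a slot, then start the first waiting sync that is eligible."""
        self.active[tenant] -= 1
        for i, (t, dc) in enumerate(self.waiting):
            if self._can_start(t):
                del self.waiting[i]
                self._start(t, dc)
                break


sched = SyncScheduler(max_total=2, max_per_tenant=1)
r1 = sched.request("A", "dc-1")  # starts
r2 = sched.request("A", "dc-2")  # waits: tenant A is at its per-tenant cap
r3 = sched.request("B", "dc-1")  # starts
r4 = sched.request("C", "dc-1")  # waits: global cap reached
sched.complete("B")              # A is still capped, so C starts instead
```

In the usage above, tenant A's second request stays in the queue after tenant B finishes because A is still at its per-tenant maximum, so the first eligible entry (tenant C) is started instead.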
- once an on-demand service can be instantiated for the particular tenant, the process 600 instantiates (at 620) a configuration streaming service for the particular tenant datacenter.
- the management service instantiates the configuration streaming service by communicating with a cluster control plane (e.g., a Kubernetes control plane) to instantiate a Pod for the on-demand streaming service.
- the management service also configures this new service by specifying the particular tenant and datacenter that requires the configuration synchronization.
- the second stage 710 of Figure 7 shows that the management service 700, having received the reconnection notification 725, instantiates and configures an on-demand configuration streaming service 730, providing this service with configuration indicating the datacenter spanned by tenant K’s logical network that requires synchronization.
- the on-demand configuration streaming service retrieves the required logical network configuration data from the configuration database. If the database is organized with the span information stored for each logical network element, then the configuration streaming service can retrieve only the configuration data that spans to the specified datacenter. On the other hand, if this span information is not stored in the database, then the configuration streaming service of some embodiments retrieves all of the logical network data for the tenant and performs its own span calculations to determine which data should be streamed to the particular datacenter.
- the configuration streaming service provides this data to the channel management service for the particular tenant, and the channel management service streams the logical network configuration data via its connection with the local network manager at the particular datacenter.
- the logical network configuration is streamed as a series of configuration changes (in addition to start and stop indicators) that are added to the channel management service queue for the particular datacenter and then transmitted by the channel management service.
- in other embodiments, the configuration streaming service uses the connection maintained by the channel management service but bypasses its datacenter queue.
- the third stage 715 of Figure 7 shows that the configuration streaming service 730 retrieves the logical network configuration 735 for the specified datacenter of tenant K from the distributed configuration database 740 via the shared database service 745 of the network policy manager.
- if the database 740 stores the span for each logical network element, then the database service 745 can provide the configuration streaming service 730 with only the logical network configuration required for the specified datacenter.
- the configuration streaming service 730 streams this logical network configuration 735 to the channel management service 750, which transmits the logical network configuration data to the local network manager for the specified datacenter via its communication channel with the local network manager.
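The retrieval step of the on-demand configuration streaming service can be sketched as a generator that emits a start indicator, the configuration items that span to the target datacenter, and a stop indicator. The row layout, the message dictionaries, and the `span_of` fallback are hypothetical; they only illustrate the two cases described above (span stored in the database versus span computed by the streaming service).

```python
def stream_full_config(db_rows, tenant, datacenter, span_of=None):
    """Yield the messages for a complete synchronization of one datacenter.

    db_rows: iterable of dicts with "tenant", "element", and optionally "span".
    span_of: hypothetical fallback, callable(element) -> set of datacenters,
             used only when a row carries no stored span.
    """
    yield {"type": "start", "datacenter": datacenter}
    for row in db_rows:
        if row["tenant"] != tenant:
            continue  # skip configuration that belongs to other tenants
        span = row.get("span")
        if span is None and span_of is not None:
            span = span_of(row["element"])  # streaming service computes span
        if span and datacenter in span:
            yield {"type": "change", "element": row["element"]}
    yield {"type": "stop", "datacenter": datacenter}


rows = [
    {"tenant": "K", "element": "ls-1", "span": {"dc-1", "dc-2"}},
    {"tenant": "K", "element": "ls-2", "span": {"dc-2"}},
    {"tenant": "J", "element": "ls-9", "span": {"dc-1"}},
    {"tenant": "K", "element": "lr-1"},  # no stored span for this element
]
messages = list(stream_full_config(rows, "K", "dc-1", span_of=lambda e: {"dc-1"}))
```

A consumer standing in for the channel management service could either append these messages to its per-datacenter queue or send them directly over its existing connection, matching the two variants described above.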
- the synchronization process of some embodiments is described in greater detail in U.S. Patent 11,088,902, which is incorporated herein by reference.
- once the synchronization is complete, the on-demand configuration streaming service should be stopped (and deleted) in order to free up resources (potentially for additional synchronizations with other datacenters).
- the process 600 receives (at 625) confirmation that the synchronization is complete. It should be understood that the process 600 is a conceptual process and that between operation 620 and 625 the management service of the network policy manager may perform various other operations relating to other configuration streaming services or other operations of the network policy manager.
- the on-demand configuration streaming service sends this notification to the management service.
- the first stage 805 of Figure 8 shows the configuration streaming service 730 sending such a notification 815 to the management service 700 to indicate that the logical network configuration for the particular datacenter has been sent to the local network manager.
- the notification may be sent after the streaming service has provided all of the data to the channel management service or the channel management service has notified the configuration streaming service that all of the logical network configuration data is transmitted to the local network manager for the particular datacenter (either after the data is transmitted or after receiving a notification from the local network manager).
- in other embodiments, the channel management service notifies the management service once the transmission of all of the data is completed, either after transmitting the last data or after receiving an acknowledgement from the local network manager that all of the data has been received.
- the process 600 stops and removes (at 630) the configuration streaming service, then ends.
- the management service stops the configuration streaming service by communicating with a cluster control plane (e.g., a Kubernetes control plane) to remove the Pod implementing the on-demand streaming service.
- in other embodiments, the management service stops and deletes the service (and its Pod) directly.
- the second stage 810 of Figure 8 shows that the management service 700, upon receipt of the completion notification 815, stops and deletes the configuration streaming service 730, thereby freeing up the resources used by that service.
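The lifecycle of the on-demand service in process 600 (instantiate and configure at 620, then stop and remove at 630 once completion is confirmed) can be sketched with an in-memory stand-in for the cluster control plane. `FakeControlPlane`, its `create_pod` and `delete_pod` methods, and the Pod naming scheme are invented for illustration and do not correspond to any real Kubernetes client API.

```python
class FakeControlPlane:
    """In-memory stand-in for a cluster control plane (not a real API)."""

    def __init__(self):
        self.pods = {}

    def create_pod(self, name, config):
        self.pods[name] = config

    def delete_pod(self, name):
        del self.pods[name]


class ManagementService:
    """Hypothetical management service that owns on-demand streaming Pods."""

    def __init__(self, control_plane):
        self.control_plane = control_plane

    def on_sync_required(self, tenant, datacenter):
        """Sketch of operation 620: instantiate and configure the service."""
        name = f"config-stream-{tenant}-{datacenter}"
        self.control_plane.create_pod(
            name, {"tenant": tenant, "datacenter": datacenter}
        )
        return name

    def on_sync_complete(self, name):
        """Sketch of operations 625-630: on confirmation, remove the service."""
        self.control_plane.delete_pod(name)


plane = FakeControlPlane()
manager = ManagementService(plane)
pod_name = manager.on_sync_required("tenant-K", "dc-3")
pods_during_sync = dict(plane.pods)
manager.on_sync_complete(pod_name)
```

Removing the Pod as soon as completion is confirmed is what frees a slot for any synchronization that was held off by the concurrency limits.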
- FIG. 9 conceptually illustrates an electronic system 900 with which some embodiments of the invention are implemented.
- the electronic system 900 may be a computer (e.g., a desktop computer, personal computer, tablet computer, server computer, mainframe, a blade computer etc.), phone, PDA, or any other sort of electronic device.
- Such an electronic system includes various types of computer readable media and interfaces for various other types of computer readable media.
- Electronic system 900 includes a bus 905, processing unit(s) 910, a system memory 925, a read-only memory 930, a permanent storage device 935, input devices 940, and output devices 945.
- the bus 905 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the electronic system 900.
- the bus 905 communicatively connects the processing unit(s) 910 with the read-only memory 930, the system memory 925, and the permanent storage device 935.
- the processing unit(s) 910 retrieve instructions to execute and data to process in order to execute the processes of the invention.
- the processing unit(s) may be a single processor or a multi-core processor in different embodiments.
- the read-only-memory (ROM) 930 stores static data and instructions that are needed by the processing unit(s) 910 and other modules of the electronic system.
- the permanent storage device 935 is a read-and-write memory device. This device is a nonvolatile memory unit that stores instructions and data even when the electronic system 900 is off. Some embodiments of the invention use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 935.
- the system memory 925 is a read-and-write memory device. However, unlike storage device 935, the system memory is a volatile read-and-write memory, such as a random-access memory.
- the system memory stores some of the instructions and data that the processor needs at runtime.
- the invention’s processes are stored in the system memory 925, the permanent storage device 935, and/or the read-only memory 930. From these various memory units, the processing unit(s) 910 retrieve instructions to execute and data to process in order to execute the processes of some embodiments.
- the bus 905 also connects to the input and output devices 940 and 945.
- the input devices enable the user to communicate information and select commands to the electronic system.
- the input devices 940 include alphanumeric keyboards and pointing devices (also called “cursor control devices”).
- the output devices 945 display images generated by the electronic system.
- the output devices include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD). Some embodiments include devices such as a touchscreen that function as both input and output devices.
- bus 905 also couples electronic system 900 to a network 965 through a network adapter (not shown).
- the computer can be a part of a network of computers (such as a local area network (“LAN”), a wide area network (“WAN”), or an Intranet), or a network of networks, such as the Internet. Any or all components of electronic system 900 may be used in conjunction with the invention.
- Some embodiments include electronic components, such as microprocessors, storage and memory that store computer program instructions in a machine-readable or computer-readable medium (alternatively referred to as computer-readable storage media, machine-readable media, or machine-readable storage media).
- Computer-readable media include RAM, ROM, read-only compact discs (CD-ROM), recordable compact discs (CD-R), rewritable compact discs (CD-RW), read-only digital versatile discs (e.g., DVD-ROM, dual-layer DVD-ROM), a variety of recordable/rewritable DVDs (e.g., DVD-RAM, DVD-RW, DVD+RW, etc.), flash memory (e.g., SD cards, mini-SD cards, micro-SD cards, etc.), magnetic and/or solid state hard drives, read-only and recordable Blu-Ray® discs, ultra-density optical discs, any other optical or magnetic media, and floppy disks.
- the computer-readable media may store a computer program that is executable by at least one processing unit and includes sets of instructions for performing various operations. Examples of computer programs or computer code include machine code, such as is produced by a compiler, and files including higher-level code that are executed by a computer, an electronic component, or a microprocessor.
- some embodiments are performed by one or more integrated circuits, such as application specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs). Such integrated circuits execute instructions that are stored on the circuit itself.
- the terms “computer”, “server”, “processor”, and “memory” all refer to electronic or other technological devices. These terms exclude people or groups of people.
- the terms “display” or “displaying” mean displaying on an electronic device.
- the terms “computer readable medium,” “computer readable media,” and “machine readable medium” are entirely restricted to tangible, physical objects that store information in a form that is readable by a computer. These terms exclude any wireless signals, wired download signals, and any other ephemeral signals.
- data compute nodes (DCNs), also referred to as addressable nodes, may include non-virtualized physical hosts, virtual machines, containers that run on top of a host operating system without the need for a hypervisor or separate operating system, and hypervisor kernel network interface modules.
- VMs, in some embodiments, operate with their own guest operating systems on a host using resources of the host virtualized by virtualization software (e.g., a hypervisor, virtual machine monitor, etc.).
- the tenant i.e., the owner of the VM
- Some containers are constructs that run on top of a host operating system without the need for a hypervisor or separate guest operating system.
- the host operating system uses name spaces to isolate the containers from each other and therefore provides operating-system level segregation of the different groups of applications that operate within different containers.
- This segregation is akin to the VM segregation that is offered in hypervisor-virtualized environments that virtualize system hardware, and thus can be viewed as a form of virtualization that isolates different groups of applications that operate in different containers.
- Such containers are more lightweight than VMs.
- A hypervisor kernel network interface module, in some embodiments, is a non-VM DCN that includes a network stack with a hypervisor kernel network interface and receive/transmit threads.
- one example of a hypervisor kernel network interface module is the vmknic module that is part of the ESXi™ hypervisor of VMware, Inc.
- the examples given could be any type of DCNs, including physical hosts, VMs, non-VM containers, and hypervisor kernel network interface modules.
- the example networks could include combinations of different types of DCNs in some embodiments.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23730993.5A EP4559163A1 (en) | 2022-07-20 | 2023-05-14 | Sharing network manager between multiple tenants |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/869,640 US20240031229A1 (en) | 2022-07-20 | 2022-07-20 | Synchronization of logical network configuration between multi-tenant network manager and local network manager |
| US17/869,637 | 2022-07-20 | ||
| US17/869,640 | 2022-07-20 | ||
| US17/869,637 US12107722B2 (en) | 2022-07-20 | 2022-07-20 | Sharing network manager between multiple tenants |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024019791A1 (en) | 2024-01-25 |
Family
ID=86771439
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2023/022191 Ceased WO2024019791A1 (en) | 2022-07-20 | 2023-05-14 | Sharing network manager between multiple tenants |
Country Status (2)
| Country | Link |
|---|---|
| EP (1) | EP4559163A1 (en) |
| WO (1) | WO2024019791A1 (en) |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11088902B1 (en) | 2020-04-06 | 2021-08-10 | Vmware, Inc. | Synchronization of logical network state between global and local managers |
| US11088919B1 (en) | 2020-04-06 | 2021-08-10 | Vmware, Inc. | Data structure for defining multi-site logical network |
| WO2022132233A1 (en) * | 2020-12-15 | 2022-06-23 | Google Llc | Multi-tenant control plane management on computing platform |
| US11381456B2 (en) | 2020-04-06 | 2022-07-05 | Vmware, Inc. | Replication of logical network data between global managers |
-
2023
- 2023-05-14 EP EP23730993.5A patent/EP4559163A1/en active Pending
- 2023-05-14 WO PCT/US2023/022191 patent/WO2024019791A1/en not_active Ceased
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11088902B1 (en) | 2020-04-06 | 2021-08-10 | Vmware, Inc. | Synchronization of logical network state between global and local managers |
| US11088919B1 (en) | 2020-04-06 | 2021-08-10 | Vmware, Inc. | Data structure for defining multi-site logical network |
| US20210367834A1 (en) * | 2020-04-06 | 2021-11-25 | Vmware, Inc. | Synchronization of logical network state between global and local managers |
| US11381456B2 (en) | 2020-04-06 | 2022-07-05 | Vmware, Inc. | Replication of logical network data between global managers |
| WO2022132233A1 (en) * | 2020-12-15 | 2022-06-23 | Google Llc | Multi-tenant control plane management on computing platform |
Also Published As
| Publication number | Publication date |
|---|---|
| EP4559163A1 (en) | 2025-05-28 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23730993; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023730993; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2023730993; Country of ref document: EP; Effective date: 20250220 |
| | WWP | Wipo information: published in national office | Ref document number: 2023730993; Country of ref document: EP |