JP2025090892A - Container System

Info

Publication number: JP2025090892A
Application number: JP2023205757A
Authority: JP
Inventors: 亮輔馬場; Ryosuke Baba
Original assignee: Fuji Electric Co Ltd
Current assignee: Fuji Electric Co Ltd
Priority date: 2023-12-06
Filing date: 2023-12-06
Publication date: 2025-06-18

Abstract

To provide a method for automatically and appropriately changing a setting of a container when an error occurs on the container.

SOLUTION: A container system includes: a container capable of executing an application; a container management unit; a container execution unit; a log storage unit; and a container setting change unit. The container management unit manages setting information related to setting of the container. The container execution unit operates the container according to the setting information. The log storage unit stores log information representing an event occurring in the container. The container setting change unit changes the setting information by referring to the log information when an error occurs on the container. The container execution unit operates the container based on the setting information changed by the container setting change unit.

SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for handling errors or abnormalities that occur on a container.

In recent years, application management technologies that use virtualized containers have been actively developed and put to use. Containers allow one or more execution environments to be placed on a single OS (Operating System). In addition, many containers provide a function for automatic recovery when an application running on the container stops or terminates abnormally. A method has also been proposed for estimating the optimal placement of virtual machines/containers on a server selected based on priority (see, for example, Patent Document 1).

Patent Document 1: International Publication No. 2022/172385

As described above, when an application running on a container stops or terminates abnormally, recovery processing is performed automatically. However, if recovery processing is performed while the container settings remain incorrect, the error may recur and the restart procedure may be repeated. For this reason, when an application running on a container stops or terminates abnormally, in many cases the user or administrator currently changes the container settings manually.

An object of one aspect of the present invention is to provide a method for automatically and appropriately changing the settings of a container when an error occurs on the container.

A container system according to one aspect of the present invention includes: a container capable of executing an application; a container management unit that manages setting information related to the settings of the container; a container execution unit that operates the container according to the setting information; a log storage unit that stores log information representing events that occur in the container; and a container setting change unit that changes the setting information by referring to the log information when an error occurs on the container. The container execution unit operates the container based on the setting information changed by the container setting change unit.

According to the above aspect, when an error occurs on a container, the settings of that container are automatically and appropriately changed.

Figure 1 is a diagram illustrating an example of the functional configuration of a container system according to an embodiment of the present invention.
Figure 2 is a diagram illustrating an example of a hierarchical model of the container system.
Figure 3 is a diagram illustrating an example of an operation sequence of the container system.
Figure 4 is a diagram illustrating an example of an operation sequence of the container system when an error occurs on a container.
Figure 5 is a diagram illustrating an example of log information.
Figure 6 is a diagram illustrating an example of setting change management information.
Figure 7 is a diagram illustrating an example of setting information before a change.
Figure 8 is a diagram illustrating an example of setting information after a change.
Figure 9 is a flowchart illustrating an example of the processing of the container management unit.
Figure 10 is a flowchart illustrating an example of the processing of the container setting change unit.
Figure 11 is a diagram illustrating an example of the hardware configuration of the container system.

Figure 1 shows an example of the functional configuration of a container system according to an embodiment of the present invention. The container system 1 according to the embodiment includes an application execution platform 10, a log management platform 20, and an application management platform 30. The container system 1 may further include other functions not shown in Figure 1.

The application execution platform 10 includes a plurality of containers 11 (11a to 11n) and a container execution unit 12. Each container 11 provides a virtual execution environment on an OS that is common to the containers 11. Hardware resources are allocated to each container 11. Specifically, at least processor resources and memory resources are allocated to each container 11. Each container 11 then executes one or more applications using the allocated hardware resources. While doing so, each container 11 creates log information representing events that occur in that container. The log information is created, for example, periodically. Log information may also be created when an unexpected event, such as an error, occurs.

The container execution unit 12 deploys the containers 11 according to the setting information provided by the application management platform 30. For example, the container execution unit 12 allocates the hardware resources specified by the setting information to the corresponding container 11. The container execution unit 12 also causes the corresponding container 11 to execute the application specified by the setting information.

The log management platform 20 includes a log storage unit 21. The log storage unit 21 stores the log information created by each container 11 in chronological order. The log information is preferably stored separately for each container 11.

The application management platform 30 includes a container management unit 31 and a container setting change unit 32. The container management unit 31 manages setting information related to the settings of the containers 11. The setting information is created for each container by, for example, a user of the container system 1. The setting information includes, for example, the following items:
(1) Image
(2) Port
(3) CPU
(4) Memory
(5) Number of replicas
(6) User permissions

The image represents the image of the application executed on the container 11. The port represents the port used by the application. The CPU represents the upper limit of the CPU usage of the container 11 (that is, the CPU allocated to the container 11). The memory represents the upper limit of the memory usage of the container 11 (that is, the memory allocated to the container 11). The number of replicas represents the number of replicas of the application. The user permissions represent, for example, the commands that the user is allowed to use.
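The six items of setting information listed above can be pictured as a simple record. The following is only a minimal sketch: the class name, field names, and the sample values other than "200Mi" are hypothetical and are not taken from the publication.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerSettings:
    """Hypothetical per-container setting information (all names are assumptions)."""
    image: str          # (1) image of the application executed on the container
    port: int           # (2) port used by the application
    cpu_limit: str      # (3) upper limit of CPU usage (notation is illustrative)
    memory_limit: str   # (4) upper limit of memory usage, e.g. "200Mi"
    replicas: int = 1   # (5) number of application replicas
    user_permissions: list[str] = field(default_factory=list)  # (6) commands the user may use


# Illustrative settings for container 11a; only the memory value appears in the text.
settings_11a = ContainerSettings(
    image="example-app:1.0",
    port=8080,
    cpu_limit="1",
    memory_limit="200Mi",
)
```

Keeping the settings in one record per container mirrors the description that setting information is created and managed for each container individually.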

The container management unit 31 provides the setting information created for each container 11 to the container execution unit 12. As a result, each container 11 operates according to its setting information. For example, when the setting information of the container 11a specifies "Memory: 200Mi" and the setting information of the container 11b specifies "Memory: 500Mi", the container execution unit 12 allocates 200Mi of memory resources to the container 11a and 500Mi of memory resources to the container 11b.

The container setting change unit 32 can change the setting information managed by the container management unit 31. In this example, when an application running on a container 11 terminates abnormally, the container setting change unit 32 refers to the log information stored in the log storage unit 21 and changes the corresponding setting information. For example, when an application running on the container 11a terminates abnormally, the container setting change unit 32 refers to the log information related to the container 11a and changes the setting information of the container 11a.

Figure 2 shows an example of a hierarchical model of the container system 1. The container system 1 is composed of hardware resources 41 for providing a plurality of execution environments, an OS 42, a container engine 43, and the containers 11 (11a to 11n). The hardware resources 41 correspond to the CPU and memory provided to implement the container system 1. Here, the hardware resources 41 include a plurality of CPU cores (or a plurality of processor cores). The hardware resources 41 may also include a plurality of memory devices. The OS 42 is not particularly limited, and any OS can be used. The container engine 43 is provided between the OS 42 and the containers 11, and corresponds to the container management unit 31, the container setting change unit 32, and the container execution unit 12. The container engine 43 also includes an interface for accessing the log storage unit 21.

Each container 11 is given one or more applications to execute. Each container 11 preferably also includes various libraries.

Each container 11 has a monitoring function that monitors the state (stopped, starting, running, abnormal, and so on) of the applications running on that container. Each container 11 also has a logging function. That is, the container 11 uses the monitoring function and the logging function to create log information representing events that occur in that container. The log information is created, for example, periodically. Alternatively, log information may be created when an unexpected event, such as abnormal termination of an application, occurs. Each container 11 further has a communication function. That is, the container 11 accepts instructions or settings from the container engine 43. The container 11 also transmits the created log information to the log storage unit 21.

Figure 3 shows an example of an operation sequence of the container system 1. Although only one container 11 is depicted in the example shown in Figure 3, in practice the containers 11a to 11n each operate individually. It is also assumed that the setting information for each container 11 has been created in advance and is managed by the container management unit 31.

First, the container management unit 31 transmits the setting information of the container 11 to the container execution unit 12. As described above, the setting information includes information representing, for example, the image, port, CPU, memory, number of replicas, and user permissions. The container execution unit 12 deploys the container 11 according to the setting information provided by the container management unit 31. For example, the container execution unit 12 allocates the hardware resources specified by the setting information to the container 11.

The container 11 uses the monitoring function to monitor the state of the applications deployed on it. The monitoring function may periodically send a liveness check signal to an application running on the container and determine whether the application is operating normally based on whether a reply is received. The container 11 then periodically creates status information representing the state of the application and transmits it to the container management unit 31. The status information includes, for example, time information and information representing the state of the application (stopped, starting, running, abnormal, and so on). The container management unit 31 can therefore detect the state of the applications deployed on the container 11. For example, when an error occurs in an application deployed on the container 11, the container management unit 31 can detect the error from the status information.
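The liveness check and the resulting status information could be sketched as follows. This is an illustration under stated assumptions: the `StatusInfo` record, the `check_application` function, and the idea of passing the probe as a callable are all hypothetical, and only two of the states named in the text are modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass
class StatusInfo:
    """Hypothetical status information sent to the container management unit."""
    container_id: str
    timestamp: datetime  # time information
    state: str           # "running" or "abnormal" in this sketch


def check_application(container_id: str, probe: Callable[[], bool]) -> StatusInfo:
    """Send one liveness check and report the application state.

    `probe` stands in for the liveness check signal: it returns True when the
    application replies. A missing reply (False or an exception) is treated
    as an abnormal state.
    """
    try:
        alive = probe()
    except Exception:
        alive = False
    state = "running" if alive else "abnormal"
    return StatusInfo(container_id, datetime.now(timezone.utc), state)
```

In a periodic monitor this function would be called on a timer, and each returned record would be forwarded to the management side.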

The container 11 also periodically creates log information and transmits it to the log storage unit 21. In addition, when an unexpected event (for example, an error such as abnormal termination of an application) occurs, the container 11 transmits log information representing that event to the log storage unit 21. The log storage unit 21 stores the log information received from the container 11 in chronological order.

The transmission of status information from the container 11 to the container management unit 31 and the transmission of log information from the container 11 to the log storage unit 21 are preferably synchronized with each other. However, these operations do not necessarily have to be synchronized.

In this way, in the container system 1, the specified application is executed on the container 11 according to the setting information, and the log information of each container 11 is stored in chronological order in the log storage unit 21.

Figure 4 shows an example of an operation sequence of the container system 1 when an error occurs on a container. In this example, it is assumed that an error occurs while an application is running on the container 11 and that the application terminates abnormally.

When the container 11 detects abnormal termination of an application, it notifies the container management unit 31 of this by means of the status information. In this case, the status information includes information indicating that the application terminated abnormally and information indicating the time at which it terminated. The container 11 also creates log information related to the abnormal termination of the application and transmits it to the log management platform 20. The log management platform 20 then stores the log information received from the container 11 in the log storage unit 21. In addition to information indicating the time at which the application terminated abnormally, the log information includes at least an error log related to the cause of the abnormal termination.

Figure 5 shows an example of log information. In this example, the log information describes information identifying the container 11 in which the error occurred, information identifying the application that terminated abnormally, a port number, an error log related to the cause of the abnormal termination (Reason: CrashLoopBackOff), and the like.

When the container management unit 31 receives status information indicating abnormal termination of an application, it transmits error information to the container setting change unit 32. The error information is generated based on the status information received from the container 11 and includes information identifying the container in which the error occurred and information indicating the time at which the error occurred. The container in which the error occurred is identified by the container management unit 31 detecting the source of the status information.

When the container setting change unit 32 receives the error information from the container management unit 31, it uses the error information to look up and search the log information stored in the log storage unit 21. Specifically, the container setting change unit 32 searches for log information related to the container in which the error occurred, within a time window close to the time at which the error occurred. The container setting change unit 32 thereby obtains the log information related to the abnormal termination of the application.
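The search by container and by time window could look like the sketch below. The `LogRecord` type, the function name, and the five-minute default window are assumptions chosen for illustration; the publication does not specify how close to the error time a log entry must be.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LogRecord:
    """Hypothetical log entry held by the log storage unit."""
    container_id: str
    timestamp: datetime
    message: str


def find_logs_near_error(
    logs: list[LogRecord],
    container_id: str,
    error_time: datetime,
    window: timedelta = timedelta(minutes=5),  # assumed width of the time window
) -> list[LogRecord]:
    """Return the log records of one container within `window` of the error time."""
    return [
        record
        for record in logs
        if record.container_id == container_id
        and abs(record.timestamp - error_time) <= window
    ]
```

Filtering on both keys keeps entries from other containers, and older entries from the same container, out of the later cause analysis.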

Next, the container setting change unit 32 decides on a countermeasure for the abnormal termination of the application based on the obtained log information. The container setting change unit 32 holds setting change management information in which each error log that may appear in the log information is associated with a countermeasure for resolving the corresponding error. The container setting change unit 32 uses this setting change management information to decide on the countermeasure for the abnormal termination of the application.

Figure 6 shows an example of the setting change management information. In this example, the setting change management information describes, for each error log, the cause of the error and a countermeasure. The countermeasure represents a method for resolving the error that occurred; specifically, it represents how the setting information of the container should be changed to resolve the error. For example, the information states that the cause of "OOM (Out of Memory) Killed" is "insufficient memory required to run the container" and that the countermeasure is "increase the memory capacity allocated to the container by 1 GB". The setting change management information is assumed to have been created in advance by, for example, a user or administrator of the container system 1.

The container setting change unit 32 can therefore decide on a countermeasure for the abnormal termination of the application by searching the log information obtained using the error information. For example, when the error log "OOMKilled" is found in the obtained log information, the container setting change unit 32 decides that the countermeasure for resolving the error is to "increase the memory capacity allocated to the container by 1 GB".
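The lookup from error log to countermeasure can be sketched as a table search. The table layout, key names, and function name below are hypothetical; the single entry reproduces the one example given in the text, and any further entries would have to be supplied by the user or administrator.

```python
from typing import Optional

# Hypothetical in-memory form of the setting change management information.
# Only the OOMKilled entry comes from the description; the layout is assumed.
SETTING_CHANGE_TABLE = [
    {
        "error_log": "OOMKilled",
        "cause": "insufficient memory required to run the container",
        "countermeasure": "increase the memory capacity allocated to the container by 1 GB",
    },
]


def decide_countermeasure(
    log_messages: list[str],
    table: list[dict] = SETTING_CHANGE_TABLE,
) -> Optional[dict]:
    """Return the first table entry whose error log appears in the messages.

    Returns None when no known error log is found, in which case no
    automatic setting change can be decided.
    """
    for message in log_messages:
        for entry in table:
            if entry["error_log"] in message:
                return entry
    return None
```

Returning `None` for unknown errors leaves room for a fallback such as notifying an administrator, which the sketch deliberately does not implement.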

The container management unit 31 also transmits the setting information of the container 11 in which the error occurred to the container setting change unit 32. In the example shown in Figure 4, the error information and the setting information are transmitted separately, but embodiments of the present invention are not limited to this procedure. That is, the container management unit 31 may transmit the error information related to the error and the setting information of the container 11 in which the error occurred to the container setting change unit 32 together.

The container setting change unit 32 changes the contents of the setting information received from the container management unit 31 according to the countermeasure decided using the log information. The container setting change unit 32 then transmits the changed setting information to the container management unit 31.

Figures 7 and 8 show an example of the setting information before and after a change. Specifically, Figure 7 shows an example of the setting information before the change, which is transmitted from the container management unit 31 to the container setting change unit 32. Figure 8 shows an example of the setting information after the change, which is transmitted from the container setting change unit 32 to the container management unit 31.

As shown in Figure 7, the setting information includes, as information that can be changed by the container setting change unit 32, information related to user permissions, information related to the image, and information related to memory/CPU. The information related to user permissions represents, for example, the commands that a user of the container system 1 can use. The information related to the image represents the application image to be executed on the container 11. The information related to memory/CPU represents the memory capacity allocated to the container 11 and the number of CPU cores allocated to the container 11.

In this example, it is assumed that the error log "OOMKilled" is found in the log information corresponding to the container 11 in which the error occurred. As described above, the countermeasure corresponding to this error log is to "increase the memory capacity allocated to the container by 1 GB". In this case, therefore, the container setting change unit 32 changes the memory-related description in the setting information. Specifically, as shown in Figure 8, the upper limit of the memory is rewritten from "200Mi" to "1200Mi". The container setting change unit 32 then transmits the changed setting information to the container management unit 31. At this time, the container setting change unit 32 may transmit a container execution command to the container management unit 31 together with the changed setting information.
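The rewrite of the memory upper limit from 200Mi to 1200Mi can be reproduced with a small helper. This sketch assumes that the "1 GB" increase is applied as 1000 Mi, which is what the before and after values in the text imply; the function name and the restriction to the "Mi" suffix are also assumptions.

```python
def increase_memory_limit(limit: str, increase_mi: int = 1000) -> str:
    """Return a memory limit string raised by `increase_mi` mebibytes.

    Assumes the limit is written with an "Mi" suffix, as in the example
    ("200Mi"). The default of 1000 matches the 200Mi -> 1200Mi example.
    """
    value = limit.strip()
    if not value.endswith("Mi"):
        raise ValueError(f"unsupported memory unit in {limit!r}")
    current_mi = int(value[:-2])  # numeric part before the "Mi" suffix
    return f"{current_mi + increase_mi}Mi"


new_limit = increase_memory_limit("200Mi")  # evaluates to "1200Mi"
```

Rejecting unknown units instead of guessing keeps the helper from silently writing a wrong limit into the setting information.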

The container management unit 31 provides the container execution unit 12 with the setting information whose contents have been changed by the container setting change unit 32, and instructs it to run the container 11. The container execution unit 12 then redeploys (or restarts) the container 11 according to the changed setting information. The container 11 then operates according to the changed setting information.

In the example shown in Figures 7 and 8, the upper limit of the memory in the setting information of the container 11 is increased from 200Mi to 1200Mi. The container execution unit 12 therefore increases the memory capacity allocated to the container 11 according to this setting information. The container 11 then executes the application with the increased memory capacity. This avoids errors caused by insufficient memory.

In this way, in the container system 1 according to the embodiment of the present invention, when an error occurs in the container 11 and an application terminates abnormally, the cause of the error is identified and a countermeasure for resolving the error is decided. The setting information of the container 11 in which the error occurred is then changed according to the decided countermeasure, and the container 11 is redeployed according to the changed setting information. This avoids a situation in which the same error is repeated in the container 11.

Figure 9 is a flowchart showing an example of the processing of the container management unit 31. In this example, it is assumed that the setting information for the container 11 has already been created.

In S1, the container management unit 31 transmits the setting information of the container 11 to the container execution unit 12. The container execution unit 12 thereby deploys the container 11, and the container 11 executes the application deployed on it.

In S2, the container management unit 31 monitors the status information transmitted from the container 11. When it receives status information indicating that an error, such as abnormal termination of an application, has occurred, the processing of the container management unit 31 proceeds to S3.

In S3, the container management unit 31 transmits error information indicating that an error has occurred in the container 11 to the container setting change unit 32. The error information includes information identifying the container 11 in which the error occurred. In S4, the container management unit 31 transmits the setting information of the container 11 in which the error occurred to the container setting change unit 32. The container management unit 31 may execute S3 and S4 at the same time. The container management unit 31 then waits for a reply from the container setting change unit 32.

In S5, the container management unit 31 receives the setting information of the container 11 whose contents have been changed by the container setting change unit 32. Then, in S6, the container management unit 31 transmits the changed setting information to the container execution unit 12. The container execution unit 12 thereby redeploys the container 11 based on the changed setting information.
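Steps S1 to S6 can be read as a short control flow. The sketch below is one possible reading: the three callables stand in for the container execution unit, the status monitoring, and the container setting change unit, and their names and signatures are hypothetical. S3 and S4 are sent together here, which the description permits.

```python
from typing import Callable


def manage_container(
    settings: dict,
    deploy: Callable[[dict], None],
    wait_for_error: Callable[[], dict],
    request_change: Callable[[dict, dict], dict],
) -> dict:
    """Run one pass of the S1 to S6 flow and return the changed settings."""
    deploy(settings)                                  # S1: send settings, container is deployed
    error_info = wait_for_error()                     # S2: block until an error status arrives
    changed = request_change(error_info, settings)    # S3, S4: send error info and settings;
                                                      # S5: receive the changed settings
    deploy(changed)                                   # S6: send changed settings, redeploy
    return changed
```

Passing the collaborators as callables keeps the sketch independent of any particular container engine.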

FIG. 10 is a flowchart showing an example of the processing of the container setting change unit 32. In S11, the container setting change unit 32 waits for error information to be sent from the container management unit 31. When it receives error information from the container management unit 31, the processing of the container setting change unit 32 proceeds to S12. The container setting change unit 32 is assumed to receive the setting information of the container 11 in which the error occurred either together with the error information or after receiving the error information.

In S12 and S13, the container setting change unit 32 refers to the log information stored in the log storage unit 21 based on the error information. Specifically, the container setting change unit 32 searches for an error log related to the container 11 specified by the error information (i.e., the container 11 in which the error occurred).

In S14, the container setting change unit 32 identifies the cause of the error corresponding to the error log found in S13 and determines a measure to resolve the error. To do so, the container setting change unit 32 refers to setting change management information that has been created in advance (for example, the setting change management information shown in FIG. 6).

In S15, the container setting change unit 32 changes the setting information of the container 11 in which the error occurred in accordance with the measure determined in S14. For example, the container setting change unit 32 changes the setting information so that the upper limit of the memory or CPU resources allocated to the container 11 is increased. Alternatively, the container setting change unit 32 changes the setting information so that a permission is added for the user of the container system 1.

In S16, the container setting change unit 32 transmits the changed setting information to the container management unit 31. The container management unit 31 receives the changed setting information in S5 shown in FIG. 9. The container 11 is therefore redeployed based on the setting information whose contents have been changed by the container setting change unit 32.
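Steps S12 to S16 can be sketched as a lookup against a rule table followed by an edit of the setting information. This is a minimal sketch under stated assumptions: the contents of the setting change management information in FIG. 6 are not reproduced here, so the log patterns, causes, measures, and setting keys in `CHANGE_RULES` are hypothetical, as are all function names and the log entry layout.

```python
import copy

# Hypothetical setting change management information:
# (error-log pattern, cause, measure). Illustrative values only.
CHANGE_RULES = [
    ("OutOfMemory", "memory shortage", ("scale", "memory_limit_mb", 2)),
    ("CPU throttled", "CPU shortage", ("scale", "cpu_limit", 2)),
    ("Permission denied", "missing permission", ("add_permission", "permissions", "write")),
]


def find_error_log(logs, error_info):
    """S12-S13: return the newest error entry for the failed container, or None."""
    matches = [
        entry for entry in logs
        if entry["container_id"] == error_info["container_id"]
        and entry["level"] == "ERROR"
    ]
    return matches[-1] if matches else None


def decide_measure(error_log):
    """S14: map an error log to (cause, measure) using the rule table."""
    for pattern, cause, measure in CHANGE_RULES:
        if pattern in error_log["message"]:
            return cause, measure
    return None, None


def apply_measure(setting, measure):
    """S15: return a changed copy of the setting information."""
    changed = copy.deepcopy(setting)
    kind, key, value = measure
    if kind == "scale":
        # Raise the upper limit of the resource by the given factor.
        changed[key] = changed[key] * value
    elif kind == "add_permission":
        if value not in changed.setdefault(key, []):
            changed[key].append(value)
    return changed


def handle_error(logs, error_info, setting):
    """S12-S16: return the changed setting, or the original if no rule applies."""
    error_log = find_error_log(logs, error_info)
    if error_log is None:
        return setting
    _cause, measure = decide_measure(error_log)
    if measure is None:
        return setting
    return apply_measure(setting, measure)
```

Returning the original setting information unchanged when no rule matches is a design choice of this sketch; the source does not state what happens in that case.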

<Variations>
In the embodiment described above, the container 11 monitors the state of the application, and when an error such as an abnormal termination of the application occurs, the container 11 notifies the container management unit 31 of status information. However, embodiments of the present invention are not limited to this configuration. For example, the container management unit 31 may monitor the state of the container 11. In this case, the container management unit 31 may periodically transmit a survival confirmation signal to each container 11 and determine whether each container 11 is operating normally based on whether a reply is received.
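The periodic survival confirmation in this variation might look like the following sketch. The function names, the `ping` callable, and the 10-second default interval are assumptions made for illustration; the source does not specify the signal format, the interval, or how a missing reply is detected.

```python
import time
from typing import Callable, Dict, Iterable


def check_containers(container_ids: Iterable[str],
                     ping: Callable[[str], bool]) -> Dict[str, bool]:
    """Send one survival confirmation per container; True means a reply arrived."""
    results = {}
    for container_id in container_ids:
        try:
            results[container_id] = bool(ping(container_id))
        except (TimeoutError, ConnectionError):
            # No reply within the allowed time is treated as "not operating normally".
            results[container_id] = False
    return results


def monitor(container_ids, ping, on_failure,
            interval_s=10.0, rounds=1, sleep=time.sleep):
    """Repeat the check `rounds` times, calling on_failure for each silent container."""
    ids = list(container_ids)
    for i in range(rounds):
        for container_id, alive in check_containers(ids, ping).items():
            if not alive:
                on_failure(container_id)
        if i + 1 < rounds:
            sleep(interval_s)
```

Here `on_failure` would be the point at which the container management unit 31 starts the error handling that begins at S3.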

In the embodiment described above, the error information sent from the container management unit 31 to the container setting change unit 32 includes information indicating the time at which the error occurred. However, embodiments of the present invention are not limited to this configuration. For example, upon receiving error information, the container setting change unit 32 may search the log information that has not yet been searched and thereby extract the error log corresponding to the new error. If log information is recorded for each container, it is sufficient to search the unsearched portion of the log information related to the container 11 specified by the error information (i.e., the container 11 in which the error occurred).
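One way to search only the unsearched log information is to keep a per-container position in the log. The sketch below assumes that each container's log is an append-only list of entries with a `level` field; the class name `LogCursor` and the entry layout are hypothetical and do not come from the source.

```python
from collections import defaultdict


class LogCursor:
    """Remembers, per container, how many log entries have already been searched."""

    def __init__(self):
        self._searched = defaultdict(int)  # container id -> number of entries searched

    def new_error_logs(self, container_id, container_logs):
        """Return error entries among the not-yet-searched part of the log."""
        start = self._searched[container_id]
        unsearched = container_logs[start:]
        # Mark everything up to the current end of the log as searched.
        self._searched[container_id] = len(container_logs)
        return [entry for entry in unsearched if entry["level"] == "ERROR"]
```

With this approach the error information does not need to carry the time of the error, because each call only sees entries written since the previous search.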

<Hardware Configuration>
FIG. 11 shows an example of the hardware configuration of the container system 1. The container system 1 is realized by a computer 200 that includes a processor 201, a memory 202, a storage device 203, an input/output device 204, a recording medium reader 205, and a communication interface 206.

The processor 201 includes multiple processor cores and controls the operation of the container system 1 by executing a container management program stored in the storage device 203. The container management program includes program code that describes the procedures of the flowcharts shown in FIGS. 9 and 10. Thus, when the processor 201 executes this program, the functions of the container execution unit 12, the container management unit 31, and the container setting change unit 32 shown in FIG. 1 are provided. In addition, some of the processor cores that make up the processor 201 are allocated to realize the container 11.

The memory 202 is used as a working area for the processor 201. A part of the memory 202 is allocated to realize the container 11. The storage device 203 stores the container management program and other programs. The log storage unit 21 is realized using the storage device 203.

The input/output device 204 includes input devices such as a keyboard, a mouse, a touch panel, and a microphone, and output devices such as a display device and a speaker. The recording medium reader 205 can acquire data and information recorded on a recording medium 210. The recording medium 210 is a removable recording medium that can be attached to and detached from the computer 200. The recording medium 210 is realized by, for example, a semiconductor memory, a medium that records signals optically, or a medium that records signals magnetically. The container management program may be provided to the computer 200 from the recording medium 210. The communication interface 206 provides a function for connecting to a network. When the container management program is stored on a program server 220, the computer 200 may acquire the container management program from the program server 220.

1 Container system
10 Application execution platform
11 (11a to 11n) Container
12 Container execution unit
20 Log management platform
21 Log storage unit
30 Application management platform
31 Container management unit
32 Container setting change unit

Claims

A container system comprising:
a container capable of executing an application;
a container management unit that manages setting information related to the setting of the container;
a container execution unit that operates the container in accordance with the setting information;
a log storage unit that stores log information representing events occurring in the container; and
a container setting change unit that changes the setting information by referring to the log information when an error occurs in the container,
wherein the container execution unit operates the container based on the setting information changed by the container setting change unit.

The container system according to claim 1, wherein the container setting change unit
identifies the cause of the error that occurred in the container by referring to the log information, and
changes the setting information in accordance with the identified cause.

The container system according to claim 1, wherein
the container setting change unit holds setting change management information in which, for each error log that may be recorded in the log information, a measure for resolving the error related to that error log is set in association with the error log, and
the container setting change unit
searches the log information to identify an error log corresponding to the error that occurred in the container,
refers to the setting change management information to determine the measure corresponding to the identified error log, and
changes the setting information in accordance with the determined measure.

The container system according to claim 3, wherein
when an application running on the container terminates abnormally, the container outputs status information indicating that an error has occurred in the container and stores, in the log storage unit, log information including an error log indicating the content of the error that occurred in the container, and
the container setting change unit searches the log information in response to the output of the status information to identify the error log corresponding to the error that occurred in the container.

The container system according to claim 1, wherein
the setting information describes hardware resources to be allocated to the container, and
the container setting change unit changes the setting information so that the amount of hardware resources allocated to the container is increased.