TWI795887B - Method, electronic equipment and storage medium for virtual machine migration
- Publication number
- TWI795887B
- Authority
- TW
- Taiwan
- Prior art keywords
- virtual machine
- node
- state
- migration
- computing
Landscapes
- Electric Double-Layer Capacitors Or The Like (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
The present application relates to the technical field of cloud computing, and in particular to a virtual machine migration method, an electronic device, and a storage medium.
At present, the management of a cloud platform usually relies on a system that monitors physical nodes, and operation and maintenance personnel are required to manage the operation of the overall service. When a physical node becomes abnormal or requires maintenance, the services running on that node are transferred to other physical nodes, and this transfer is usually performed manually by operation and maintenance personnel. In responding to emergencies, manual operation usually introduces a certain time delay, which reduces the efficiency of operation and maintenance.
Take the OpenStack cloud platform as an example. When an abnormality causes a compute node to shut down or restart, or when a compute node failure affects the normal operation of the OpenStack cloud platform, the virtual machines on that compute node need to be migrated to other idle compute nodes. When operation and maintenance personnel find that the OpenStack cloud platform is abnormal, they judge from the monitoring data whether to migrate the virtual machines (Virtual Machine, VM), and use the evacuate command of the OpenStack Nova module to migrate the virtual machines to other compute nodes. Migrating virtual machines with this command requires operation and maintenance personnel to enter the command manually or to call an application programming interface (API), which introduces a certain time delay. When an exception is encountered, additional time is needed to restore the virtual machines.
The present application provides a virtual machine migration method, an electronic device, and a storage medium, so as to improve the efficiency of virtual machine migration.
A virtual machine migration method according to an embodiment of the present application includes: monitoring the state of a compute node; determining whether the compute node satisfies a trigger condition, where the trigger condition includes the offline duration of the compute node reaching a preset duration, or the state of the compute node being unstable; and, if the compute node satisfies the trigger condition, sending a message to a control node to migrate the virtual machine.
In one implementation, the compute node being offline includes: the nova-compute agent service is offline, the number of nova-compute agent services in the offline state is 1, the 10G network is offline, and the 1G network is offline.
In another implementation, the state of the compute node being unstable includes: the nova-compute agent service is offline, the number of nova-compute agent services in the offline state is 1, the 10G network is offline, and the number of compute nodes in the offline state on the 10G network is 1.
In another implementation, the state of the compute node is determined to be unstable when the number of nova-compute agent services in the offline state is 1 and the frequency at which the nova-compute agent service goes offline reaches a first preset frequency, and the number of compute nodes in the offline state on the 10G network is 1 and the frequency at which the 10G network goes offline reaches a second preset frequency.
In another implementation, the virtual machine migration method further includes: calculating and displaying the migration time of the virtual machine.
In another implementation, the virtual machine migration method further includes: storing a migration record of the virtual machine.
A virtual machine migration method according to another embodiment of the present application includes: checking the state of a cloud platform management system; feeding back a virtual machine migration message to a monitoring node; checking whether the state of the virtual machine is consistent with its state before migration; and, if the state of the virtual machine is inconsistent with its state before migration, repairing the virtual machine.
In one implementation, checking the state of the cloud platform management system includes: checking whether an idle compute node exists; checking whether the state of the nova service is normal; and checking whether the state of the power supply is normal.
An electronic device according to another embodiment of the present application includes a communication module, a display screen, a memory, and a processor. The processor runs a computer program or code stored in the memory to implement the virtual machine migration method of the embodiments of the present application.
A storage medium according to another embodiment of the present application is used to store a computer program or code. When the computer program or code is executed by a processor, the virtual machine migration method of the embodiments of the present application is implemented.
In the embodiments of the present application, a monitoring node and trigger conditions are set in the cloud platform management system. When the monitoring data satisfies a trigger condition, the system automatically starts virtual machine migration: it migrates the virtual machines away from the faulty node, brings them back online, and promptly handles unexpected situations encountered during migration. This reduces system downtime and improves the efficiency of virtual machine migration, and thereby the efficiency of operation and maintenance.
100: Cloud platform management system
110: Monitoring node
120: Control node
130: Compute node
111, 121: Processor
112, 122: Memory
113, 123: Communication module
114: Display screen
S101-S106, S201-S210: Steps
FIG. 1 is a schematic structural diagram of a cloud platform management system according to an embodiment of the present application.
FIG. 2 is a flowchart of a virtual machine migration method according to an embodiment of the present application.
FIG. 3 is a schematic structural diagram of a monitoring node according to an embodiment of the present application.
FIG. 4 is a flowchart of a virtual machine migration method according to another embodiment of the present application.
FIG. 5 is a schematic structural diagram of a control node according to an embodiment of the present application.
To make the above objects, features, and advantages of the present application easier to understand, the present application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments can be combined with one another. Many specific details are set forth in the following description to facilitate a full understanding of the present application; the described embodiments are only some of the embodiments of the present application, not all of them.
It should be noted that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that in the flowcharts. The methods disclosed in the embodiments of the present application include one or more steps or actions for implementing the methods. The method steps and/or actions may be interchanged with one another without departing from the scope of the claims.
In the embodiments of the present application, the physical nodes include a monitoring node (Monitor Node) 110, a control node (Control Node) 120, and a compute node (Compute Node) 130. A physical node may be an electronic device, such as a smartphone, a tablet computer, a personal computer (PC), a personal digital assistant (PDA), a router, a workstation, or a server.
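FIG. 1 is a schematic structural diagram of a cloud platform management system 100 according to an embodiment of the present application.
Referring to FIG. 1, the cloud platform management system 100 includes a monitoring node 110, a control node 120, and a compute node 130. The monitoring node 110 is communicatively connected to the compute node 130 and the control node 120, and the control node 120 is communicatively connected to the compute node 130.
In this embodiment, a communication connection may be wired or wireless. A wired connection is made through a wired transmission medium (such as optical fiber or twisted pair). A wireless connection is made through a wireless transmission medium (such as WiFi, Bluetooth, NFC, or a 2G/3G/4G/5G wireless communication network).
The compute node 130 is used to run one or more virtual machines.
The control node 120 is used to control one or more compute nodes 130 and to migrate the virtual machines of one compute node 130 to another compute node 130.
The monitoring node 110 is used to monitor the state of the compute node 130 to determine whether the compute node 130 is abnormal. When the compute node 130 is abnormal, the monitoring node 110 sends a message to the control node 120 so that the virtual machines of the abnormal compute node 130 are migrated to other available compute nodes 130.
In this embodiment, the monitoring node 110 can determine whether the compute node 130 is abnormal by setting a trigger condition. When the state of the compute node 130 satisfies the trigger condition, the monitoring node 110 determines that the compute node 130 is abnormal and sends a message to the control node 120 to migrate the virtual machines. When the state of the compute node 130 does not satisfy the trigger condition, the monitoring node 110 determines that the compute node 130 is normal and continues to monitor its state.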
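In one implementation, the trigger condition includes the offline duration of the compute node 130 reaching a preset duration, or the state of the compute node 130 being unstable. The preset duration ranges from 1 min to 5 min. For example, the preset duration may be 2 min.
The compute node 130 being offline involves the following four conditions: the nova-compute agent service is offline, the number of nova-compute agent services in the offline state is 1, the 10G network is offline, and the 1G network is offline. The 10G network is the network between the control node 120 and the compute node 130, over which virtual machines are migrated. The 1G network is the network between the monitoring node 110 and the compute node 130, over which monitoring signals are transmitted. The nova-compute agent service provides the entry point for managing and configuring virtual machines. It runs on the compute node 130 and is responsible for managing the instances on the compute node 130. When all four conditions are satisfied at the same time, the compute node 130 is determined to be offline.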
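The offline test described above can be sketched as a conjunction of four observations plus a duration check. This is only an illustrative sketch: the `NodeSnapshot` type, its field names, and the way the observations are collected are assumptions and do not come from the patent text; the 120-second default mirrors the 2 min example value.

```python
from dataclasses import dataclass


@dataclass
class NodeSnapshot:
    """One observation of a compute node (all field names are hypothetical)."""
    nova_compute_down: bool   # nova-compute agent service reported offline
    down_agent_count: int     # number of nova-compute agents currently offline
    net_10g_down: bool        # migration network (control node <-> compute node)
    net_1g_down: bool         # monitoring network (monitoring node <-> compute node)


def is_offline(s: NodeSnapshot) -> bool:
    """The node counts as offline only if all four conditions hold together."""
    return (s.nova_compute_down
            and s.down_agent_count == 1
            and s.net_10g_down
            and s.net_1g_down)


def offline_trigger(s: NodeSnapshot, offline_seconds: float,
                    preset_seconds: float = 120.0) -> bool:
    """Trigger once the node has stayed offline for the preset duration."""
    return is_offline(s) and offline_seconds >= preset_seconds
```

Requiring all four conditions at once means that a failure of the monitoring link alone, with the migration network still up, does not trigger a migration in this sketch.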
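The state of the compute node 130 being unstable involves the following four conditions: the nova-compute agent service is offline, the number of nova-compute agent services in the offline state is 1, the 10G network is offline, and the number of compute nodes 130 in the offline state on the 10G network is 1. The state of the compute node 130 is determined to be unstable when the number of nova-compute agent services in the offline state is 1 and the frequency at which the nova-compute agent service goes offline reaches a first preset frequency, and the number of compute nodes 130 in the offline state on the 10G network is 1 and the frequency at which the 10G network goes offline reaches a second preset frequency. The first preset frequency ranges from 8/10min to 14/10min (8 to 14 times per 10 min). For example, the first preset frequency may be 14/10min. The second preset frequency ranges from 4/10min to 7/10min. For example, the second preset frequency may be 7/10min.
Specifically, on the 10G network, the nova-compute agent service sends one data packet every 30 seconds during normal operation to manage the instances on the compute node 130. Over a 10 min period, the nova-compute agent service can therefore send 20 data packets in total. When the packet loss rate reaches 40%-70%, that is, when the first preset frequency is 8/10min-14/10min, the state of the nova-compute agent service is determined to be unstable. The compute node 130 sends one data packet every 60 seconds during normal operation, to reduce interference from broadcast storms. Over a 10 min period, the compute node 130 can therefore send 10 data packets in total. When the packet loss rate reaches 40%-70%, that is, when the second preset frequency is 4/10min-7/10min, the state of the compute node 130 is determined to be unstable. When both the nova-compute agent service and the compute node 130 are determined to be unstable, the monitoring node 110 determines that the compute node 130 is abnormal and sends a message to the control node 120 to migrate the virtual machines.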
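The relationship between the packet intervals, the 40%-70% loss rates, and the two preset frequencies can be checked with a little arithmetic. The sketch below is a hedged illustration: the function names and the idea of counting missed packets per 10 min window are assumptions, and the default thresholds simply reuse the example values 14/10min and 7/10min.

```python
def drop_threshold(period_s: int, interval_s: int, loss_percent: int) -> int:
    """Number of missed packets in one period that corresponds to a loss rate."""
    expected = period_s // interval_s      # packets expected in the period
    return expected * loss_percent // 100  # integer arithmetic avoids float noise


def is_unstable(agent_drops: int, network_drops: int,
                down_agent_count: int = 1, down_node_count: int = 1,
                first_preset: int = 14, second_preset: int = 7) -> bool:
    """Both the agent service and the 10G link must look unstable."""
    agent_unstable = down_agent_count == 1 and agent_drops >= first_preset
    network_unstable = down_node_count == 1 and network_drops >= second_preset
    return agent_unstable and network_unstable
```

With a 600-second window, `drop_threshold(600, 30, 40)` and `drop_threshold(600, 30, 70)` give the bounds of the first preset frequency, and the same calls with a 60-second interval give the bounds of the second.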
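In one implementation, when the monitoring node 110 determines that multiple compute nodes 130 are abnormal, the virtual machines on the abnormal compute nodes 130 are migrated one node at a time, to reduce the chance of migrating virtual machines from one abnormal compute node 130 to another abnormal compute node 130.
For example, when the monitoring node 110 determines that a first compute node 130 and a second compute node 130 are abnormal, it sends a first message to the control node 120 to migrate the virtual machines on the first compute node 130. After that migration is complete, the monitoring node 110 sends a second message to the control node 120 to migrate the virtual machines on the second compute node 130.
In one implementation, the monitoring node 110 can switch to a safe state to stop the migration of virtual machines on other compute nodes 130.
Specifically, when the monitoring node 110 determines that multiple compute nodes 130 are abnormal, it sends a message to the control node 120 to migrate the virtual machines. The control node 120 looks for idle compute nodes 130 and migrates the virtual machines on each abnormal compute node 130 in turn. When the control node 120 starts migrating the virtual machines on one abnormal compute node 130, the monitoring node 110 switches to the safe state and stops the migration of virtual machines on the other abnormal compute nodes 130. After the migration of the virtual machines on that abnormal compute node 130 is confirmed complete, the monitoring node 110 switches back to the working state and continues to monitor the state of the compute nodes 130.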
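One way to picture the one-node-at-a-time behaviour is a small queue guarded by a two-valued state. The following is a minimal sketch under assumptions: the `Monitor` class, its method names, and the callback used to message the control node are all hypothetical and only illustrate the safe-state idea.

```python
from collections import deque
from enum import Enum
from typing import Callable, Deque, Iterable


class MonitorState(Enum):
    WORKING = "working"   # free to request a migration
    SAFE = "safe"         # a migration is in flight; hold further requests


class Monitor:
    """Hypothetical monitoring node that serialises migration requests."""

    def __init__(self, request_migration: Callable[[str], None]) -> None:
        self.state = MonitorState.WORKING
        self.pending: Deque[str] = deque()
        self._request = request_migration  # stands in for messaging the control node

    def report_abnormal(self, nodes: Iterable[str]) -> None:
        """Queue abnormal nodes and dispatch the first one if idle."""
        for node in nodes:
            if node not in self.pending:
                self.pending.append(node)
        self._dispatch()

    def on_migration_done(self, node: str) -> None:
        """Leave the safe state and move on to the next queued node."""
        self.state = MonitorState.WORKING
        self._dispatch()

    def _dispatch(self) -> None:
        if self.state is MonitorState.WORKING and self.pending:
            node = self.pending.popleft()
            self.state = MonitorState.SAFE
            self._request(node)
```

Calling `report_abnormal(["compute01", "compute02"])` in this sketch requests only the first node; the second request is issued when `on_migration_done("compute01")` is called.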
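In one implementation, when the monitoring node 110 sends a message to the control node 120, the monitoring node 110 checks the thread flag (flag) and the state lock (lock) of the control node 120 to reduce thread re-entry. The thread flag indicates whether the control node 120 is running the virtual machine migration program. The state lock indicates whether the control node 120 is in a locked state.
For example, when the thread flag flag=1, the control node 120 is running the virtual machine migration program; the control node 120 feeds back status information to the monitoring node 110 so that no new migration program is started until the current one finishes. When flag=0, the control node 120 is idle and feeds back status information to the monitoring node 110 so that the migration program can be started. When the state lock lock=1, the control node 120 is locked; it stops working and does not respond to messages from the monitoring node 110. When lock=0, the control node 120 is unlocked; it resumes working and feeds back status information to the monitoring node 110.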
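The flag and lock semantics above amount to a small decision table. The sketch below is an assumption-laden illustration: `ControlStatus`, `decide`, and the returned strings are hypothetical names, and a locked control node is modelled as either `lock == 1` or no status at all, since the text says it does not respond.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlStatus:
    flag: int  # 1: a migration program is running, 0: idle
    lock: int  # 1: locked (not responding), 0: unlocked


def decide(status: Optional[ControlStatus]) -> str:
    """Map the control node's flag/lock to the monitoring node's next action."""
    if status is None or status.lock == 1:
        return "no_response"   # locked control node gives no feedback
    if status.flag == 1:
        return "wait"          # do not re-enter a running migration
    return "start"             # idle and unlocked: migration may start
```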
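In one implementation, when the control node 120 migrates virtual machines, it checks the state of the cloud platform management system 100.
Specifically, the control node 120 checks whether an idle compute node 130 exists, checks the state of the nova service and the state of the power supply, and records the state of the virtual machines on the abnormal compute node 130.
The cloud platform management system 100 can divide the storage area of the compute nodes 130 into an idle area and a working area according to the working state of the compute nodes 130. When no compute node 130 is in the idle area, that is, when no idle compute node 130 exists, the control node 120 feeds back the state information of the compute nodes 130 to the monitoring node 110 to stop the migration program. When a compute node 130 is in the idle area, that is, when an idle compute node 130 exists, the control node 120 feeds back the state information of the compute nodes 130 to the monitoring node 110 to start the migration program.
The nova service is responsible for maintaining and managing the computing resources of the cloud platform management system 100. The state of the nova service is either normal or abnormal. When the nova service is in the normal state, the control node 120 feeds back the state information of the nova service to the monitoring node 110 to start the migration program. When the nova service is in the abnormal state, the control node 120 feeds back the state information of the nova service to the monitoring node 110 to stop the migration program.
The control node 120 can check the state of the power supply through a baseboard management controller (BMC). The BMC can perform operations such as upgrading firmware and inspecting the machine's devices while the machine is not powered on. The state of the power supply is on, off, or standby. When the power supply is on, the control node 120 feeds back the power state information to the monitoring node 110 to start the migration program. When the power supply is off or in standby, the cloud platform management system 100 is offline, and the monitoring node 110 cannot receive feedback from the control node 120.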
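The three pre-migration checks can be summarised as a single gate that either allows the migration program to start or reports why it should stop. This is a hedged sketch: `PlatformStatus`, its fields, and the reason strings are hypothetical, and how the idle list, the nova service state, and the BMC power reading are actually obtained is left out.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PlatformStatus:
    """Hypothetical summary of what the control node has observed."""
    idle_nodes: List[str] = field(default_factory=list)  # nodes in the idle area
    nova_ok: bool = True                                  # nova service normal?
    power: str = "on"                                     # "on", "off" or "standby"


def precheck(p: PlatformStatus) -> Tuple[bool, str]:
    """Return (may_start, reason), checking idle node, nova service, then power."""
    if not p.idle_nodes:
        return False, "no idle compute node"
    if not p.nova_ok:
        return False, "nova service abnormal"
    if p.power != "on":
        return False, "power supply is " + p.power
    return True, "start migration"
```

Returning the reason alongside the boolean keeps the feedback that goes to the monitoring node in one place, which is a design choice of this sketch rather than something the text prescribes.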
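The control node 120 records the state of the virtual machines on the compute node 130 in real time. The state of a virtual machine is either running (Active) or faulty (Error). After the migration is complete, the control node 120 checks whether the state of the virtual machine is consistent with its state before migration, to determine whether a repair program needs to be run on the virtual machine, and feeds back a message about the virtual machine to the monitoring node 110. When the state after migration is inconsistent with the state before migration, and the virtual machine cannot return to its pre-migration state after a certain period or after being restarted, the control node 120 runs the repair program to repair the virtual machine. When the state after migration is consistent with the state before migration, the control node 120 feeds back a migration-success message to the monitoring node 110.
For example, suppose the virtual machine is in the running state before migration and in the faulty state after migration. If the virtual machine returns to the running state within 5 min or after a restart, the compute node 130 feeds back a migration-success message to the control node 120, and the control node 120 in turn feeds back a migration-success message to the monitoring node 110. The control node 120 can run a restart program to restart the virtual machine. If the virtual machine is still in the faulty state after more than 5 min or after a restart, the compute node 130 feeds back a migration-failure message to the control node 120; the control node 120 in turn feeds back a migration-failure message to the monitoring node 110 and runs the repair program to repair the virtual machine.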
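The post-migration check can be sketched as a wait, restart, repair sequence. The ordering used here (poll for up to 5 min, then restart once, then repair) is an assumption made for illustration; the text only states that recovery within 5 min or after a restart counts as success. The function name and all callables are hypothetical stand-ins for the real state query, restart program, and repair program.

```python
import time
from typing import Callable


def verify_after_migration(before: str,
                           read_state: Callable[[], str],
                           restart: Callable[[], None],
                           repair: Callable[[], None],
                           timeout_s: int = 300,
                           poll_s: int = 5,
                           sleep: Callable[[float], None] = time.sleep) -> str:
    """Return "success" if the VM regains its pre-migration state, else "failed"."""
    for _ in range(timeout_s // poll_s):
        if read_state() == before:   # e.g. back to "ACTIVE"
            return "success"
        sleep(poll_s)                # give the VM time to recover on its own
    restart()                        # last attempt before repairing
    if read_state() == before:
        return "success"
    repair()                         # restart did not help: run the repair program
    return "failed"
```

Passing `sleep` as a parameter lets the logic be exercised without real waiting.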
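In one implementation, after the migration is complete, the monitoring node 110 displays the migration time of the virtual machines on the display screen. The migration time includes a check time, an instance time, and an execution time. The check time is the time the control node 120 spends checking the state of the cloud platform management system 100 before migration; it ranges from 1 min to 5 min and may be, for example, 2 min. The instance time is the time the control node 120 spends calling instances before migration; it depends on the number of instances and on the instance-call time slot, which ranges from 1 s to 5 s. For example, a time slot of 5 s means that one instance is called every 5 s. The execution time is the time consumed by the migration itself; it ranges from 30 s to 120 s and may be, for example, 60 s.
For example, with a check time t1 = 120 s, an instance-call time slot t0 = 5 s, n = 30 instances, and an execution time t2 = 60 s, the monitoring node 110 calculates the migration time as T = t1 + t0 × n + t2 = 120 + 150 + 60 = 330 s.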
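The migration-time estimate is a direct sum, so it is easy to reproduce. The sketch below is illustrative only: the function name is hypothetical, and the range checks simply encode the value ranges stated in the text (1-5 min, 1-5 s, 30-120 s).

```python
def migration_time(check_s: int, slot_s: int, instances: int, execute_s: int) -> int:
    """T = t1 + t0 * n + t2, with all times in seconds."""
    if not 60 <= check_s <= 300:
        raise ValueError("check time must be between 1 min and 5 min")
    if not 1 <= slot_s <= 5:
        raise ValueError("instance-call time slot must be between 1 s and 5 s")
    if not 30 <= execute_s <= 120:
        raise ValueError("execution time must be between 30 s and 120 s")
    if instances < 0:
        raise ValueError("number of instances cannot be negative")
    return check_s + slot_s * instances + execute_s
```

Calling `migration_time(120, 5, 30, 60)` reproduces the 330 s of the worked example.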
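In one implementation, the monitoring node 110 can store a work record of the virtual machine migration. For example, the monitoring node 110 can write all or part of the migration data to an evacuation_computeXX.log file, giving operation and maintenance personnel a useful data reference for improving system performance.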
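A record like this could be kept as one JSON object per line. The sketch below rests on several assumptions that are not in the text: that the `XX` in `evacuation_computeXX.log` stands for a two-digit compute node number, that JSON lines are an acceptable record format, and that the record fields shown are useful ones. The function name is hypothetical.

```python
import json
import os
import tempfile
from datetime import datetime, timezone


def record_migration(directory: str, node_number: int, record: dict) -> str:
    """Append one migration record to the per-node log and return its path."""
    # Assumption: "XX" in the file name is a two-digit compute node number.
    path = os.path.join(directory, f"evacuation_compute{node_number:02d}.log")
    entry = {"time": datetime.now(timezone.utc).isoformat(), **record}
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return path


if __name__ == "__main__":
    # Write a sample record into a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        print(record_migration(tmp, 3, {"result": "success", "seconds": 330}))
```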
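In this embodiment, the cloud platform management system 100 monitors and controls the virtual machine migration process in real time through the exchange of information among the monitoring node 110, the control node 120, and the compute node 130. When the monitoring data satisfies the trigger condition, the cloud platform management system 100 automatically starts virtual machine migration: it migrates the virtual machines away from the faulty node, brings them back online, and promptly handles exceptions encountered during migration. This reduces system downtime and improves the efficiency of virtual machine migration, and thereby the efficiency of operation and maintenance. After the migration is complete, operation and maintenance personnel can review the work records of the cloud platform management system 100 at any time to evaluate its effectiveness and reliability.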
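FIG. 2 is a flowchart of a virtual machine migration method according to an embodiment of the present application.
The virtual machine migration method can be applied to the monitoring node 110 of the cloud platform management system 100. Referring to FIG. 2, the virtual machine migration method includes:
S101, the monitoring node 110 monitors the state of the compute node 130.
The state of the compute node 130 is either normal or abnormal. Abnormal states include the compute node 130 being offline or down. When an emergency occurs on the compute node 130, such as going offline or going down, the monitoring node 110 captures the event and determines whether the compute node 130 satisfies the trigger condition.
In one implementation, operation and maintenance personnel can select one or more monitoring nodes 110 to monitor the state of the compute node 130.
S102, the monitoring node 110 determines whether the compute node 130 satisfies the trigger condition. If the compute node 130 satisfies the trigger condition, step S103 is performed. If the compute node 130 does not satisfy the trigger condition, the method returns to step S101.
In one implementation, the trigger condition includes the offline duration of the compute node 130 reaching a preset duration, or the state of the compute node 130 being unstable. The preset duration ranges from 1 min to 5 min.
The compute node 130 being offline involves the following four conditions: the nova-compute agent service is offline, the number of nova-compute agent services in the offline state is 1, the 10G network is offline, and the 1G network is offline. When all four conditions are satisfied at the same time, the compute node 130 is determined to be offline.
The state of the compute node 130 being unstable involves the following four conditions: the nova-compute agent service is offline, the number of nova-compute agent services in the offline state is 1, the 10G network is offline, and the number of compute nodes 130 in the offline state on the 10G network is 1. The state of the compute node 130 is determined to be unstable when the number of nova-compute agent services in the offline state is 1 and the frequency at which the nova-compute agent service goes offline reaches a first preset frequency, and the number of compute nodes 130 in the offline state on the 10G network is 1 and the frequency at which the 10G network goes offline reaches a second preset frequency. The first preset frequency ranges from 8/10min to 14/10min. The second preset frequency ranges from 4/10min to 7/10min.
When the compute node 130 satisfies the trigger condition, the monitoring node 110 sends a message to the control node 120 so that the virtual machines of the abnormal compute node 130 are migrated to other available compute nodes 130.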
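S103, the monitoring node 110 sends a message to the control node 120 to migrate the virtual machines.
In one implementation, when the monitoring node 110 determines that multiple compute nodes 130 are abnormal, the virtual machines on the abnormal compute nodes 130 are migrated one node at a time, to reduce the chance of migrating virtual machines from one abnormal compute node 130 to another abnormal compute node 130.
In one implementation, when the monitoring node 110 sends a message to the control node 120, the monitoring node 110 checks the thread flag (flag) and the state lock (lock) of the control node 120 to reduce thread re-entry.
S104, the monitoring node 110 switches to the safe state to stop the migration of virtual machines on other compute nodes 130.
When the control node 120 starts migrating the virtual machines on one abnormal compute node 130, the monitoring node 110 switches to the safe state and stops the migration of virtual machines on the other abnormal compute nodes 130. After the migration of the virtual machines on that abnormal compute node 130 is confirmed complete, the monitoring node 110 switches back to the working state and continues to monitor the state of the compute nodes 130.
S105, the monitoring node 110 displays the migration time of the virtual machines on the display screen.
The migration time of the virtual machines includes the check time, the instance time, and the execution time. The check time ranges from 1 min to 5 min. The instance time depends on the number of instances and on the instance-call time slot, which ranges from 1 s to 5 s. The execution time ranges from 30 s to 120 s.
The monitoring node 110 calculates and displays the migration time of the virtual machines. Operation and maintenance personnel can schedule the start time of other threads according to the migration time, to improve the operating efficiency of the system.
S106, the monitoring node 110 stores the migration record of the virtual machines.
In one implementation, the monitoring node 110 can write all or part of the migration data to an evacuation_computeXX.log file, giving operation and maintenance personnel a useful data reference for improving system performance.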
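The order in which steps S101 to S106 are visited can be traced with a short generator. This is a simplified sketch under assumptions: the function name is hypothetical, each observation is treated as one monitoring pass, and the steps only yield their labels instead of doing real work.

```python
from typing import Callable, Iterable, Iterator


def monitor_steps(observations: Iterable[object],
                  triggered: Callable[[object], bool]) -> Iterator[str]:
    """Yield the step labels visited for a sequence of node observations."""
    for obs in observations:
        yield "S101"                 # monitor the compute node
        yield "S102"                 # evaluate the trigger condition
        if triggered(obs):
            yield "S103"             # message the control node
            yield "S104"             # hold other migrations (safe state)
            yield "S105"             # display the migration time
            yield "S106"             # store the migration record
        # otherwise fall through and return to S101 on the next observation
```

For example, `list(monitor_steps([False, True], bool))` walks S101 and S102 once without triggering, then visits all six steps on the second pass.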
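FIG. 3 is a schematic structural diagram of a monitoring node 110 according to an embodiment of the present application.
Referring to FIG. 3, the monitoring node 110 may include a processor 111, a memory 112, a communication module 113, and a display screen 114. The processor 111 is electrically connected to the other components. The processor 111 can implement the virtual machine migration method of the embodiments of the present application by running a computer program or code stored in the memory 112.
The processor 111 may include one or more processing units. For example, the processor 111 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.
The memory 112 may include an external memory interface and an internal memory. The external memory interface can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the monitoring node 110. The external memory card communicates with the processor 111 through the external memory interface to provide data storage. The internal memory can be used to store computer-executable program code, which includes instructions. The internal memory may include a program storage area and a data storage area. The program storage area can store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function). The data storage area can store data created during use of the monitoring node 110 (such as audio data or a phone book). In addition, the internal memory may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS). The processor 111 performs the various functional applications and data processing of the monitoring node 110 by running instructions stored in the internal memory and/or instructions stored in a memory provided in the processor 111.
The communication module 113 may include a mobile communication module and a wireless communication module. The mobile communication module can provide wireless communication solutions applied to the monitoring node 110, including 2G/3G/4G/5G. The wireless communication module can provide wireless communication solutions applied to the monitoring node 110, including wireless local area network (WLAN) (such as a Wireless Fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR).
The display screen 114 is used to display images, video, and the like. The display screen 114 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Mini-LED, Micro-LED, Micro-OLED, quantum dot light-emitting diodes (QLED), or the like. In some embodiments, the monitoring node 110 may include 1 or N display screens 114, where N is a positive integer greater than 1.
It can be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the monitoring node 110. In other embodiments of the present application, the monitoring node 110 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently.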
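FIG. 4 is a flowchart of a virtual machine migration method according to another embodiment of the present application.
The virtual machine migration method can be applied to the control node 120 of the cloud platform management system 100. Referring to FIG. 4, the virtual machine migration method includes:
S201, the control node 120 checks whether an idle compute node 130 exists. If an idle compute node 130 exists, step S202 is performed. If no idle compute node 130 exists, step S205 is performed.
In one implementation, the cloud platform management system 100 can divide the storage area of the compute nodes 130 into an idle area and a working area according to the working state of the compute nodes 130. When a compute node 130 is in the idle area, that is, when an idle compute node 130 exists, the control node 120 feeds back the state information of the compute nodes 130 to the monitoring node 110 to start the migration program. When no compute node 130 is in the idle area, that is, when no idle compute node 130 exists, the control node 120 feeds back the state information of the compute nodes 130 to the monitoring node 110 to stop the migration program.
S202, the control node 120 checks whether the state of the nova service is normal. If the state of the nova service is normal, step S203 is performed. If the state of the nova service is abnormal, step S205 is performed.
After the control node 120 identifies an idle compute node 130, it can establish a migration path from the source host to the target host, and then check whether the nova service of the cloud platform management system 100 is in a normal state. The source host is the compute node 130 whose virtual machines are to be migrated. The target host is the compute node 130 that is to receive the virtual machines.
When the nova service is in the normal state, the control node 120 feeds back the state information of the nova service to the monitoring node 110 to start the migration program. When the nova service is in the abnormal state, the control node 120 feeds back the state information of the nova service to the monitoring node 110 to stop the migration program.
S203, the control node 120 checks whether the state of the power supply is normal. If the state of the power supply is normal, step S204 is performed. If the state of the power supply is abnormal, step S205 is performed.
In one implementation, the control node 120 can check the state of the power supply of the cloud platform management system 100 through the baseboard management controller. The state of the power supply is on, off, or standby. When the power supply is on, the control node 120 feeds back the power state information to the monitoring node 110 to start the migration program. When the power supply is off or in standby, the cloud platform management system 100 is offline, and the monitoring node 110 cannot receive feedback from the control node 120.
S204, the control node 120 starts the migration program.
When the control node 120 migrates virtual machines, it checks the state of the cloud platform management system 100, which includes steps S201-S203 above. When the state of the cloud platform management system 100 is normal, the control node 120 starts the migration program.
S205, the control node 120 stops the migration program.
When the state of the cloud platform management system 100 is abnormal, the control node 120 stops the migration program.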
S206, the control node 120 checks whether the state of the virtual machine is consistent with its state before migration. If the state of the virtual machine is consistent with its state before migration, step S207 is performed. If the state of the virtual machine is inconsistent with its state before migration, step S208 is performed.
The control node 120 can record the state of the virtual machines on the compute node 130 in real time. The state of a virtual machine is either running or faulty.
In one implementation, after the migration is complete, the control node 120 checks whether the state of the virtual machine is consistent with its state before migration.
S207, the control node 120 feeds back a migration-success message to the monitoring node 110.
When the state of the virtual machine after migration is consistent with its state before migration, the control node 120 feeds back a migration-success message to the monitoring node 110.
S208, the control node 120 determines whether the virtual machine needs to be repaired. If the virtual machine does not need to be repaired, step S209 is performed. If the virtual machine needs to be repaired, step S210 is performed.
When the state of the virtual machine after migration is inconsistent with its state before migration, the control node 120 determines whether the virtual machine needs to be repaired.
It can be understood that when the state of the virtual machine after migration is inconsistent with its state before migration, there are two cases. In the first case, the virtual machine goes down for a certain period and can automatically return to its pre-migration state after some time or after a restart. In the second case, the virtual machine has failed and cannot return to its pre-migration state unless it is repaired.
S209, the control node 120 restarts the virtual machine.
In the first case, the repair program does not need to be run; restarting the virtual machine or waiting for it to recover automatically is sufficient. The control node 120 can run a restart program to restart the virtual machine.
In one implementation, if the state of the virtual machine still cannot be restored after the restart, the control node 120 feeds back a migration-failure message to the monitoring node 110 and performs step S210.
S210, the control node 120 repairs the virtual machine.
In the second case, the repair program needs to be run to repair the faulty virtual machine.
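The branching in steps S201 to S210 can be written down as a lookup table from (step, outcome) to the next step. The sketch below is an illustration, not the patented implementation: the table and `walk` function are hypothetical, and the S204 to S206 transition is an assumption drawn from the statement that the state check happens after the migration is complete.

```python
from typing import Dict, List, Tuple

# (step, outcome) -> next step; outcome True means "yes" or "check passed".
TRANSITIONS: Dict[Tuple[str, bool], str] = {
    ("S201", True): "S202",   # idle compute node exists
    ("S201", False): "S205",
    ("S202", True): "S203",   # nova service normal
    ("S202", False): "S205",
    ("S203", True): "S204",   # power supply normal
    ("S203", False): "S205",
    ("S204", True): "S206",   # assumption: state check follows the migration
    ("S206", True): "S207",   # state unchanged: report success
    ("S206", False): "S208",
    ("S208", True): "S210",   # repair needed
    ("S208", False): "S209",  # restart is enough
    ("S209", False): "S210",  # restart did not restore the state
}


def walk(outcomes: Dict[str, bool], start: str = "S201") -> List[str]:
    """Follow the table from `start`; steps without a stated outcome count as True."""
    path = [start]
    step = start
    while (step, outcomes.get(step, True)) in TRANSITIONS:
        step = TRANSITIONS[(step, outcomes.get(step, True))]
        path.append(step)
    return path
```

With no overrides, `walk({})` follows the all-checks-pass path ending at S207, while `walk({"S202": False})` stops at S205.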
FIG. 5 is a schematic structural diagram of a control node 120 according to an embodiment of the present application.
Referring to FIG. 5, the control node 120 may include a processor 121, a memory 122, and a communication module 123. The processor 121 can implement the virtual machine migration method of the embodiments of the present application by running a computer program or code stored in the memory 122.
The processor 121 includes a baseboard management controller (BMC) and a general-purpose processor. The control node 120 can check the state of the power supply through the baseboard management controller. For the general-purpose processor, see the description of the processor 111; for the memory 122, see the description of the memory 112; for the communication module 123, see the description of the communication module 113. These are not repeated here.
It can be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the control node 120. In other embodiments of the present application, the control node 120 may also include more or fewer components than shown, or arrange the components differently.
An embodiment of the present application also provides a storage medium for storing a computer program or code. When the computer program or code is executed by a processor, the virtual machine migration method of the embodiments of the present application is implemented.
Storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information (such as computer-readable instructions, data structures, program modules, or other data). Storage media include, but are not limited to, random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
The embodiments of the present application have been described in detail above with reference to the accompanying drawings, but the present application is not limited to these embodiments. Within the scope of knowledge possessed by a person of ordinary skill in the art, various changes can also be made without departing from the spirit of the present application.
S101-S106: Steps
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW110131545A TWI795887B (en) | 2021-08-25 | 2021-08-25 | Method, electronic equipment and storage medium for virtual machine migration |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| TW202309742A (en) | 2023-03-01 |
| TWI795887B (en) | 2023-03-11 |
Family
ID=86690765
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW110131545A TWI795887B (en) | 2021-08-25 | 2021-08-25 | Method, electronic equipment and storage medium for virtual machine migration |
Country Status (1)
| Country | Link |
|---|---|
| TW (1) | TWI795887B (en) |
Also Published As
| Publication number | Publication date |
|---|---|
| TW202309742A (en) | 2023-03-01 |