WO2018193608A1 - Storage system, control method for storage device, and storage control device - Google Patents
Storage system, control method for storage device, and storage control device
- Publication number
- WO2018193608A1 (application PCT/JP2017/015988)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- power consumption
- restriction
- drive
- semiconductor drive
- nonvolatile semiconductor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/28—Supervision thereof, e.g. detecting power-supply failure by out of limits supervision
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the present invention relates to a flash drive, a storage system including the flash drive, and a control method thereof.
- Flash drives and SSDs (Solid State Drives) using NAND Flash Memory are becoming cheaper and larger in capacity due to higher integration of NAND Flash Memory.
- HDD: Hard Disk Drive
- AFA: All Flash Array
- NAND Flash Memory has a feature that it consumes a large amount of power when performing Write.
- power consumption and heat generation associated with the power increase.
- the SSD has a feature of performing autonomous data movement (for example, garbage collection) within a NAND Flash Memory chip or between chips by a drive controller as an internal process. For this reason, the amount of writing to the NAND Flash Memory chip inside the SSD, that is, the power consumption cannot be controlled only by limiting the write amount by the storage controller of the AFA.
- An AFA is subject to power restrictions arising from the installation environment (power supply and air conditioning equipment), the power supply units that can be mounted, and the cooling performance available to discharge the heat generated by power consumption. Conventionally, therefore, the power restrictions of the AFA have been observed by providing each SSD mounted in the AFA with a power limiting mechanism as in Patent Document 1 and applying a uniform power limit to all SSDs in accordance with the power budget of the device in which they are mounted.
- Flash drives such as SSD can improve performance by increasing simultaneous access to NAND Flash Memory chip during Write.
- However, because power consumption increases with the number of simultaneous accesses, it is necessary to determine an upper limit on SSD power according to the storage system in which the SSD is mounted and to limit write performance according to that upper limit.
- Because the power constraint on the SSD and the write processing of the storage controller are not linked, write performance is uniformly capped by the power upper limit set on each SSD, even when the AFA as a whole has a power margin. As a result, the performance of the entire AFA is limited by the SSD on which writes are concentrated, even though power headroom remains.
- The present invention is a storage system including a storage device that has a plurality of nonvolatile semiconductor drives, and a control device that includes a processor and a memory and controls the storage device. The processor monitors the write load of the nonvolatile semiconductor drives. When the processor detects a nonvolatile semiconductor drive whose write load exceeds a predetermined threshold, it selects that drive as a power consumption restriction relaxation target. The processor also selects a nonvolatile semiconductor drive that can tolerate a reduction in power consumption as a power consumption restriction strengthening target. After reducing the power consumption of the restriction strengthening target drive, the processor increases the power consumption of the restriction relaxation target drive.
- FIG. 6 is a diagram illustrating the physical structure of a storage system and the configuration of a parity group according to an embodiment of this invention.
- FIG. 7 is a flowchart illustrating an example of processing performed by the storage controller according to the embodiment of this invention. A further figure shows an example of an enclosure cooling state table according to the embodiment.
- Further figures show, for the embodiment, the physical structure of the storage system and the configuration of a parity group when a flash drive failure occurs, an example of the drive operation mode state table at the time of a flash drive failure, and an example of the drive operation mode state table during RAID reconstruction.
- FIG. 1 is a block diagram showing an example of the storage system of the present invention.
- FIG. 1 shows a configuration of a storage system in which a host 100 and a storage system 101 are connected.
- the host 100 executes the application and issues commands such as Read and Write to the storage system 101.
- the host 100 is a computer including an interface for connecting to the storage system 101, a CPU (Central Processing Unit), a memory, and the like.
- the storage system 101 includes a storage controller 102 and a drive enclosure 103.
- the storage controller 102 has one or more controller units 104 mounted therein.
- the controller unit 104 includes a front-end interface (FE I / F in the figure) 110, a CPU 111, a memory 112, a back-end interface (BE I / F in the figure) 113, and a control I / F 114.
- FE I/F: front-end interface
- CPU 111: central processing unit
- BE I/F: back-end interface
- For the control I/F 114, a protocol such as FC (Fibre Channel) or IP can be adopted.
- FC: Fibre Channel
- IP: Internet Protocol
- SAS: Serial Attached SCSI
- NVMe: Non-Volatile Memory Express (Non-Volatile Memory Host Controller Interface)
- The drive enclosure 103 includes enclosure interfaces (ENC I/F in the figure) 120-1 and 120-2, power supply and fan units 121-1 and 121-2, and a plurality of flash drives (nonvolatile semiconductor drives) 122.
- In this example, the controller units 104-1 and 104-2, the enclosure interfaces (ENC I/F) 120-1 and 120-2, and the power supply and fan units 121-1 and 121-2 are each provided in pairs, but other configurations are possible.
- a plurality of drive enclosures 103 can be connected to one storage controller 102.
- the front-end interface (FE I / F) 110 of the storage controller 102 is connected to the host 100 via a network (not shown) and mediates communication with the host 100.
- the CPU 111 executes various processes using each program, management information, etc. stored in the memory 112.
- the CPUs 111 mounted on each controller unit 104 can communicate with each other to ensure data consistency and perform processing and data synchronization.
- the memory 112 stores a program executed by the CPU 111, management information used by the program, and the like.
- the memory 112 may be used for other purposes, such as storing a cache of data accessed by the host 100.
- The memory 112 can be composed of SDRAM (Synchronous Dynamic Random Access Memory), but may be composed of other memory elements.
- The back-end interface 113 of each controller unit 104 is connected to the flash drives 122 via the enclosure interfaces (ENC I/F) 120-1 and 120-2 and mediates access from the CPU 111 to the flash drives 122.
- control I / F 114 of each controller unit 104 is connected to the power supply and fan unit 121 mounted on the drive enclosure 103 via enclosure interfaces 120-1 and 120-2, and mediates control of fan cooling performance.
- the enclosure interface 120 mediates communication between the controller unit 104 and the power supply / fan unit 121 and the flash drive 122.
- If the communication performed by the control I/F 114 can be mediated by the back-end interface 113, the control I/F 114 and its separate communication path need not be provided.
- the enclosure interface 120 is unnecessary in the configuration in which the back-end interface 113 of the controller unit 104 and the flash drive 122 can be directly connected.
- the case where the storage controller 102 and the drive enclosure 103 are mounted as a single enclosure corresponds to the above case.
- the power supply and fan unit 121 supplies power to the entire drive enclosure 103 and performs cooling to prevent temperature rise of the flash drive 122 by blowing air with an air cooling fan (not shown).
- the air cooling fan included in the power supply and fan unit 121 functions as a cooling device.
- the rotation speed of the air cooling fan can be controlled from the controller unit 104 via the control I / F 114 and the enclosure interface 120. It is assumed that a predetermined cooling performance equal to or less than the maximum cooling performance can be obtained by changing the rotation speed of the air cooling fan.
- It is also possible to implement stepwise control of the rotation speed of the air cooling fan.
- the power supply and the fan unit 121 are not necessarily integrated, and may be separately mounted.
- an example in which cooling is performed by an air cooling fan has been described.
- any cooling device capable of controlling the cooling performance may be used.
- a water cooling type cooling device may be employed.
- the flash drive 122 is a device in which a NAND flash memory is mounted as a nonvolatile storage area.
- the flash drive 122 may use a highly reliable technology such as RAID (Redundant Arrays of Independent Disks) by grouping a plurality of flash drives into units called parity groups.
- FIG. 2 is a block diagram showing programs and information stored in the memory 112 of the controller unit 104.
- The memory 112 stores a storage control program 150 for controlling the flash drives 122 of the drive enclosure 103 as a storage device, a power performance optimization program 200, a write load monitoring program 201, an optimum power allocation calculation program 202, a drive operation mode change notification program 203, and a cooling performance control program 204.
- As data used by these control programs, the memory 112 holds an enclosure power upper limit table 210, an enclosure cooling capacity table 211, an enclosure cooling state table 212, a mounted drive information table 220, a parity group configuration information table 221, an IO pattern learning table 222, and a drive operation mode state table 223. The memory 112 may also hold control programs and control data other than those described above.
- the storage control program 150 receives a read request or a write request from the host 100 and controls reading / writing of the flash drive 122 in the drive enclosure 103.
- The power performance optimization program 200 coordinates the write load monitoring program 201, the optimal power allocation calculation program 202, the drive operation mode change notification program 203, and the cooling performance control program 204 to optimize the power and performance of the storage system 101.
- the write load monitoring program 201 monitors the write processing load on each flash drive 122 and monitors the occurrence of bottlenecks.
- As the write processing load, for example, the IOPS (Input/Output Operations Per Second) of Write commands can be used.
- Alternatively, the write speed (MB/s) of each flash drive 122 may be used.
- the write load monitoring program 201 measures IOPS (load information) for each flash drive 122, and determines that the load of the flash drive 122 in which the IOPS exceeds a preset first threshold is “high load”. Further, the write load monitoring program 201 treats the flash drive 122 determined to be “high load” as a bottleneck.
- The optimal power allocation calculation program 202 operates when a bottleneck is detected by the write load monitoring program 201. Based on the write load information of the flash drives 122, the number of flash drives 122 mounted in the drive enclosure 103, the operation mode set for each flash drive 122, and the upper limit of power supplied by the drive enclosure 103, this program determines whether the power allocation can be increased for the flash drive 122 that is a bottleneck.
- When the optimum power allocation calculation program 202 determines that the allocation of power consumption to the bottleneck flash drive 122 can be increased, it determines, as described later, an operation mode that increases the power consumption of the bottleneck flash drive 122. For that drive, it selects a mode ID that relaxes the restriction on power consumption.
- The program also determines the operation mode of the flash drives 122 whose allocation is to be reduced, chosen from among the other flash drives 122 that can tolerate reduced power consumption. For those drives, it selects a mode ID that strengthens the restriction on power consumption.
- As the flash drives 122 whose allocation of power consumption is reduced, the optimum power allocation calculation program 202 selects “medium load” drives, that is, drives whose Write IOPS is equal to or lower than the first threshold and equal to or higher than the second threshold. The second threshold is smaller than the first threshold.
- the flash drive 122 that is the target of power consumption reduction may be a drive that does not have a problem in write performance even if power consumption is reduced.
- Alternatively, among the flash drives 122 whose write load state 2234 is “high load”, the optimal power allocation calculation program 202 can change a drive whose Write IOPS is equal to or lower than the first threshold to “medium load” and change its operation mode 2233 to “2” to reduce its allocated power consumption.
- The drive operation mode change notification program 203 actually changes the operation mode when the optimum power allocation calculation program 202 decides to change the allocation of power consumption for each flash drive 122, or when the state before the change is to be restored.
- The cooling performance control program 204 determines whether the cooling performance needs to be changed when the optimum power allocation calculation program 202 decides to change the power consumption allocation for each flash drive 122. That is, when the fan rotation speed needs to be changed, or when the state before the change is to be restored, the cooling performance control program 204 controls the power supply and fan unit 121 to change the rotation speed of the fan.
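The allocation logic described above can be illustrated with a minimal sketch. It is not the patent's implementation: the `Drive` class, the `plan_reallocation` function, the one-step mode changes, and the per-mode wattages in `MODE_POWER_W` are all assumptions made for illustration. The sketch relaxes "high load" drives by one mode, strengthens "medium load" drives only as far as needed to stay under the enclosure power upper limit, and orders the steps so that reductions come before increases.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical power draw (W) per operation mode ID; mode 1 is the least restricted.
MODE_POWER_W: Dict[int, float] = {1: 12.0, 2: 9.0, 3: 6.0}


@dataclass
class Drive:
    drive_id: str
    mode: int                    # current operation mode ID
    load: str                    # "high", "medium", or "low"
    change_allowed: bool = True  # stands in for the power allocation change flag

Step = Tuple[str, int]  # (drive_id, new_mode)


def plan_reallocation(drives: List[Drive], power_limit_w: float) -> Optional[List[Step]]:
    """Return ordered mode changes (strengthen first, then relax), or None if infeasible."""
    # Relaxation targets: bottleneck drives that can still move to a less restricted mode.
    relax = [d for d in drives if d.load == "high" and d.mode > 1 and d.change_allowed]
    if not relax:
        return []
    extra = sum(MODE_POWER_W[d.mode - 1] - MODE_POWER_W[d.mode] for d in relax)
    total = sum(MODE_POWER_W[d.mode] for d in drives)
    deficit = total + extra - power_limit_w  # power that must be freed elsewhere

    steps: List[Step] = []
    for d in drives:
        if deficit <= 0:
            break
        # Strengthening targets: "medium load" drives that can move to a more restricted mode.
        if d.load == "medium" and d.mode < max(MODE_POWER_W) and d.change_allowed:
            deficit -= MODE_POWER_W[d.mode] - MODE_POWER_W[d.mode + 1]
            steps.append((d.drive_id, d.mode + 1))
    if deficit > 0:
        return None  # not enough power can be freed; leave all modes unchanged
    # Increases are appended last so the power limit is never exceeded in between.
    steps.extend((d.drive_id, d.mode - 1) for d in relax)
    return steps
```

Returning the steps as an ordered list keeps the "reduce first, then increase" sequencing explicit for whatever component applies the changes.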
- the CPU 111 operates as a functional unit that provides a predetermined function by processing according to each program.
- the CPU 111 functions as a power performance optimization unit by performing processing according to the power performance optimization program 200.
- the CPU 111 also operates as a function unit that provides each function of a plurality of processes executed by each program.
- a computer and a computer system are an apparatus and a system including these functional units.
- Information such as the programs and tables that realize each function of the controller unit 104 can be stored in a storage subsystem, in a nonvolatile storage device such as a nonvolatile semiconductor memory, a hard disk drive, or an SSD, or on a computer-readable non-transitory data storage medium such as an IC card, an SD card, or a DVD.
- FIGS. 3A to 3F show examples of the information used when the power performance optimization program 200, the write load monitoring program 201, the optimum power allocation calculation program 202, the drive operation mode change notification program 203, and the cooling performance control program 204 operate.
- FIG. 3A is a diagram illustrating an example of the enclosure power upper limit table 210.
- One entry of the enclosure power upper limit table 210 includes an enclosure ID 2101 that stores identification information of each drive enclosure 103, a chassis power upper limit 2102 set for each drive enclosure 103, and an upper limit on the power supplied to a single flash drive installed in the drive enclosure 103.
- Although FIG. 1 shows an example with one drive enclosure 103, a plurality of drive enclosures 103 can be connected to the storage controller 102. For this reason, each drive enclosure 103 is identified by the enclosure ID 2101.
- The chassis power upper limit 2102 indicates the upper limit (maximum value) of power that can still be supplied after degeneration caused by a failure of the power supply and fan unit 121-1 or 121-2, even in a configuration where the plurality of power supply and fan units 121-1 and 121-2 normally supply power together.
- For example, when a redundant power supply (two units) is adopted, the maximum supply power (the rated maximum, not a momentary peak) must be power that can be supplied even during degeneration.
- Where two units remain operational after one unit fails, the maximum supply power is the power those two units can supply, that is, the sum of the two units.
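As a worked illustration of this rule, the sketch below computes a chassis power upper limit from the number of installed power supply units and the number of unit failures to be tolerated. The function name, its parameters, and the 800 W rating are hypothetical, and the sketch assumes identical units whose rated outputs simply add.

```python
def chassis_power_upper_limit(unit_rated_w: float,
                              installed_units: int,
                              tolerated_failures: int = 1) -> float:
    """Rated power that remains available after the tolerated number of unit failures."""
    surviving_units = installed_units - tolerated_failures
    if surviving_units < 1:
        raise ValueError("at least one power supply unit must survive")
    # Assumption: identical units whose rated outputs add linearly.
    return unit_rated_w * surviving_units


# Two redundant units: the limit is what one unit can supply on its own.
two_unit_limit = chassis_power_upper_limit(800.0, installed_units=2)
# Three units with one failure tolerated: the limit is the sum of two units.
three_unit_limit = chassis_power_upper_limit(800.0, installed_units=3)
```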
- the information in the enclosure power upper limit table 210 is set in advance by a management computer or the like (not shown).
- FIG. 3B is a diagram illustrating an example of the enclosure cooling capacity table 211.
- One entry of the enclosure cooling capacity table 211 includes an enclosure ID 2111 that stores identification information of each drive enclosure 103, a maximum cooling performance 2112 that stores the maximum value of the cooling capacity set for each drive enclosure 103, a standard cooling performance 2113 that stores the rated value of the cooling capacity of the drive enclosure 103, and a cooling function degeneration cooling performance 2114 that stores the cooling capacity available when the redundant power supply and fan unit 121 has degenerated.
- the maximum cooling performance 2112 indicates the cooling performance (W) when the cooling device including the redundant system is operated to the maximum out of the cooling performance of each drive enclosure 103.
- the standard cooling performance 2113 indicates the cooling performance of the power supply and fan unit 121 in the standard load state.
- The cooling function degeneration cooling performance 2114 indicates the maximum cooling performance when the redundant system has degenerated.
- FIG. 3C is a diagram illustrating an example of the enclosure cooling state table 212.
- One entry of the enclosure cooling state table 212 includes an enclosure ID 2121 that stores identification information of each drive enclosure 103, a cooling performance 2122 that indicates the cooling performance of each drive enclosure 103 currently connected to the controller unit 104, and a cooling state degeneration presence/absence 2123 that indicates whether the redundant system has degenerated.
- The controller unit 104 periodically updates the enclosure cooling state table 212 so that it reflects the latest state.
- FIG. 3D is a diagram illustrating an example of the mounted drive information table 220.
- One entry of the mounted drive information table 220 includes the following fields:
- an enclosure ID 2201 that stores an identifier of the drive enclosure 103 in which the flash drive 122 is mounted
- a drive ID 2202 that stores an identifier for each flash drive 122
- a mode number 2203 that stores the number of operation modes to which the flash drive 122 can be changed
- mode 1 power 2204 that stores the power consumption (W) when the operation mode is 1
- mode 1 write performance that stores the write performance (MB/s) when the operation mode is 1
- mode 2 power 2206 that stores the power consumption (W) when the operation mode is 2
- mode 2 write performance 2207 that stores the write performance (MB/s) when the operation mode is 2
- the power consumption and write performance when the operation mode is 3 (not shown)
- the information in the mounted drive information table 220 is set in advance by a management computer (not shown) or the like.
- FIG. 3E is a diagram illustrating an example of the parity group configuration information table 221.
- the parity group configuration information table 221 holds the combination information of the flash drive 122 groups constituting the RAID as a combination of a drive ID and a parity group ID.
- One entry of the parity group configuration information table 221 includes an enclosure ID 2211 that stores an identifier of the drive enclosure 103 in which the flash drive 122 is mounted, a drive ID 2212 that stores an identifier for each flash drive 122, a parity group ID 2213 that stores the identifier of the parity group to which the flash drive 122 belongs, and a power allocation change enable/disable flag 2214 that sets whether the allocation of power consumption may be changed.
- A drive set as a spare drive, which maintains availability by taking over the role of a flash drive 122 in which a failure has occurred, is identified by “SPARE” in the parity group ID 2213. The value that marks a spare in the parity group ID can be changed to suit each storage system.
- The power allocation change enable/disable flag 2214 sets whether a change in power allocation is allowed. If the flag is “permitted”, a power consumption change by the storage controller 102 is accepted; if it is “no”, the change is rejected.
- An example in which the flash drives 122 are used in RAID 5 or RAID 6 is shown, but the present invention is not limited to this.
- the flash drives 122 can be managed by a group of storage areas instead of a parity group.
- the IO pattern learning table 222 records or learns the data access history for each flash drive 122 or parity group 601 to 605.
- history information to be recorded and information to be learned include access source host 100 information, access destination flash drive 122 information, access destination parity group information, access command information, and access frequency.
- FIG. 3F is a diagram illustrating an example of the drive operation mode state table 223.
- the drive operation mode state table 223 holds the operation status of the flash drive 122 currently mounted in each drive enclosure 103.
- One entry of the drive operation mode state table 223 includes an enclosure ID 2231 that stores an identifier of the drive enclosure 103, a drive ID 2232 that stores an identifier for each flash drive 122, an operation mode 2233 that stores the operation mode of each flash drive 122, and a write load state 2234 that stores the write load state of each flash drive 122.
- The write load state of each flash drive 122 is set to high load if the above-mentioned IOPS exceeds the first threshold, medium load if the IOPS is equal to or lower than the first threshold and equal to or higher than the second threshold, and low load if the IOPS is below the second threshold.
- the resolution of the write load does not necessarily need to be three stages of high load, medium load, and low load, and finer resolution can be adopted.
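The three-stage classification can be sketched as follows. The `WriteLoad` enum and `classify_write_load` function are illustrative names, not part of the patent, and the boundary handling (medium load includes both threshold values) follows the description of the "medium load" drives given for the optimum power allocation calculation program 202.

```python
from enum import Enum


class WriteLoad(Enum):
    HIGH = "high load"
    MEDIUM = "medium load"
    LOW = "low load"


def classify_write_load(write_iops: float,
                        first_threshold: float,
                        second_threshold: float) -> WriteLoad:
    """Map a drive's Write IOPS onto the three write load states."""
    if second_threshold >= first_threshold:
        raise ValueError("the second threshold must be smaller than the first")
    if write_iops > first_threshold:
        return WriteLoad.HIGH      # treated as a bottleneck
    if write_iops >= second_threshold:
        return WriteLoad.MEDIUM    # candidate for reduced power allocation
    return WriteLoad.LOW
```

A finer resolution would only require more thresholds and more enum members; the comparison chain stays the same.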
- FIG. 4 is a block diagram showing an example of the configuration of the flash drive 122.
- the flash drive 122 includes a disk interface (DISK I / F in the figure) 400, a drive controller 401, a memory 402, a plurality of flash memory access channels 403, and a plurality of flash memory chips 404.
- the disk interface 400 is connected to the controller unit 104 via the enclosure interface 120 and the back-end interface 113.
- the connection between the enclosure interface 120 and the back-end interface 113 is shown as an example of two systems for redundancy.
- the drive controller 401 is a type of CPU that processes commands received from the disk interface 400.
- the drive controller 401 executes the chip control program 420 stored in the memory 402, refers to the data, and executes various processes.
- the memory 402 stores programs and data executed by the drive controller 401.
- the memory 402 holds an operation mode change program 410, an operation mode table 411, and a chip control program 420.
- the memory 402 may be used for storing other control programs and cache data.
- the memory 402 can be composed of SDRAM (Synchronous Dynamic Random Access Memory), but may be composed of memory elements other than those described above.
- the chip control program 420 processes a command received by the disk interface 400 and controls reading and writing to the flash memory chip 404.
- the chip control program 420 controls autonomous data movement (for example, garbage collection and leveling of the number of writes) according to the storage status of valid data in the flash memory chip 404 and the number of rewrites.
- The flash memory access channel 403 is a path that connects the drive controller 401 and the flash memory chips 404 and through which the drive controller 401 accesses them.
- the number of flash memory access channels 403 differs depending on the drive controller 401 to be used, and the larger the number, the more flash memory chips 404 can be connected. Also, by simultaneously activating a plurality of flash memory access channels 403, it becomes possible to access a plurality of flash memory chips 404 at the same time, improving performance while consuming a large amount of power during Write.
- the flash memory chip 404 is a NAND flash memory chip connected to the drive controller 401 via the flash memory access channel 403, and stores actual data and the like. A plurality of flash memory chips 404 are connected to each flash memory access channel 403.
- the operation mode change program 410 is a program stored in the memory 402 and executed by the drive controller 401.
- the operation mode change program 410 changes the operation mode of the flash drive 122 in accordance with an instruction from the controller unit 104.
- the operation mode of the flash drive 122 is stored in the operation mode table 411.
- the operation mode table 411 is stored in the memory 402 and is referred to when the operation mode is changed by the operation mode change program 410.
- Information included in the operation mode table 411 is collected by the controller unit 104 and set in the mounted drive information table 220.
- FIG. 5 is a diagram illustrating an example of information held in the operation mode table 411.
- For each mode ID 4111 that can be set in the flash drive 122, the operation mode table 411 holds the drive power consumption (W) 4112 in that operation mode, the write performance (MB/s) 4113 in that operation mode, and the write multiplicity 4114, which indicates the number of simultaneous accesses to the flash memory chips 404 at the time of writing.
- The maximum power consumption of the flash drive 122 is controlled through the write multiplicity 4114.
- With mode ID 1, the number of simultaneous accesses to the flash memory chips 404 at the time of writing is 256, and both power consumption and write performance are at their maximum.
- With mode ID 3, the number of simultaneous accesses to the flash memory chips 404 at the time of writing is 64, and both power consumption and write performance are at their minimum.
- With mode ID 2, the number of simultaneous accesses to the flash memory chips 404 at the time of writing is 128, and power consumption and write performance take intermediate values.
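One way to picture the operation mode table is as a small lookup keyed by mode ID. In the sketch below only the write multiplicities (256, 128, 64) come from the description; the wattages, the MB/s figures, and the names `OperationMode`, `OPERATION_MODE_TABLE`, and `fastest_mode_within` are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class OperationMode:
    mode_id: int
    power_w: float           # drive power consumption (W); placeholder value
    write_mb_s: float        # write performance (MB/s); placeholder value
    write_multiplicity: int  # simultaneous flash chip accesses during Write


OPERATION_MODE_TABLE: Dict[int, OperationMode] = {
    1: OperationMode(1, power_w=12.0, write_mb_s=1000.0, write_multiplicity=256),
    2: OperationMode(2, power_w=9.0, write_mb_s=600.0, write_multiplicity=128),
    3: OperationMode(3, power_w=6.0, write_mb_s=300.0, write_multiplicity=64),
}


def fastest_mode_within(power_budget_w: float) -> Optional[OperationMode]:
    """Pick the highest-performance mode whose power draw fits the budget."""
    candidates = [m for m in OPERATION_MODE_TABLE.values() if m.power_w <= power_budget_w]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m.write_mb_s)
```

Because power and performance both follow the write multiplicity, choosing a mode by power budget implicitly chooses how many chips may be written at once.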
- the flash drive 122 having a larger number of modes to be held than in FIG. 5 can be configured.
- the performance information can be configured to include a range such as a minimum value or a maximum value. Further, the above information may be collected and used by the controller unit 104.
- The mode ID (operation mode) 4111 of the flash drive 122 has been shown with three stages as an example, but the number of stages can be changed by increasing or decreasing the number of write multiplicity levels.
- An example in which power consumption and write performance are controlled through the operation mode of the flash drive 122 has been shown, but the present invention is not limited to this.
- the power consumption and write performance of the flash drive 122 may be controlled by the write multiplicity of the flash drive 122.
- FIG. 6 is a diagram showing the physical structure of the storage system 101 and the structure of the parity group.
- FIG. 6 shows an example in which the storage controller 102 and the drive enclosure 103 are mounted as different enclosures and connected by cables.
- The storage controller 102 and the drive enclosure 103 can be distinguished by assigning enclosure IDs that do not overlap.
- 24 flash drives 122 are mounted on the drive enclosure 103, but it is not always necessary to mount all of them.
- the configuration of these parity groups 601 to 605 is also shown as an example, and it is possible to adopt a different configuration or a configuration in which no parity group is set.
- The power supply and fan unit on the storage controller 102 side, the host 100, and the connection between the host 100 and the front-end interface 110 are omitted from the figure.
- FIG. 7 is a flowchart showing an example of processing performed by the storage controller.
- a control method in the storage system 101 as illustrated in FIGS. 1 and 6, that is, processing of the power performance optimization program 200 will be described using flowcharts.
- the power performance optimization program 200 and the other programs shown in this flowchart are loaded into the memory 112 and executed by the CPU 111 of the controller unit 104, but they may instead function as part of the storage control program 150 that controls the entire storage system 101. This process is executed repeatedly.
- the storage control program 150 that controls the entire storage system 101 calls the power performance optimization program 200 to start processing (700).
- the power performance optimization program 200 detects the state of the power supply and the fan unit 121 mounted in the drive enclosure 103, and determines whether there is an abnormality such as degeneration of the cooling device (720). The power performance optimization program 200 proceeds to step 701 if degeneration has not occurred in the power source and fan unit 121, and proceeds to step 710 if degeneration has occurred.
- the power performance optimization program 200 starts the write load monitoring program 201.
- the write load monitoring program 201 detects the load (for example, IOPS) of each flash drive 122 mounted in the drive enclosure 103, and treats a drive whose IOPS exceeds a preset first threshold as a bottleneck device (a device on which load is concentrated).
- the write load monitoring program 201 detects the load of each flash drive 122 and calculates the load state in three stages (high load, medium load, and low load) from the first threshold and the second threshold as described above.
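The three-stage classification can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the `LoadState` enum, and the assumption that the second threshold lies below the first are all hypothetical, and the treatment of a value exactly equal to a threshold is a guess.

```python
from enum import Enum


class LoadState(Enum):
    HIGH = "high load"
    MEDIUM = "medium load"
    LOW = "low load"


def classify_write_load(iops: float, first_threshold: float,
                        second_threshold: float) -> LoadState:
    """Map a measured IOPS value to one of three load stages.

    Assumption (not stated in the text): second_threshold <= first_threshold.
    A drive whose IOPS exceeds the first threshold is the bottleneck case.
    """
    if second_threshold > first_threshold:
        raise ValueError("second_threshold must not exceed first_threshold")
    if iops > first_threshold:
        return LoadState.HIGH
    if iops > second_threshold:
        return LoadState.MEDIUM
    return LoadState.LOW
```

The result would be what gets recorded as the load state 2234 in the drive operation mode state table 223.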
- the drive operation mode state table 223 is updated.
- the power performance optimization program 200 acquires the detection result of the write load monitoring program 201 and determines whether or not there is a flash drive 122 that is a bottleneck in the write process.
- the power performance optimization program 200 proceeds to step 703 if the write processing load is concentrated and there is a flash drive 122 that is a bottleneck, and proceeds to step 710 if there is no bottleneck.
- in step 703, the power performance optimization program 200 activates the optimum power allocation calculation program 202.
- the optimum power allocation calculation program 202 first reads the drive operation mode state table 223 and identifies the flash drive 122 with a high write load.
- the optimum power allocation calculation program 202 relaxes the limitation on the power consumption of the flash drive 122 whose write load status 2234 in the drive operation mode state table 223 is “high load” and selects it as an increase target whose allocation of power consumption is to be increased.
- the present invention is not limited to the above; the flash drives 122 of the parity group to which the flash drive 122 with a high write load belongs may all be selected as drives whose power consumption is to be increased.
- the optimum power allocation calculation program 202 refers to the parity group configuration information table 221 and, among the flash drives 122 whose power allocation change enable/disable flag 2214 is set to “permitted”, selects a drive that can shift to an operation mode with low power consumption.
- the flash drive 122 whose load state 2234 is “medium load” is selected as a power consumption reduction target (power consumption restriction enhancement target).
- in the above, the optimum power allocation calculation program 202 selects the flash drive 122 whose load state 2234 is “medium load”; alternatively, it may refer to the IO pattern learning table 222 to select the flash drive 122 whose power consumption is to be lowered.
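The selection of increase and reduction targets described above might look like the sketch below. The `DriveState` record and `select_targets` function are hypothetical names; the fields loosely mirror the load state 2234 and the power allocation change enable/disable flag 2214. Which load states count as reducible is left as a parameter, since this passage names “medium load” while other embodiments could choose differently.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass
class DriveState:
    drive_id: int
    load_state: str         # "high load", "medium load" or "low load"
    change_permitted: bool  # power allocation change enable/disable flag


def select_targets(
    drives: Iterable[DriveState],
    reducible_states: Sequence[str] = ("medium load",),
) -> Tuple[List[int], List[int]]:
    """Return (increase_targets, reduction_targets) as lists of drive IDs."""
    drives = list(drives)
    # Drives with a high write load get their power restriction relaxed.
    increase = [d.drive_id for d in drives if d.load_state == "high load"]
    # Only drives flagged as "permitted" may be moved to a lower-power mode.
    reduce = [
        d.drive_id
        for d in drives
        if d.change_permitted
        and d.load_state in reducible_states
        and d.drive_id not in increase
    ]
    return increase, reduce
```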
- the optimum power allocation calculation program 202 only needs to choose the number of flash drives 122 changed to the low power consumption operation mode so that the sum of the power consumption of the flash drives 122 whose power consumption is to be increased and the power consumption of the other flash drives 122 is equal to or less than the enclosure power upper limit 2102 of the enclosure power upper limit table 210.
- the optimum power allocation calculation program 202 compares the total power value with the enclosure power upper limit 2102 obtained from the enclosure power upper limit table 210 and with the maximum cooling performance 2112 of the enclosure cooling capacity table 211 to determine whether to change the power allocation.
- the optimum power allocation calculation program 202 proceeds to step 705 to change the power allocation when the total power value is lower than both the enclosure power upper limit 2102 and the maximum cooling performance 2112. If the total power value exceeds either the enclosure power upper limit 2102 or the maximum cooling performance 2112, the optimum power allocation calculation program 202 prohibits the power allocation change, returns to step 720, and repeats the above processing.
- the optimum power allocation calculation program 202 does not use an operation mode in which the drive power consumption in the operation mode table 411 is larger than the single drive supply power upper limit 2103 in the enclosure power upper limit table 210.
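The feasibility check on a planned allocation can be sketched as below. The function and parameter names are hypothetical, all quantities are assumed to be expressed in watts so that power and cooling performance are directly comparable, and treating a total exactly equal to a limit as acceptable is an assumption (the text only says the change is prohibited when a limit is exceeded).

```python
from typing import Mapping


def may_change_allocation(
    planned_power_w: Mapping[int, float],
    enclosure_power_upper_limit_w: float,
    max_cooling_performance_w: float,
    single_drive_supply_limit_w: float,
) -> bool:
    """Return True if the planned per-drive power allocation is allowed."""
    # No drive may use a mode that draws more than one slot can supply.
    if any(p > single_drive_supply_limit_w for p in planned_power_w.values()):
        return False
    total = sum(planned_power_w.values())
    # The total must stay within both the power budget and the cooling budget.
    return (total <= enclosure_power_upper_limit_w
            and total <= max_cooling_performance_w)
```

When the function returns False, the flow described above goes back to step 720 without changing any operation mode.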
- in step 705, the power performance optimization program 200 activates the drive operation mode change notification program 203 in order to change the operation mode of each flash drive 122.
- the drive operation mode change notification program 203 updates the operation mode 2233 of the drive operation mode state table 223 for the flash drive 122 whose mode ID has been changed.
- the power performance optimization program 200 starts the cooling performance control program 204.
- the cooling performance control program 204 determines whether the cooling performance needs to be enhanced.
- the cooling performance control program 204 refers to the cooling performance 2122 of the enclosure cooling state table 212 and, if the total power value calculated by the optimum power allocation calculation program 202 exceeds the cooling performance 2122, proceeds to step 707 to strengthen the cooling performance 2122.
- the cooling performance control program 204 maintains the cooling performance 2122 and proceeds to step 708.
- in step 707, the cooling performance control program 204 increases the rotational speed of the cooling fan mounted on the power supply and fan unit 121. After the cooling fan rotation speed change is completed, the power performance optimization program 200 starts the drive operation mode change notification program 203 again. If the designed cooling performance is sufficient for every operation mode of each flash drive 122, the cooling performance control program 204 and its reference information can be omitted.
- through the series of processing in steps 701 to 708, the power (operation mode) allocated to each flash drive 122 is changed according to the write load status; changing the operation mode of the flash drive 122 that has become a bottleneck improves its write performance, and the overall performance of the storage system 101 is improved.
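The branch points of this part of the flowchart can be summarized in a small transition function. This is only a reading aid under stated assumptions: `next_step` and its keyword flags are hypothetical, and because the text does not number the two intermediate decisions (limit check and cooling check), the sketch keys them on the preceding numbered steps 703 and 705.

```python
def next_step(current: int, *, degenerated: bool = False,
              bottleneck: bool = False, within_limits: bool = False,
              needs_more_cooling: bool = False) -> int:
    """Return the step that follows a decision point in the FIG. 7 flow."""
    if current == 720:   # power supply and fan unit degenerated?
        return 710 if degenerated else 701
    if current == 702:   # is some drive a write bottleneck?
        return 703 if bottleneck else 710
    if current == 703:   # does the planned allocation fit both limits?
        return 705 if within_limits else 720
    if current == 705:   # must the cooling performance be strengthened?
        return 707 if needs_more_cooling else 708
    raise ValueError(f"no branch modelled for step {current}")
```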
- the write load monitoring program 201 monitors whether the write load is concentrated on the specific flash drive 122 (702). If the load is not concentrated, the process proceeds to step 710.
- in step 710, the power performance optimization program 200 determines whether or not the power allocation of the flash drive 122 has been changed. If it has been changed, the process proceeds to step 711 to start processing for returning the operation mode.
- if it is determined in step 710 that the power allocation has not been changed, the process returns to step 720 and the above processing is repeated. That is, when the cooling device is degenerated and the power consumption of the flash drive 122 has not been changed, the power performance optimization program 200 prohibits the power consumption allocation change.
- the drive operation mode change notification program 203 updates the operation mode 2233 of the drive operation mode state table 223 for the flash drive 122 whose mode ID has been changed.
- the power performance optimization program 200 starts the cooling performance control program 204.
- in step 712, the cooling performance control program 204 refers to the enclosure cooling state table 212 and determines whether or not the cooling performance has been enhanced (changed). If the cooling performance has been strengthened (changed), the process proceeds to step 713, where the cooling performance control program 204 changes the rotation speed of the cooling fan mounted on the power supply and fan unit 121 back to the normal state.
- the cooling performance control program 204 updates the cooling performance 2122 of the enclosure cooling state table 212 to a value corresponding to the rotational speed by reducing the rotational speed.
- the cooling performance corresponding to the rotation speed of the cooling fan is set in advance.
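One way to use such a preset mapping is sketched below. The table values, the name `FAN_TABLE`, and the policy of picking the lowest sufficient speed are all assumptions for illustration; the text only states that a cooling performance is preset for each rotation speed and that cooling is strengthened when the total power exceeds the current cooling performance.

```python
from typing import Sequence, Tuple

# Hypothetical preset table: (fan speed in rpm, cooling performance in watts).
FAN_TABLE: Sequence[Tuple[int, float]] = (
    (3000, 400.0),
    (4500, 600.0),
    (6000, 800.0),
)


def select_fan_speed(total_power_w: float,
                     table: Sequence[Tuple[int, float]] = FAN_TABLE) -> int:
    """Return the lowest fan speed whose cooling performance covers the load."""
    for rpm, cooling_w in sorted(table):
        # Cooling is sufficient as long as the total power does not exceed it.
        if cooling_w >= total_power_w:
            return rpm
    raise ValueError("no fan speed in the table covers the requested power")
```

The same lookup covers both directions: a higher total power yields a higher speed (step 707), and a lower total power yields the normal speed again (step 713).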
- the drive operation mode change notification program 203 updates the operation mode 2233 of the drive operation mode state table 223 for the flash drive 122 whose mode ID has been changed.
- the power allocation state is optimized by performing the processing of steps 710 to 714, and control is performed so that operation can continue even when the power supply and fan unit 121 is in a degenerated state.
- control in step 720 can be expanded, and for example, when the power supply and fan unit 121 is degenerated, all the flash drives 122 can be changed to a mode with less power consumption.
- the maximum cooling performance of the power supply and fan unit 121 can be lowered, and the cost of the storage system 101 can be reduced by adopting a lower priced power supply and fan unit 121.
- the termination condition of the power performance optimization program 200 is not explicitly shown in this flowchart, but the program is assumed to terminate according to the termination condition of the program that controls the entire storage system.
- an example in which power optimization is performed according to the flowchart of the present invention shown in FIG. 7 will be described with reference to FIGS. 8A, 8B, 9A, and 9B.
- FIG. 8A is a diagram illustrating an example of the enclosure cooling state table 212 before the power allocation is changed.
- FIG. 8B is a diagram illustrating an example of the drive operation mode state table 223 before the power allocation is changed.
- FIG. 9A is a diagram showing an example of the enclosure cooling state table 212 after the power allocation is changed.
- FIG. 9B is a diagram showing an example of the drive operation mode state table 223 after the power allocation is changed.
- the write load is concentrated on the parity group 2 (602), resulting in a high load and a bottleneck.
- the parity group 1 (601), the parity group 3 (603), and the spare group (605) are in a low load state, and the parity group 4 (604) is in a medium load state.
- FIG. 9A and FIG. 9B show the result of performing the power reallocation by the power performance optimization program 200 on the storage system 101 in this state.
- the write load from the host 100 as shown in FIGS. 6, 8A, 8B, 9A, and 9B is predicted using the IO pattern learning table 222, and the power allocation is recalculated (703). In addition to the above, it can also be applied to the write processing generated by the storage controller 102 itself. A specific example is shown using FIG.
- FIG. 10 is a diagram showing the physical structure of the storage system 101 and the configuration of the parity group when the flash drive 122 fails.
- FIG. 11 shows the state of the flash drive 122 in FIG.
- FIG. 11 is a diagram illustrating an example of a drive operation mode state table when a failure occurs in the flash drive.
- parity group 1 (degenerate) 601A has lost its RAID 5 redundancy, and because one flash drive 122 of parity group 1 (active spare) 601B has been added, the storage controller 102 rebuilds the RAID configuration.
- the data is read from the three flash drives 122 belonging to the parity group 1 (degenerate) 601A, the parity calculation is performed, and the calculation result is written to the flash drive 122 of the parity group 1 (active spare) 601B.
- Read is concentrated on the flash drive 122 of the parity group 1 (degenerate) 601A, while Write is concentrated on the flash drive 122 of the parity group 1 (active spare) 601B.
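The rebuild traffic pattern follows from how RAID 5 reconstruction works: each missing block is the bytewise XOR of the corresponding blocks on the surviving drives. The sketch below illustrates that standard calculation for one stripe; the function name is hypothetical and the code is not taken from the patent.

```python
from functools import reduce
from typing import Sequence


def rebuild_block(surviving_blocks: Sequence[bytes]) -> bytes:
    """Reconstruct the missing block of a RAID 5 stripe by bytewise XOR."""
    if not surviving_blocks:
        raise ValueError("at least one surviving block is required")
    length = len(surviving_blocks[0])
    if any(len(block) != length for block in surviving_blocks):
        raise ValueError("all blocks in a stripe must have the same length")
    # XOR the bytes at each offset across all surviving drives.
    return bytes(
        reduce(lambda a, b: a ^ b, column) for column in zip(*surviving_blocks)
    )
```

Every rebuilt block costs one read per surviving drive and one write to the spare, which is why the write load concentrates on the single spare drive.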
- by applying the power performance optimization program 200, including the processing shown in FIG. 7 of the present invention, the power of the storage system 101 can be optimized and the performance can be improved.
- FIG. 12 is a diagram showing an example of a drive operation mode state table during reconstruction of a RAID configuration.
- the operation mode of the flash drive 122 belonging to the parity group 1 (active spare) 601B is changed to the mode ID.
- the rebuilding time of the RAID configuration can be greatly shortened.
- by providing the parity group configuration information table 221 with a plurality of power allocation change enable/disable flags, implementations are also possible that change the range of power reallocation depending on, for example, whether the access is data access from the host 100 or access by the storage controller 102 such as RAID configuration reconstruction.
- when the storage controller 102 detects a flash drive 122 on which writing is concentrated, it reduces the power consumption allocated to a flash drive 122 with a low write load.
- the storage controller 102 relaxes the restriction on the power consumption of the flash drive 122 on which writing is concentrated and increases its power, which improves the performance of the entire storage system 101 within the range of power that can be supplied.
- the flash drive 122 on which writing is concentrated has a write load exceeding the first threshold and is the target of power consumption restriction relaxation; a flash drive 122 with a low write load can tolerate a reduction in power consumption and is the target of power reduction (the target of restriction tightening).
- the cooling performance can be increased, so that the flash drive 122 can be prevented from overheating and the performance of the flash drive 122 can be stabilized.
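The abstract of this publication states that the power of the relaxation target is raised only after the power of the tightening targets has been lowered. A minimal sketch of why that ordering matters is given below; the helper names and the wattage figures in the usage note are hypothetical.

```python
from typing import Dict, List, Tuple


def ordered_changes(current_w: Dict[int, float],
                    planned_w: Dict[int, float]) -> List[Tuple[int, float]]:
    """List the mode changes so that reductions are applied before increases."""
    changes = [(d, p) for d, p in planned_w.items() if p != current_w[d]]
    # A negative delta (reduction) sorts ahead of a positive delta (increase).
    return sorted(changes, key=lambda change: change[1] - current_w[change[0]])


def peak_total_power(current_w: Dict[int, float],
                     changes: List[Tuple[int, float]]) -> float:
    """Return the highest enclosure total seen while applying the changes."""
    state = dict(current_w)
    peak = sum(state.values())
    for drive_id, power in changes:
        state[drive_id] = power
        peak = max(peak, sum(state.values()))
    return peak
```

With three drives at 10 W each and a plan of 14 W, 8 W, and 8 W, applying the reductions first keeps the running total at or below the original 30 W, whereas applying the increase first would briefly reach 34 W.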
- the selection of the flash drive 122 whose power consumption is to be increased and the selection of the flash drive 122 whose power consumption is to be reduced are not limited to the above embodiment.
- for example, all flash drives 122 other than the flash drive 122 whose power consumption is to be increased may be treated as power consumption reduction targets and set to operation mode “3”; the selection can be changed as appropriate depending on the balance between the performance required of the storage system 101 and the power consumption.
- the flash drive 122 employing the NAND flash memory as the nonvolatile semiconductor memory is shown as the nonvolatile semiconductor drive.
- the present invention is not limited to this example. Therefore, the present invention can be applied to a nonvolatile semiconductor drive in which power consumption increases with the increase of.
- the present invention is not limited to the above-described embodiments and includes various modifications.
- the above-described embodiments are described in detail for easy understanding of the present invention, and are not necessarily limited to those having all the configurations described.
- a part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of one embodiment.
- any of the additions, deletions, or substitutions of other configurations can be applied to a part of the configuration of each embodiment, either alone or in combination.
- each of the above-described configurations, functions, processing units, processing means, and the like may be realized by hardware by designing a part or all of them with, for example, an integrated circuit.
- each of the above-described configurations, functions, and the like may be realized by software by the processor interpreting and executing a program that realizes each function.
- Information such as programs, tables, and files that realize each function can be stored in a memory, a hard disk, a recording device such as an SSD (Solid State Drive), or a recording medium such as an IC card, an SD card, or a DVD.
- control lines and information lines indicate what is considered necessary for the explanation, and not all the control lines and information lines on the product are necessarily shown. Actually, it may be considered that almost all the components are connected to each other.
Abstract
The invention relates to a storage system having a storage device that comprises a plurality of nonvolatile semiconductor drives, and a control device that comprises a processor and a memory and controls the storage device. The processor measures the write load of each of the nonvolatile semiconductor drives. When a nonvolatile semiconductor drive whose write load exceeds a predetermined threshold is detected, the processor selects that drive as the drive to be subjected to relaxation of the restriction on power consumption, and selects, from among the nonvolatile semiconductor drives other than the drive subject to relaxation, a nonvolatile semiconductor drive capable of accepting a reduction in power consumption as the drive to be subjected to tightening of the restriction on power consumption. After the power consumption of the drive subject to tightening of the restriction has been reduced, the processor increases the power consumption of the drive subject to relaxation of the restriction.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2017/015988 WO2018193608A1 (fr) | 2017-04-21 | 2017-04-21 | Système de stockage, procédé de commande pour dispositif de stockage et dispositif de commande de stockage |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2017/015988 WO2018193608A1 (fr) | 2017-04-21 | 2017-04-21 | Système de stockage, procédé de commande pour dispositif de stockage et dispositif de commande de stockage |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2018193608A1 (fr) | 2018-10-25 |
Family
ID=63856539
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2017/015988 Ceased WO2018193608A1 (fr) | 2017-04-21 | 2017-04-21 | Système de stockage, procédé de commande pour dispositif de stockage et dispositif de commande de stockage |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2018193608A1 (fr) |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 17906271; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 17906271; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: JP |