Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, but not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.
An investigation of a number of data centers shows that the electromechanical hardware under the cold source BA control system is typically built to A-level or Tier III+, while the intelligence of its monitoring reaches only B-level or Tier II, or even lower, so monitoring faults are frequent. A large data center usually adopts a centralized refrigeration system; if a problem in that system is not found and handled in time, the whole data center can suffer a high-temperature outage. The cold source BA control system of the data center manages the start and stop of a large number of refrigeration devices, valve switching, mode switching, emergency switching on faults, and so on. It is the eyes and hands of the data center: once the BA control system fails or is paralyzed, faults in the existing electromechanical hardware cannot be discovered effectively and in time, and major failures can result. An analysis of data center faults in recent years shows that this mismatch between the level of the intelligent cold source BA control system and the level of the electromechanical hardware is a hidden danger to stable operation, and a high-reliability cold source BA control system architecture for large data centers is urgently needed to guide the construction of data center BA control systems. At present, mainstream domestic data centers are national standard A-level data centers, roughly equivalent to Tier IV- or Tier III+ in the Uptime tier classification. Taking an A-level data center as the research object, the embodiment of the invention provides a high-reliability cold source BA control system for a large data center.
Fig. 1 is a schematic structural diagram of a cold source BA control system according to an embodiment of the present invention. As shown in Fig. 1, the control system adopts a four-layer network communication structure. The first layer is the field collection layer; it consists of the hardware components of the cold source system, provides the monitoring and control points, and includes the air supply and exhaust system, the sewage pit system, the integrated water chilling unit, the integrated cold storage tank, the integrated water source heat pump unit, and so on. The second layer is the system access layer, which uses DDC (direct digital control) controllers together with I/O modules or gateways to connect to the bottom-layer equipment. The third layer is the system control layer, in which NAE (network automation engine) control devices are interconnected; it uses the BACnet communication protocol to connect the control equipment (DDCs and I/O modules or gateways), performs logic operations, and issues decision and control commands. The fourth layer is the system display layer, which uses the TCP/IP protocol to connect the system server, the workstations, and third-party systems.
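The four layers can be written down as a small data model, which is convenient when checking which layers a monitoring point must pass through before it reaches an operator. The following is only an illustrative sketch: the class name, field names, and the path_upward helper are assumptions introduced here and are not part of the embodiment; the protocol field is left empty where the description names no protocol for a layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    level: int      # 1 = lowest (field), 4 = highest (display)
    name: str
    role: str
    protocol: str   # empty where the description names no protocol


# Layer contents follow the description of Fig. 1; the wording is abbreviated.
LAYERS = (
    Layer(1, "field collection", "cold source hardware providing monitoring and control points", ""),
    Layer(2, "system access", "DDC controllers with I/O modules or gateways", ""),
    Layer(3, "system control", "NAE devices performing logic and issuing commands", "BACnet"),
    Layer(4, "system display", "system server, workstations, third-party systems", "TCP/IP"),
)


def path_upward(start_level):
    """Return the names of the layers crossed from start_level up to the display layer."""
    return [layer.name for layer in LAYERS if layer.level >= start_level]
```

For example, path_upward(2) lists the access, control, and display layers, which are the layers whose failure can hide a field fault from the operator.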
Fig. 2 is a flow chart of optimizing a cold source BA control system according to an embodiment of the present invention, and the embodiment is described below with reference to the accompanying drawings. The cold source BA control system for controlling a data center provided by this embodiment includes a plurality of cold source BA control subsystems, each of which controls a corresponding refrigeration subsystem. All the refrigeration equipment in the data center is divided into a plurality of refrigeration subsystems, and the number of cold source BA control subsystems equals the number of refrigeration subsystems. For example, the cold source system of the data center contains a large amount of refrigeration equipment, including pipelines, valves, cooling towers, fans, water chilling units, plate heat exchangers, water collectors and distributors, cold storage tanks, cooling water pumps, chilled water pumps, cold storage pumps, water source heat pumps, electric heat tracing, water replenishing pumps, sewage pumps, sumps, and the various related sensors and actuators. This equipment forms 3 sets of cold source systems, which serve as the refrigeration subsystems, and the effective operation and fault monitoring of all these facilities depend on the cold source BA control system of the data center. In this embodiment, 3 cold source BA control subsystems are provided accordingly; they work independently of one another, and each controls its corresponding refrigeration subsystem. In this embodiment, each cold source BA control subsystem includes a plurality of DDC controllers connected in series on one bus and one NAE device, and the DDC controllers are all in communication connection with the NAE device.
Specifically, the high-reliability cold source BA control system first classifies the DDCs of the refrigeration station and the machine room. The group of DDCs that control the same system is connected in series on one bus and controlled by one NAE, so that the protection level of the monitoring is consistent with the protection level of the electromechanical configuration.
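The grouping rule described above (DDCs that control the same refrigeration subsystem share one bus and one NAE) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary layout, and the DDC, NAE, and subsystem labels are all hypothetical and do not come from the embodiment.

```python
from collections import defaultdict


def build_control_subsystems(ddc_to_system):
    """Group DDCs by the refrigeration subsystem they control.

    ddc_to_system maps a DDC name to the refrigeration subsystem it serves.
    Each group receives its own bus (the ordered list of its DDCs) and
    exactly one NAE, so the number of control subsystems equals the number
    of refrigeration subsystems.
    """
    groups = defaultdict(list)
    for ddc, system in ddc_to_system.items():
        groups[system].append(ddc)
    return {
        system: {"nae": f"NAE-{index}", "bus": sorted(ddcs)}
        for index, (system, ddcs) in enumerate(sorted(groups.items()), start=1)
    }


# Hypothetical inventory: four DDCs spread over three cold source systems.
EXAMPLE_DDCS = {
    "DDC-1": "cold-source-1",
    "DDC-2": "cold-source-1",
    "DDC-3": "cold-source-2",
    "DDC-4": "cold-source-3",
}
```

Calling build_control_subsystems(EXAMPLE_DDCS) yields three control subsystems, each with a distinct NAE and a bus that carries only the DDCs of one cold source system.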
The high-reliability cold source BA control system architecture is described below by way of example. At present, 3 machine rooms have been built at the China Mobile (Hohhot) data center, all of them national standard A-level. The electromechanical facilities of a national standard A-level large data center are configured as a fault-tolerant system, so the cold source BA control system of an A-level data center should also be configured as a fault-tolerant system. This prevents a single-point fault in the cold source BA control system from putting a whole refrigeration system, or the whole cold source BA control system, out of control and thereby causing a major failure. The cold source BA control system is the control brain of the centralized water cooling system of the data center, and its importance is self-evident: a single system carries the cooling load of more than 3000 cabinets and more than 3000 thousand servers. The DDCs in the access layer of the cold source BA control system adopt an RS-485 bus architecture. The refrigeration equipment of a refrigeration subsystem is connected to the DDC controllers through the MS/TP protocol, the DDC controllers are connected to the NAE device through an RS-485 bus, the NAE device is connected to the BA intelligent monitoring server through the TCP/IP protocol, and the BA intelligent monitoring server is connected to the data center comprehensive management platform through the OPC and SNMP protocols.
With the prior-art arrangement, all the DDC controllers of a machine room building and its refrigeration station are connected in series on one bus. A fault in one connecting cable, in the wiring terminal of one DDC, or in one DDC itself leaves every DDC downstream of the fault unable to work normally, and the equipment under those DDCs goes out of control. Moreover, only 1 NAE device is configured for a whole machine room building and refrigeration station. On a software fault or hardware fault of that single NAE, a fault of the link from the NAE to the upstream switch, or a fault of the bus toward the DDCs, the conventional cold source BA control system can no longer be used normally, and the refrigeration system may go out of control or even be interrupted, which is a major safety risk. Even if the chiller system has 2+1 protection and the power supply system has 1+1 protection at that moment, the BA intelligent control system is out of control and cannot control the high-pressure water chilling units, water source heat pump units, circulating water pumps, cooling towers, pipeline valves, and other equipment in the on-site refrigeration station. In other words, the A-level electromechanical configuration cannot deliver the protection it was designed for, because the level of intelligent monitoring is too low.
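The effect of a single fault on a series bus can be illustrated with a short simulation. The sketch below assumes a simplified model in which a failed DDC (or its terminal or cable) cuts off everything downstream of it on the same bus; the DDC and bus names and the two-DDCs-per-bus layout are hypothetical and chosen only for illustration.

```python
def reachable_ddcs(chain, failed):
    """Return the DDCs the NAE can still reach on one series bus.

    chain lists the DDCs in bus order, starting at the NAE end.
    failed is the set of DDCs whose device, terminal, or cable has faulted.
    """
    reachable = []
    for ddc in chain:
        if ddc in failed:
            break  # the bus is broken here; everything downstream is cut off
        reachable.append(ddc)
    return reachable


def reachable_on_split(buses, failed):
    """Apply reachable_ddcs to each independent bus."""
    return {name: reachable_ddcs(chain, failed) for name, chain in buses.items()}


# Hypothetical layout: six DDCs, two per refrigeration subsystem.
SINGLE_BUS = ["DDC-1", "DDC-2", "DDC-3", "DDC-4", "DDC-5", "DDC-6"]
SPLIT_BUSES = {
    "bus-1": ["DDC-1", "DDC-2"],
    "bus-2": ["DDC-3", "DDC-4"],
    "bus-3": ["DDC-5", "DDC-6"],
}
```

With a fault at DDC-2, the single bus leaves only DDC-1 reachable, whereas with one bus per subsystem the fault is confined to bus-1 and the other two buses are unaffected.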
In the embodiment of the invention, the centralized refrigeration system of the China Mobile (Hohhot) data center is taken as an example. Each machine room is provided with a chiller system with N + M protection (2+1 for machine rooms B01, B02, and B03). Each chiller system comprises a high-pressure water chilling unit, a circulating cooling water pump, a circulating chilled water pump, a cooling tower, a plate heat exchanger, pipeline valves, and other equipment, and the refrigeration system can work in a coordinated manner only when all of this equipment is controllable. If the equipment of one system is scattered across different NAEs, or one NAE controls components of several chiller systems, the failure of a single NAE stops several chiller systems. Similarly, when one NAE controls different parts of all 3 sets of refrigeration systems, a failure of that NAE leaves none of the 3 sets able to work normally, so the cold source of the refrigeration station is interrupted and the machine room shuts down at high temperature. Therefore, all the components of each system should be controlled by one NAE, and N + M sets of refrigeration systems should be provided with N + M NAE control devices. The 2+1 protected chiller system of each building of the China Mobile (Hohhot) data center therefore needs 3 NAEs, and the original single-bus structure is changed into a 4-bus structure, in which the fourth bus serves the fire-fighting exhaust fans, fresh air fans, and similar equipment of the machine room. When any one NAE fails, two sets of refrigeration systems of the machine room remain usable, so stable operation of the data center can be guaranteed.
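The argument for one NAE per chiller system can be checked mechanically. The sketch below assumes a simplified rule taken from the paragraph above: a chiller system is controllable only if none of the NAEs controlling its components has failed. The three assignment tables and all names are hypothetical examples, not configurations from the embodiment.

```python
def systems_still_controllable(assignment, failed_nae):
    """Return the chiller systems that remain controllable after one NAE fails.

    assignment maps each chiller system to the set of NAEs that control
    any of its components.
    """
    return sorted(s for s, naes in assignment.items() if failed_nae not in naes)


def survives_any_single_nae_failure(assignment, required):
    """True if at least `required` systems stay controllable whichever NAE fails."""
    all_naes = set().union(*assignment.values())
    return all(
        len(systems_still_controllable(assignment, nae)) >= required
        for nae in all_naes
    )


# 2+1 protection: N = 2 systems must stay available.
DEDICATED = {"chiller-1": {"NAE-1"}, "chiller-2": {"NAE-2"}, "chiller-3": {"NAE-3"}}
SHARED = {"chiller-1": {"NAE-1"}, "chiller-2": {"NAE-1"}, "chiller-3": {"NAE-1"}}
SCATTERED = {
    "chiller-1": {"NAE-1", "NAE-2"},
    "chiller-2": {"NAE-2", "NAE-3"},
    "chiller-3": {"NAE-3", "NAE-1"},
}
```

Only the dedicated assignment keeps two chiller systems controllable after any single NAE failure; the shared assignment loses all three, and the scattered assignment loses two at a time.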
On the basis of the above embodiment, the system also optimizes power supply reliability: the DDC controllers and NAE devices in the system are all powered from UPS (uninterruptible power supply), and the system includes two sets of UPS that serve as main and standby for each other. Specifically, in existing systems the system server usually has only one UPS guaranteeing its power, which is a single-point hidden danger and insufficiently reliable; if that supply is interrupted, the whole refrigeration system is paralyzed. The DDCs have no UPS-backed supply, so if the mains is cut off, neither the pipeline valves nor the control system can work normally, and no instruction can be sent to the downstream equipment. A pipeline valve stops on power failure, and when power returns it has to close before it can open again; if the DDC is unpowered while the pipeline valve is powered, the valve closes directly, interrupting the chilled water or cooling water flow, which is a serious hidden danger. Therefore, the important loads of the cold source BA control system, such as the control power supply of the water chilling units, the electric valves, the DDCs, and the NAEs, must all be connected to UPS. This avoids the situation in which an interruption of the mains supply to the control power lengthens the time needed to start the chillers, stops the refrigeration system, and overheats the machine room.
In the system, the important loads of the existing cold source system, such as the control power supply of the water chilling units, the electric valves, the DDCs, and the NAEs, are all connected to UPS. Single-powered equipment is connected to a 1+1 parallel UPS system consisting of 2 UPSs; dual-powered equipment is connected to a 2N UPS protection system formed by 2 UPSs, or even to a 3N system. The two sets of UPS serve as main and standby for each other, which improves the power supply reliability of the control components of the cold source system.
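The difference between these supply arrangements can be expressed as a single-failure check. The sketch below uses a deliberately simplified model, assumed here for illustration: a device has one or more power inputs, each input is fed by a set of UPSs, and the device stays powered while any UPS on any input is healthy. The UPS names and the four example configurations are hypothetical.

```python
def powered(inputs, failed):
    """True if at least one UPS feeding at least one input is still healthy."""
    return any(ups not in failed for feed in inputs for ups in feed)


def tolerates_single_ups_loss(inputs):
    """True if the device stays powered whichever single UPS fails.

    A device with no UPS at all (mains only) is treated as not tolerant.
    """
    all_ups = {ups for feed in inputs for ups in feed}
    return bool(all_ups) and all(powered(inputs, {ups}) for ups in all_ups)


# Single-powered device on a 1+1 parallel system: one input, two UPSs behind it.
SINGLE_INPUT_PARALLEL = [{"UPS-A", "UPS-B"}]
# Dual-powered device on a 2N system: two inputs, one independent UPS each.
DUAL_INPUT_2N = [{"UPS-A"}, {"UPS-B"}]
# Arrangements criticized in the text: a server on one UPS, a DDC on mains only.
SERVER_ONE_UPS = [{"UPS-A"}]
DDC_NO_UPS = []
```

Under this model both redundant arrangements pass the check, while the single-UPS server and the DDC without UPS do not. The model ignores battery autonomy, bypass paths, and distribution faults.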
On the basis of the above embodiment, the system also optimizes network protection: the DDC controllers in each cold source BA control subsystem communicate with the NAE device over aggregated multiple links. In existing systems, the path runs from a system measurement point to a DDC controller, then to an NAE or gateway, and on to the server. When any network cable, switch, router, pigtail, or optical fiber on that path fails, or even when electromagnetic interference or optical fiber attenuation is high, the intelligent monitoring system is interrupted. Generally, in the live network of a cold source BA control system, the connection between the DDC controllers and the NAE control device, the connection between the NAE control device and the BA control system server, and the connection between the BA control system server and the monitoring center are all physical single links, and the BA server is also connected to the data center comprehensive management platform by a physical single link, so network safety carries a serious hidden danger. In the system provided by the embodiment of the invention, link aggregation, device active/standby redundancy, device load sharing, or similar measures are adopted to strengthen the robustness of the link between two aggregation or core devices. The physical links used for link aggregation, active/standby redundancy, or load sharing must follow physically different routes, that is, different laying paths, different optical cables, different wavelength division systems, and so on, so that no single-point hidden danger remains.
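The requirement that redundant links follow physically different routes can be checked by comparing the physical elements each link depends on. The sketch below is illustrative only: the representation of a link as a set of element names, the function names, and the example cables, ducts, and switches are all assumptions made for this example.

```python
from itertools import combinations


def shared_elements(link_a, link_b):
    """Return the physical elements (cables, ducts, switches, ...) two links have in common."""
    return set(link_a) & set(link_b)


def is_physically_diverse(links):
    """True if there are at least two links and no pair shares a physical element."""
    if len(links) < 2:
        return False  # a single link is a single point of failure by itself
    return all(not shared_elements(a, b) for a, b in combinations(links, 2))


# Hypothetical aggregation groups between an NAE and the BA server.
DIVERSE_GROUP = [
    {"cable-1", "switch-1", "duct-north"},
    {"cable-2", "switch-2", "duct-south"},
]
SAME_DUCT_GROUP = [
    {"cable-1", "switch-1", "duct-north"},
    {"cable-2", "switch-2", "duct-north"},
]
```

The second group aggregates two cables and two switches but still fails the check, because both links run through the same duct and a single physical event there would break both.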
When the electromechanical design of a data center is carried out, a high-reliability cold source BA control system at the same level as the electromechanical protection level needs to be designed at the same time. The cold source BA control system provided by the embodiment of the invention covers three aspects: a high-reliability cold source BA control system architecture, power supply reliability optimization for the BA control system equipment, and network protection optimization for the BA control system. During operation, a single equipment failure, an external power interruption, or maintenance and overhaul of the cold source system then does not interrupt the operation of the electronic information system. This provides a reference for the design or retrofit of the cold source BA control system of a national standard A-level large data center.
The embodiment of the invention also provides a data center, which comprises a plurality of refrigeration subsystems, wherein all refrigeration equipment included in the data center is divided into the plurality of refrigeration subsystems; a large amount of refrigeration equipment is assembled in a cold source system of the data center, and comprises pipelines, valves, a cooling tower, a fan, a water chilling unit, a plate heat exchanger, a water collector, a cold storage tank, a cooling water pump, a chilled water pump, a cold storage pump, a water source heat pump, electric tracing, a water replenishing pump, a sewage pump, a sump and various related sensors, an actuating mechanism and the like, and the equipment is divided into a plurality of refrigeration subsystems.
The data center further comprises a cold source BA control system for controlling the data center. The cold source BA control system includes a plurality of mutually independent cold source BA control subsystems, each of which controls a corresponding refrigeration subsystem, and the number of cold source BA control subsystems equals the number of refrigeration subsystems. Each cold source BA control subsystem includes a plurality of DDC controllers connected in series on one bus and one NAE device, and the DDC controllers are all in communication connection with the NAE device.
Further, in this data center, all the refrigeration equipment, the DDC controllers, and the NAE devices are powered from UPS (uninterruptible power supply); two sets of UPS are included, and the two sets serve as main and standby for each other.
Further, in the data center, the DDC controller in each cold source BA control subsystem and the NAE equipment adopt multi-link aggregation to communicate.
Further, in the data center, the refrigeration equipment in each refrigeration subsystem is connected to the DDC controllers through the MS/TP protocol, and the DDC controllers in each cold source BA control subsystem are connected to the NAE device through an RS-485 bus.
The cold source BA control system included in the data center provided by this embodiment may adopt the cold source BA control system provided by the above embodiment, and the specific structure and function thereof are described in detail in the foregoing description and will not be described herein again.
The data center provided by this embodiment includes a high-reliability cold source BA control system, so that the monitoring facilities are also configured as a fault-tolerant system: during operation, misoperation of the BA control system, equipment failure, external power interruption, or maintenance and overhaul must not interrupt the operation of the cold source system. The optimization covers the architecture of the high-reliability cold source BA control system, the power supply reliability of the BA system equipment, and the network protection of the cold source BA control system.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.