CN119814519A

CN119814519A - A method for automatically counting assets of a host system

Info

Publication number: CN119814519A
Application number: CN202411767376.5A
Authority: CN
Inventors: 唐培全; 任寒; 申大伟; 严明; 赵宗慧
Original assignee: Shandong Yuanlu Information Technology Co ltd; Huaneng Shandong Power Generation Co Ltd
Current assignee: Shandong Yuanlu Information Technology Co ltd; Huaneng Shandong Power Generation Co Ltd
Priority date: 2024-12-04
Filing date: 2024-12-04
Publication date: 2025-04-11

Abstract

The present invention relates to the technical field of host asset management, and discloses a method for automatic inventory of host system assets, including the following steps: multi-protocol adaptation and dynamic selection: according to the type of host, environmental attributes and network characteristics, a protocol selection engine is used to realize dynamic protocol selection and adaptation; data collection and integration: through asynchronous concurrent collection tasks, multi-source data is collected in real time to complete data cleaning, deduplication and standardization; distributed storage and association analysis: the collected host asset data is stored in a distributed database, and the database is used to establish a dependency model between assets. A multi-protocol adaptation and dynamic selection technical solution based on host attribute priority rules is adopted, and the protocol selection engine is combined with a dynamic switching mechanism to realize efficient collection of multi-type host assets in a heterogeneous environment, and achieve the technical effect of selecting the optimal protocol for different host type systems.

Description

Method for automatically checking host system assets

Technical Field

The invention relates to the technical field of host asset management, in particular to a method for automatically checking host system assets.

Background

With the rapid development of informatization and cloud computing technologies, enterprise IT environments are becoming increasingly complex, and the types, distribution and running states of host assets are dynamic and diverse. Conventional asset management techniques typically rely on a static, single-protocol data collection scheme combined with manual configuration to record and maintain host assets. However, these techniques show insufficient adaptability, timeliness and intelligence in complex environments, and they struggle to meet the requirements of modern IT operation and maintenance.

Conventional asset management tools typically support only one or a few protocols (e.g., SNMP, SSH, or WMI), which limits their scope of application. SNMP is widely used for data acquisition from network equipment, but offers insufficient support for operating-system-level information (such as service state and hardware configuration). SSH and WMI are applicable to Linux and Windows systems respectively, but have difficulty covering dynamic resources in modern cloud environments and containerized platforms. The prior art lacks support for emerging protocols (e.g., Redfish, the Kubernetes API), which makes it difficult to collect comprehensive asset data efficiently in heterogeneous environments. This lack of protocol adaptability limits the range of assets that can be collected and requires extensive manual intervention to configure different protocols, which is complex and error-prone.

Conventional asset inventory tools adopt a serial acquisition mode, in which data acquisition is completed by calling protocols one by one, host by host. As the number of hosts in an enterprise IT environment increases, the efficiency problem of the serial mode becomes increasingly prominent: in a large-scale asset environment, serial acquisition consumes a great deal of time, so the acquired data lags behind the actual running state.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a method for automatically checking host system assets, which solves the problems of insufficient adaptability, poor real-time data acquisition, weak responsiveness to dynamic environments, and incomplete asset dependency modeling and visual interaction in existing host asset management technology.

To achieve this purpose, the invention adopts the following technical scheme: a method for automatically checking host system assets, comprising the following steps:

multi-protocol adaptation and dynamic selection: using a protocol selection engine to select and adapt protocols dynamically according to the type, environmental attributes and network characteristics of the host;

data acquisition and integration: acquiring multi-source data in real time through asynchronous concurrent acquisition tasks, and completing data cleaning, deduplication and standardization;

distributed storage and association analysis: storing the collected host asset data in a distributed database, and using the database to establish a dependency relationship model among assets;

interactive visualization: displaying the host asset state and its associated topological relations through a dynamic dashboard and a topology view, and providing real-time alarm and prediction functions.

Preferably, in the step of multi-protocol adaptation and dynamic selection, the protocol selection engine performs protocol selection based on the following rules:

priority-based protocol selection according to host attributes, comprising:

a. preferentially using the SNMP protocol for network equipment;

b. preferentially using the SSH protocol for Linux systems;

c. preferentially using the WMI protocol for Windows systems;

and a dynamic switching mechanism, which automatically switches to a standby protocol when the primary protocol fails.
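The priority rules and the dynamic switching mechanism can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `PRIORITY` table, the standby protocols listed after each primary protocol, and the collector functions are all hypothetical.

```python
from typing import Callable, Dict, List

# Primary protocol first, standby protocols after it. The primary choices follow
# rules a-c above; the standby entries are assumptions made for illustration.
PRIORITY: Dict[str, List[str]] = {
    "network_device": ["snmp", "ssh"],
    "linux": ["ssh", "snmp"],
    "windows": ["wmi", "snmp"],
}


def collect_with_fallback(host_type: str,
                          collectors: Dict[str, Callable[[], dict]]) -> dict:
    """Try each protocol in priority order and return the first successful result."""
    errors = {}
    for protocol in PRIORITY[host_type]:
        try:
            data = collectors[protocol]()
            return {"protocol": protocol, "data": data}
        except Exception as exc:  # e.g. network interruption or authorization failure
            errors[protocol] = str(exc)
    raise RuntimeError(f"all protocols failed: {errors}")


def failing_ssh() -> dict:
    raise ConnectionError("ssh unreachable")  # simulated primary-protocol failure


def working_snmp() -> dict:
    return {"sysName": "host-01"}  # simulated standby-protocol result


result = collect_with_fallback("linux", {"ssh": failing_ssh, "snmp": working_snmp})
print(result)
```

Keeping the order in a plain table means a new host type or standby protocol can be added without changing the switching logic.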

Preferably, the step of data acquisition and integration comprises the substeps of:

distributing and prioritizing the acquisition tasks using a distributed scheduling system (including Celery);

implementing multi-protocol concurrent acquisition based on asynchronous I/O technology (including Python's asyncio or the Go concurrency model);

The data integration adopts the following formula:

$$D_s = \{d_1, d_2, \ldots, d_n\}, \quad d_i = f(p_i, c_i)$$

where $D_s$ represents the normalized asset dataset, $d_i$ is a single asset record, $p_i$ is the data item collected via a protocol, and $c_i$ is the data cleansing rule.
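The integration formula can be read as applying a cleaning rule to each collected item. The sketch below is one possible reading; the two cleaning rules and the sample items are hypothetical and serve only to make the mapping from $p_i$ and $c_i$ to $d_i$ concrete.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
CleaningRule = Callable[[Record], Record]


def f(p_i: Record, c_i: CleaningRule) -> Record:
    """Produce one asset record d_i from a collected item p_i and its cleaning rule c_i."""
    return c_i(p_i)


def normalize_mac(item: Record) -> Record:
    # Hypothetical rule: lower-case the MAC address and use ':' as separator.
    cleaned = dict(item)
    cleaned["mac"] = str(item["mac"]).lower().replace("-", ":")
    return cleaned


def strip_hostname(item: Record) -> Record:
    # Hypothetical rule: remove stray whitespace around the host name.
    cleaned = dict(item)
    cleaned["hostname"] = str(item["hostname"]).strip()
    return cleaned


collected = [
    ({"mac": "AA-BB-CC-00-11-22", "hostname": "web01"}, normalize_mac),
    ({"mac": "aa:bb:cc:00:11:33", "hostname": "  db01 "}, strip_hostname),
]

# D_s = {d_1, ..., d_n} with d_i = f(p_i, c_i)
D_s: List[Record] = [f(p_i, c_i) for p_i, c_i in collected]
print(D_s)
```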

Preferably, the step of distributed storage and association analysis includes:

storing dynamic attribute data of the assets using Elasticsearch;

using a graph database (including Neo4j) to store the hosts and their dependency relationships, wherein the dependency relationship is modeled as:

$$G = (V, E), \quad V = \{v_1, v_2, \ldots, v_m\}, \quad E = \{e_{ij} \mid v_i \text{ depends on } v_j\}$$

where $G$ represents the host topology, $V$ is the set of host and service nodes, and $E$ is the set of dependency relationships between them.
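As a small illustration of the model, the graph can be held as a node set and a set of directed edges. The sketch uses plain Python sets rather than a graph database, and the sample nodes are assumptions chosen for readability.

```python
from typing import List, Set, Tuple

# V: host and service nodes; E: directed edges (v_i, v_j) meaning "v_i depends on v_j".
V: Set[str] = {"web-server", "db-server", "load-balancer"}
E: Set[Tuple[str, str]] = {
    ("web-server", "db-server"),
    ("web-server", "load-balancer"),
}


def dependencies_of(node: str) -> List[str]:
    """Nodes that `node` depends on."""
    return sorted(v_j for v_i, v_j in E if v_i == node)


def dependents_of(node: str) -> List[str]:
    """Nodes that depend on `node`."""
    return sorted(v_i for v_i, v_j in E if v_j == node)


print(dependencies_of("web-server"))
print(dependents_of("db-server"))
```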

Preferably, in the step of interactive visualization, the following functions are implemented through a dynamic dashboard:

real-time update of the asset state, including the online status of hosts and resource utilization (CPU, memory);

alarm information display, including abnormal hosts and operation faults;

trend prediction curves, based on time series analysis of host history data.

Preferably, the method further comprises an event-driven data update mechanism, in particular:

for the cloud environment, updating the states of virtual machines and storage devices by listening to Webhook events of the cloud vendor's API;

for the container environment, subscribing to Pod lifecycle events through the Kubernetes API to synchronize resource changes in real time.

Preferably, the intelligent analysis module includes:

an anomaly detection sub-module, which detects unauthorized access devices based on the Isolation Forest algorithm, specifically:

where $s(x)$ is the anomaly score of the host, $N$ is the number of random trees, and $h(x, t)$ is the path length of sample $x$ in the $t$-th tree;

and a capacity prediction sub-module, which predicts the utilization trend of host resources based on an LSTM (long short-term memory) network.
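For orientation, the anomaly score of the anomaly detection sub-module can be written in the commonly published form of Isolation Forest, using the symbols defined above. This is the standard formulation and is given here as an assumption; the sub-sample size $n$, the normalizing term $c(n)$ and the harmonic number $H(k)$ are symbols introduced for this illustration.

```latex
\bar{h}(x) = \frac{1}{N}\sum_{t=1}^{N} h(x,t), \qquad
s(x) = 2^{-\bar{h}(x)/c(n)}, \qquad
c(n) = 2H(n-1) - \frac{2(n-1)}{n}, \quad H(k) \approx \ln k + 0.5772
```

In this form, a score close to 1 corresponds to a short average path (a likely anomaly), while scores well below 0.5 correspond to normal samples.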

Preferably, the rule for cleaning repeated data in the data integration is as follows:

for data collected through multiple protocols, deduplication is performed based on unique identifiers (including MAC addresses and UUIDs);

for time series data, older records are overwritten by the latest record based on timestamps.
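The two cleaning rules can be sketched together as follows. This is a simplified illustration: the record fields (`uuid`, `mac`, `timestamp`, `cpu_pct`) are hypothetical, and the sketch assumes every record carries at least one of the two identifiers.

```python
from typing import Dict, List


def deduplicate(records: List[dict]) -> List[dict]:
    """Merge records that share a unique identifier; newer timestamps overwrite older values."""
    merged: Dict[str, dict] = {}
    # Processing in timestamp order lets the latest record win field by field.
    for rec in sorted(records, key=lambda r: r["timestamp"]):
        key = rec.get("uuid") or rec["mac"]  # prefer the UUID, fall back to the MAC address
        merged.setdefault(key, {}).update(rec)
    return list(merged.values())


records = [
    {"mac": "aa:bb:cc:00:11:22", "timestamp": 100, "source": "snmp", "cpu_pct": 35.0},
    {"mac": "aa:bb:cc:00:11:22", "timestamp": 160, "source": "ssh", "cpu_pct": 42.0, "os": "Linux"},
    {"mac": "aa:bb:cc:00:11:33", "timestamp": 120, "source": "wmi", "cpu_pct": 10.0},
]
assets = deduplicate(records)
print(assets)
```

Fields that only one protocol reports (here `os`) survive the merge, so the combined record is richer than any single source.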

Preferably, the distributed scheduling system used in the data acquisition step supports load balancing, and specifically includes:

Dynamically distributing acquisition tasks based on the current CPU and network load of the host;

asynchronous monitoring and feedback of task status is achieved using message queues (including RabbitMQ).

Preferably, the topology visualization view provided by the system supports the following interactive functions:

clicking the node to display the detailed attribute information of the host;

dragging nodes to rearrange the topology layout;

and multidimensional filtering, to screen and display hosts according to host state, service type or resource utilization.

The invention provides a method for automatically checking assets of a host system. The beneficial effects are as follows:

1. The invention adopts a technical scheme of multi-protocol adaptation and dynamic selection based on host attribute priority rules. By combining a protocol selection engine with a dynamic switching mechanism, it achieves efficient collection of multi-type host assets in heterogeneous environments, selects the optimal protocol for different host types (such as network equipment, Linux systems and Windows systems), and significantly improves the accuracy and efficiency of data collection. Compared with prior-art schemes that support only a single protocol or require manual protocol configuration, it solves the technical problems of insufficient adaptability, complex configuration and low data acquisition efficiency in complex environments, and is suited to dynamically changing IT environments.

2. The invention adopts a technical scheme combining distributed scheduling with asynchronous concurrency, and achieves efficient acquisition and real-time integration of multi-source asset data through dynamic task allocation and priority scheduling. This allows highly concurrent acquisition from a large number of hosts at the same time, significantly reduces acquisition delay and improves system throughput. Compared with the traditional serial acquisition mode of the prior art, which leads to low resource utilization and high acquisition delay, the method solves the problems of insufficient timeliness and efficiency in large-scale asset environments, ensures the consistency and usability of data through data cleaning and standardization, and improves the overall quality of asset management.

3. The invention adopts a technical scheme combining dynamic topology visualization with multidimensional interaction functions, and provides visual display and convenient management of host assets and their dependency relationships through interaction modes such as clicking nodes, free dragging, and screening and filtering. This allows target nodes in a complex asset network to be located, analyzed and operated on quickly, and significantly improves the user's understanding and management of host assets. Compared with prior-art schemes that provide only static views or limited interaction functions, the method solves the problems of poor display flexibility and insufficient user experience in large-scale complex topologies, and provides stronger support for asset fault investigation and optimization analysis.

Drawings

FIG. 1 is a flow chart of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to fig. 1, an embodiment of the present invention provides a method for automatically checking assets in a host system, including the following steps:

Multi-protocol adaptation and dynamic selection

In complex heterogeneous environments, different types of hosts and devices may support different communication protocols. The method realizes the comprehensive support of host assets through the multi-protocol adaptation and dynamic selection module, and mainly comprises the following steps:

Protocol support range:

The optimal protocol is selected to collect data according to the attributes and environmental characteristics of the host. The supported protocols include:

SNMP: mainly used for data acquisition from network equipment such as switches and routers, to obtain the hardware state and operating metrics of the equipment;

SSH: suitable for Linux systems, obtaining hardware and software information and running state through remote command execution;

WMI: suitable for Windows systems, used to access service status, system logs, hardware information and the like;

cloud platform APIs: used to dynamically collect the state of virtual machines, storage and other cloud resources through the APIs provided by cloud vendors (such as the AWS, Azure and Google Cloud APIs);

container protocols, such as the Kubernetes API: used to collect Pod status, resource usage and lifecycle data in containerized environments;

other modern protocols, such as Redfish: mainly used to collect the state of the underlying server hardware, including the BIOS, hard disk health and fan running status.

Protocol selection engine:

A dynamic protocol selection engine is constructed, which automatically selects a suitable protocol according to the following rules:

Environmental attributes: for example, the operating system type (Linux uses SSH and Windows uses WMI).

Equipment type: network equipment preferentially uses SNMP, and cloud resources call the cloud vendor's API directly.

Network characteristics: for example, low-overhead protocols are preferred when bandwidth is limited.

Dynamic switching mechanism:

When the primary protocol cannot be used because of network interruption, authorization failure or similar causes, the system automatically switches to a standby protocol, ensuring the continuity and stability of data acquisition.

Plug-in extension support:

Each protocol is implemented independently as a plug-in and accesses the system through a unified interface, which supports hot plugging and dynamic extension of protocols and keeps the system flexible.
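One way to picture the plug-in mechanism is a registry keyed by protocol name, where every plug-in implements the same interface. The class and function names below (`ProtocolPlugin`, `register`, `unregister`) are hypothetical, and the `collect` bodies are placeholders rather than real protocol calls.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class ProtocolPlugin(ABC):
    """Unified interface implemented by every protocol plug-in (hypothetical)."""

    name: str = ""

    @abstractmethod
    def collect(self, host: str) -> dict:
        """Collect asset data from one host."""


REGISTRY: Dict[str, Type[ProtocolPlugin]] = {}


def register(cls: Type[ProtocolPlugin]) -> Type[ProtocolPlugin]:
    """Class decorator that makes a plug-in available under its protocol name."""
    REGISTRY[cls.name] = cls
    return cls


def unregister(name: str) -> None:
    """Remove a plug-in at runtime without touching the others."""
    REGISTRY.pop(name, None)


@register
class SnmpPlugin(ProtocolPlugin):
    name = "snmp"

    def collect(self, host: str) -> dict:
        return {"host": host, "protocol": self.name}  # placeholder for a real SNMP query


@register
class RedfishPlugin(ProtocolPlugin):
    name = "redfish"

    def collect(self, host: str) -> dict:
        return {"host": host, "protocol": self.name}  # placeholder for a real Redfish call


sample = REGISTRY["redfish"]().collect("bmc-01")
unregister("snmp")  # "hot unplug" of one protocol
print(sample, sorted(REGISTRY))
```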

Data acquisition and integration

Host asset data is acquired in real time through distributed scheduling and an asynchronous acquisition mechanism, and is then cleaned, integrated and standardized. The specific steps are as follows:

Distributed task scheduling:

Acquisition tasks are distributed to multiple nodes using a distributed task scheduling system (e.g., Celery or Apache Airflow). The task scheduling strategy comprises:

Dynamic allocation: task allocation is adjusted dynamically according to host resource utilization and network load, to avoid overloading a single node;

Priority policy: dynamic resources (e.g., container state) are collected preferentially, and static resources (e.g., hardware configuration) are collected at low frequency.
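The priority policy can be illustrated with a small in-process priority queue. This sketch uses Python's `heapq` in place of a distributed scheduler, and the two priority classes and the task names are assumptions made for the example.

```python
import heapq
import itertools

PRIORITY = {"dynamic": 0, "static": 1}  # lower value is collected first
_counter = itertools.count()            # tie-breaker keeps FIFO order within a class


def push(queue: list, kind: str, task: str) -> None:
    heapq.heappush(queue, (PRIORITY[kind], next(_counter), task))


def pop(queue: list) -> str:
    return heapq.heappop(queue)[2]


queue: list = []
push(queue, "static", "hardware-config:host-01")
push(queue, "dynamic", "container-state:pod-a")
push(queue, "dynamic", "container-state:pod-b")

order = [pop(queue) for _ in range(len(queue))]
print(order)
```

Although the static task was submitted first, both dynamic tasks are dequeued ahead of it.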

Asynchronous concurrent acquisition:

Asynchronous I/O technology (such as Python's asyncio) is used to issue calls over multiple protocols at the same time, improving acquisition efficiency. For the same host, hardware information (SSH), service status (WMI) and operating metrics (SNMP) can be collected concurrently.
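A minimal sketch of concurrent collection with `asyncio.gather` is shown below. The `collect` coroutine is a stand-in that only sleeps to simulate network latency; a real collector would call the corresponding protocol library instead, and the host address is an arbitrary example.

```python
import asyncio
from typing import List


async def collect(protocol: str, host: str, delay: float) -> dict:
    # Stand-in for a real protocol call; asyncio.sleep simulates network latency.
    await asyncio.sleep(delay)
    return {"host": host, "protocol": protocol}


async def collect_host(host: str) -> List[dict]:
    tasks = [
        collect("ssh", host, 0.03),   # hardware information
        collect("wmi", host, 0.02),   # service status
        collect("snmp", host, 0.01),  # operating metrics
    ]
    # return_exceptions=True keeps one failed protocol from discarding the other results.
    return await asyncio.gather(*tasks, return_exceptions=True)


results = asyncio.run(collect_host("10.0.0.5"))
print(results)
```

`asyncio.gather` returns results in the order the coroutines were passed in, regardless of which call finishes first, so the total wait is roughly the slowest call rather than the sum of all three.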

Data cleaning and de-duplication:

The collected data may contain duplicates or conflicts, so duplicate items are merged based on unique identifiers (such as the MAC address and UUID);

Field values are standardized, for example by expressing hard disk capacity uniformly in GB and memory utilization uniformly as a percentage.
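The field standardization can be sketched with two small helper functions. The conversion table assumes binary multiples (1 TB = 1024 GB), and the function names are hypothetical.

```python
# Conversion factors to GB; binary multiples are an assumption of this sketch.
UNIT_TO_GB = {"MB": 1 / 1024, "GB": 1.0, "TB": 1024.0}


def disk_to_gb(value: float, unit: str) -> float:
    """Express a disk capacity in GB regardless of the unit reported by the protocol."""
    return value * UNIT_TO_GB[unit.upper()]


def memory_utilization_pct(used: float, total: float) -> float:
    """Express memory utilization as a percentage of total memory."""
    return round(100.0 * used / total, 2)


print(disk_to_gb(2, "TB"), disk_to_gb(512, "mb"), memory_utilization_pct(6, 16))
```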

Data standardization integration:

A unified asset data model is designed to cover static information (such as name, IP address, hardware configuration) and dynamic information (such as resource utilization, service status) of the host. And after data standardization, storing the data into a database for subsequent analysis.

Real-time integration:

A data pipeline is built on a stream processing framework (such as Apache Kafka), so that the collected multi-source data can be delivered to the integration module for processing in real time.

Distributed storage and association analysis

After the acquisition is completed, the data are stored in a distributed database, and a relationship model among the assets is established through association analysis, so that a foundation is provided for subsequent asset visualization and intelligent analysis. The method comprises the following specific steps:

Distributed storage:

Real-time asset data is stored in Elasticsearch, which supports efficient search and query;

Historical asset data is stored in HDFS or another big data storage system, supporting trend analysis and archive management.

Asset relationship modeling:

A dependency relationship model is built with a graph database (such as Neo4j), covering the relationships between hosts and services and between hosts and network equipment, as well as the calling relationships between services;

Asset relationships are expressed as nodes and edges, where nodes represent asset entities such as hosts and services, and edges represent dependency or calling relationships.

Dynamic association analysis:

A topology graph of the hosts is generated by analyzing the asset dependency relationships in real time;

The cascade effects that a host fault may cause are analyzed, and the potential risk of service interruption is predicted.
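Cascade analysis amounts to finding every node that depends, directly or transitively, on a failed node. The sketch below does this with a breadth-first search over a plain dictionary; the sample topology and the function name `impacted_by` are assumptions for illustration.

```python
from collections import deque
from typing import Dict, List, Set

# depends_on[a] lists the nodes that a depends on (hypothetical sample topology).
depends_on: Dict[str, List[str]] = {
    "web01": ["db01", "lb01"],
    "report-svc": ["web01"],
    "db01": [],
    "lb01": [],
}


def impacted_by(failed: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Return every node that directly or transitively depends on the failed node."""
    # Invert the edges so that dependents can be looked up by their target.
    dependents: Dict[str, List[str]] = {}
    for node, targets in graph.items():
        for target in targets:
            dependents.setdefault(target, []).append(node)
    seen: Set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for node in dependents.get(current, []):
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return seen


print(sorted(impacted_by("db01", depends_on)))
```

In this sample, a failure of `db01` reaches `web01` directly and `report-svc` through `web01`.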

Intelligent analysis support:

Anomaly detection: unauthorized equipment or hosts in an abnormal state are identified based on historical data;

Trend prediction: future resource demands are predicted with time series analysis, to assist capacity planning.

Interactive visualization

The asset inventory results are presented to the user intuitively through an interactive dynamic dashboard and a topology view, together with real-time alarm and trend prediction functions. The specific steps are as follows:

Dynamic dashboard:

The number of hosts, their online status and their resource utilization (such as CPU, memory and network bandwidth) are displayed in real time;

Alarm information is provided, including unauthorized equipment access, high-load hosts and other anomalies.

Topology view:

The dependency relationships between hosts and services and between hosts and network equipment are displayed in a dynamic topology graph;

Interactive functions are supported, such as clicking a node to display detailed information, dragging nodes to adjust the layout, and filtering to display a subset of assets by condition.

Real-time alarm system:

Real-time alarms are triggered based on the anomaly detection results of a rule engine or a machine learning model;

The alarm types include host faults, resource bottlenecks, configuration anomalies and the like, and the user is notified so that they can be handled in time.
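The rule-engine path of the alarm system can be sketched as a list of named predicates evaluated against each host record. The rule names mirror two of the alarm types above; the 90% CPU threshold and the record fields are illustrative assumptions, not values from the source.

```python
from typing import Callable, Dict, List, Tuple

Rule = Tuple[str, Callable[[Dict], bool]]

# Each rule pairs an alarm type with a predicate over one host record.
RULES: List[Rule] = [
    ("host fault", lambda h: not h["online"]),
    ("resource bottleneck", lambda h: h["online"] and h["cpu_pct"] > 90.0),
]


def evaluate(host: Dict) -> List[str]:
    """Return the alarm types triggered by one host record."""
    return [name for name, predicate in RULES if predicate(host)]


print(evaluate({"online": True, "cpu_pct": 97.0}))
print(evaluate({"online": False, "cpu_pct": 0.0}))
```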

Trend prediction function:

A prediction curve of future resource demand is generated from the historical data of the host, assisting the user in capacity planning and resource optimization.
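As a deliberately simple stand-in for the time series model, the sketch below fits a straight line to the history by ordinary least squares and extrapolates it. It is not the LSTM-based predictor named elsewhere in this document; it only shows how a prediction curve is derived from historical samples, and it assumes at least two equally spaced samples.

```python
from typing import List


def linear_forecast(history: List[float], steps: int) -> List[float]:
    """Fit y = a + b*t by ordinary least squares and extrapolate `steps` points ahead."""
    n = len(history)
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope_num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(history))
    slope_den = sum((t - t_mean) ** 2 for t in range(n))
    b = slope_num / slope_den
    a = y_mean - b * t_mean
    return [a + b * (n + k) for k in range(steps)]


cpu_history = [40.0, 42.0, 44.0, 46.0, 48.0]  # hypothetical CPU utilization samples (%)
forecast = linear_forecast(cpu_history, 3)
print(forecast)
```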

In the step of multi-protocol adaptation and dynamic selection, the protocol selection engine performs protocol selection based on the following rules:

priority-based protocol selection according to host attributes, comprising:

a. preferentially using the SNMP protocol for network equipment;

b. preferentially using the SSH protocol for Linux systems;

c. preferentially using the WMI protocol for Windows systems;

The protocol selection engine preferentially selects the communication protocol suited to the current host according to the host type and running environment, so as to improve acquisition efficiency and accuracy. For network equipment (such as routers and switches), the engine preferentially selects SNMP. SNMP is a lightweight protocol: by accessing the device's Management Information Base (MIB), hardware information, interface state and traffic statistics can be acquired quickly with little bandwidth overhead, which suits the resource-constrained nature of network equipment. For hosts running a Linux system, the engine preferentially selects SSH, which offers high security and strong flexibility; the hardware configuration, system state and dynamic performance data of the host can be acquired through remote command execution. For hosts running a Windows system, the engine preferentially selects WMI, which can acquire detailed data such as hardware information, service state and log records through the Windows management interface and supports deep system-level management.

In addition, the protocol selection engine has a dynamic switching mechanism. When the primary protocol cannot work normally because of network interruption, permission problems, device limitations or similar causes, the system automatically switches to a standby protocol to ensure the continuity of data acquisition. For example, when SSH fails, the system can switch to an API-based acquisition mode; when SNMP is unavailable, it attempts to acquire the necessary information by other means. Combining the priority rules based on host attributes with the dynamic switching mechanism improves the flexibility and efficiency of protocol adaptation and enhances the stability and reliability of the system in complex environments.

The step of data acquisition and integration comprises the following substeps:

The acquisition tasks are assigned and prioritized using a distributed scheduling system (including Celery);

The data integration adopts the following formula:

$$D_s = \{d_1, d_2, \ldots, d_n\}, \quad d_i = f(p_i, c_i)$$

Specifically, the system first uses a distributed scheduling framework (such as Celery) to allocate acquisition tasks and schedule them by priority. Acquisition tasks are dynamically allocated to multiple task execution nodes according to the real-time state of the host, the task type and the acquisition requirements. The priority scheduling strategy ensures that critical tasks (such as dynamic resource state updates) are executed first, while static tasks (such as hardware information acquisition) are placed in a low-priority queue according to the resource load, which achieves reasonable use of resources and load balancing of acquisition tasks. Secondly, the system uses asynchronous I/O technology (such as Python's asyncio or Go's concurrency model) to call multiple protocols concurrently, parallelizing the protocol acquisition tasks through asynchronous programming. When collecting from a single host, the system can simultaneously call SNMP to acquire network state, SSH to acquire system information and the Kubernetes API to acquire container state, which improves acquisition efficiency and reduces delay.

Finally, the collected multi-source data is cleaned, integrated and stored in a standardized data model, which ensures the format consistency and usability of the data. During data integration, the system deduplicates repeated information: data from different protocols is merged by matching unique identifiers (such as MAC addresses or UUIDs), and field values are normalized according to the data cleaning rules, for example by expressing memory size uniformly in GB and CPU utilization uniformly as a percentage. The integrated result contains the static attributes (such as name, IP address and hardware configuration) and the dynamic state (such as resource utilization and active services) of each host and is provided to the subsequent analysis and storage modules. In this way, the system achieves highly concurrent, multi-protocol data acquisition and integration and meets the requirements of timeliness, accuracy and scalability in a complex IT environment.

The step of distributed storage and association analysis comprises:

storing dynamic attribute data of the assets using Elasticsearch;

$$G = (V, E), \quad V = \{v_1, v_2, \ldots, v_m\}, \quad E = \{e_{ij} \mid v_i \text{ depends on } v_j\}$$

Specifically, the system uses Elasticsearch as a distributed storage engine for the dynamic attribute data of host assets, including the real-time state of each host (such as CPU utilization and memory occupancy), the list of running services and network traffic statistics. This storage mode supports efficient full-text retrieval and real-time query, responds quickly to user queries for specific hosts or attributes, and scales to large volumes of asset data through its distributed architecture. Secondly, to describe the relationships between hosts fully and to analyze complex dependency structures, the system uses a graph database (such as Neo4j) to store the dependencies between hosts and services. Each host or service is represented in the graph database as a node, and each dependency is represented as an edge between nodes. In this way, the association between hosts and services can be described intuitively, for example how a service run by one host depends on databases or network devices on other hosts.

To manage and analyze these relationships efficiently, the system builds a topological model that gathers the dependency relationships of all hosts and services into one network graph, in which nodes represent hosts or services and edges represent dependencies. For example, a Web server may depend on a database server and a load balancer, and this dependency can be queried directly through the graph database. Storing the dependency relationships supports not only visual display but also complex analysis functions, such as identifying key nodes (for example, a database server that many other nodes depend on) and analyzing fault propagation paths (for example, the downstream services that may be affected by the failure of a given host). By combining distributed storage with a graph database, the method achieves dynamic state management and comprehensive association analysis of host assets, and provides technical support for asset management, fault diagnosis and dependency analysis.

In the step of interactive visualization, the following functions are realized through a dynamic dashboard:

Specifically, the dynamic dashboard integrates a real-time data update module and displays key host state information dynamically through visualization components, including core metrics such as host online status, CPU utilization and memory occupancy. Through a real-time connection to the distributed database, the dashboard refreshes whenever the state of a host changes, so that the user always sees the latest asset information. Secondly, the system provides an alarm information display module that captures and displays real-time alarms for abnormal hosts and operation faults. When a host becomes abnormal (for example, resource limits exceeded, host offline or service unavailable), the system triggers an alarm; the details of the affected host and the abnormal metrics are shown in the alarm area of the dashboard and highlighted with colors, markers and the like, so that the user can locate the problem quickly. In addition, the dashboard provides a trend prediction function: based on historical data, it generates a trend prediction curve of host resource usage through time series analysis and shows how CPU, memory and storage utilization are expected to change over a future period, helping the user to find potential resource bottlenecks in advance and to plan capacity. The user can also customize the analysis views through the interactive functions of the dashboard, such as selecting a time range or filtering by host category, which further improves flexibility and efficiency. Through tight integration with the storage and analysis modules, the visualization system forms an interactive platform that combines state monitoring, alarm handling and predictive analysis, and supports refined and intelligent asset management.

The method further comprises an event-driven data update mechanism, in particular:

The intelligent analysis module comprises:

The rule for cleaning repeated data in the data integration is as follows:

Specifically, in the cloud environment, the system captures state changes of cloud resources such as virtual machines and storage devices in real time by listening to the API Webhook events provided by cloud vendors. When a user creates, destroys or modifies a virtual machine instance on the cloud platform, the Webhook pushes a corresponding event notification to the system. After parsing the event data, the system immediately updates the relevant records in the asset database, including the state of the instance (such as started, stopped or deleted), configuration changes (such as increased storage capacity) and network attributes (such as IP address and bandwidth adjustments). Through this event-driven mechanism, the system responds quickly to dynamic resource changes and avoids the update delay of the traditional polling mode. Secondly, for a containerized environment, the system captures lifecycle changes of containers (Pods) by subscribing to the Kubernetes API event stream. When a Pod is destroyed, the system clears the corresponding asset record; through this real-time update mechanism, the system accurately reflects transient resource changes in the containerized environment. In addition, the system provides an event filtering and priority mechanism, which ensures that only key events related to asset management are processed and prevents irrelevant events from affecting performance. This improves the adaptability of the system to dynamic environments, ensures that the state of cloud and container resources is reflected in the asset management system in real time, and provides high-quality data for subsequent monitoring, analysis and optimization.
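The event-driven update of container assets can be sketched as a handler applied to a stream of Pod events. The event types `ADDED`, `MODIFIED` and `DELETED` are those used by the Kubernetes watch API; the event payload is heavily simplified here, and the events are simulated in a list rather than read from a cluster.

```python
from typing import Dict, List

assets: Dict[str, dict] = {}  # in-memory stand-in for the asset database


def handle_pod_event(event: dict) -> None:
    """Apply one Pod lifecycle event to the asset records."""
    name = event["object"]["name"]
    if event["type"] in ("ADDED", "MODIFIED"):
        assets[name] = {"kind": "Pod", "phase": event["object"]["phase"]}
    elif event["type"] == "DELETED":
        assets.pop(name, None)  # clear the record when the Pod is destroyed


# Simulated event stream with a simplified payload (assumption for this sketch).
events: List[dict] = [
    {"type": "ADDED", "object": {"name": "pod-a", "phase": "Pending"}},
    {"type": "ADDED", "object": {"name": "pod-b", "phase": "Pending"}},
    {"type": "MODIFIED", "object": {"name": "pod-b", "phase": "Running"}},
    {"type": "DELETED", "object": {"name": "pod-a", "phase": "Running"}},
]
for event in events:
    handle_pod_event(event)
print(assets)
```

Because each event is applied as it arrives, the asset records never wait for a polling interval to catch up with the cluster state.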

The distributed scheduling system used in the data acquisition step supports load balancing, and specifically comprises the following steps:

Specifically, the system dynamically adjusts the allocation strategy for collection tasks by monitoring the CPU and network load of the task execution nodes in real time. When the system receives a batch of collection tasks to be executed, the scheduling system first evaluates the current state of each task execution node, including core indicators such as CPU utilization, available memory, and network bandwidth utilization. Based on the evaluation results, the system preferentially allocates high-load tasks to relatively idle nodes, so that tasks are not delayed or interrupted by resource bottlenecks on particular nodes. At the same time, the system updates the node state data in real time, so that task allocation remains well matched to a dynamically changing operating environment. To further improve the reliability and efficiency of task execution, the system incorporates a message queue mechanism (for example, RabbitMQ) for asynchronous monitoring and feedback of tasks. During task distribution, the scheduling system pushes collection tasks to the message queue, and each task execution node subscribes to the queue and pulls tasks suited to its own load. After a task is completed, the node feeds the execution result and state information back to the message queue, and the scheduling system processes and updates them centrally. The message queue makes task distribution asynchronous and also provides reliable tracking of task state: even if some tasks cannot be completed because of a node fault, the scheduling system can reschedule them through the queue, so that they are eventually completed. In addition, the system provides a task priority mechanism: high-priority collection tasks (such as dynamic resource status updates) are pushed to idle nodes for execution first, while low-priority tasks are scheduled when resources are sufficient. Through this distributed scheduling and load balancing strategy, the system achieves high performance and high reliability in large-scale data collection scenarios and meets complex and changing IT asset management requirements.
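By way of non-limiting illustration only, the combination of task priority and load-aware allocation might be sketched as follows. The `Node` record, the equal weighting of CPU and network load, and the per-task load estimate `cost_pct` are assumptions introduced for this example. An in-process heap stands in for the message queue, so no RabbitMQ or Celery API is shown.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_pct: float
    net_pct: float

    def load(self) -> float:
        # Hypothetical score: equal weight on CPU and network utilization.
        return 0.5 * self.cpu_pct + 0.5 * self.net_pct


def assign(tasks, nodes, cost_pct: float = 10.0):
    """Assign (priority, task_name) pairs to the least-loaded node.

    A lower priority number means the task is scheduled earlier.
    """
    # The sequence number keeps arrival order among tasks of equal priority.
    queue = [(priority, seq, name) for seq, (priority, name) in enumerate(tasks)]
    heapq.heapify(queue)
    plan = []
    while queue:
        _, _, task = heapq.heappop(queue)
        node = min(nodes, key=Node.load)
        plan.append((task, node.name))
        # Assumed estimate of the CPU load the task adds to the chosen node.
        node.cpu_pct += cost_pct
    return plan
```

Updating the load of the chosen node after each assignment keeps a single idle node from receiving the whole batch. In a deployed system, the measured node state would replace this estimate.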

The topology visualization view provided by the system supports the following interactive functions:

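clicking a node to display the detailed attribute information of the host;

dragging nodes to rearrange the topology layout;

multi-dimensional filtering: filtering and displaying by host status, service type or resource utilization.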

Specifically, each node in the topology view represents a host or a service. By clicking a node, the user obtains its detailed attribute information, including the host name, IP address, operating system type, hardware configuration (such as CPU, memory, and storage), current resource utilization (such as CPU load and memory usage), and the list of running services. This function is connected to the database of the system in real time, so that the user views up-to-date data, and it supports expand and collapse operations so that the user can focus on a specific level of information. The topology view also supports free dragging of nodes: the user can drag nodes with the mouse to rearrange their positions and adjust the layout as needed, for example to group logically related nodes manually or to make certain key nodes stand out. The user may also choose to display only hosts whose resource utilization exceeds a threshold, in order to quickly identify nodes that may have performance bottlenecks. These interactive functions rely on modern front-end technologies (for example, D3.js or Cytoscape.js) and on real-time communication with the back-end data interfaces, so that user operations are reflected in the view immediately. Through these functions, the topology visualization view of the system provides a clear display of the asset structure and improves the efficiency and experience of the user in managing and analyzing host assets.
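By way of non-limiting illustration only, filtering the displayed nodes by host status, service type, or resource utilization might be sketched on the back-end side as follows. The `TopologyNode` record and the parameter names of `filter_nodes` are assumptions introduced for this example. The front-end rendering with D3.js or Cytoscape.js is not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TopologyNode:
    name: str
    status: str
    services: tuple
    cpu_pct: float


def filter_nodes(
    nodes,
    status: Optional[str] = None,
    service: Optional[str] = None,
    min_cpu_pct: Optional[float] = None,
):
    """Keep the nodes that satisfy every criterion that was supplied."""
    selected = []
    for node in nodes:
        if status is not None and node.status != status:
            continue
        if service is not None and service not in node.services:
            continue
        # "Exceeds a threshold" is read as strictly greater than the value.
        if min_cpu_pct is not None and node.cpu_pct <= min_cpu_pct:
            continue
        selected.append(node)
    return selected
```

Leaving a criterion as `None` disables that dimension, so the same function serves a single-dimension filter (for example, only `min_cpu_pct`) and a combined one.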

Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations may be made therein without departing from the spirit and scope of the invention as defined by the appended claims and their equivalents.

Claims

1. A method for automatically taking inventory of host system assets, characterized in that it comprises the following steps:

Multi-protocol adaptation and dynamic selection: Based on the host type, environmental attributes and network characteristics, the protocol selection engine is used to achieve dynamic protocol selection and adaptation;

Data collection and integration: Through asynchronous concurrent collection tasks, multi-source data is collected in real time to complete data cleaning, deduplication and standardization;

Distributed storage and association analysis: Store the collected host asset data in a distributed database, and use the database to establish a dependency model between assets;

Interactive visualization: Displays the host asset status and its associated topological relationships through dynamic dashboards and topology views, and provides real-time alarm and prediction functions.

2. The method for automatic inventory of host system assets according to claim 1, characterized in that in the multi-protocol adaptation and dynamic selection step, the protocol selection engine performs protocol selection based on the following rules:

The protocol is selected based on the host properties, including:

a. Use SNMP protocol first for network devices;

b. Use SSH protocol first for Linux system;

c. Use WMI protocol first for Windows system;

Dynamic switching mechanism: When the main protocol fails, it automatically switches to the backup protocol.

3. The method for automatic inventory of host system assets according to claim 1, wherein the step of data collection and integration comprises the following sub-steps:

Use distributed scheduling systems (including Celery) to allocate and prioritize collection tasks;

Implement multi-protocol concurrent acquisition based on asynchronous I/O technology (including Python's asyncio or Go concurrency model);

The data integration uses the following formula:

$$D_s = \{d_1, d_2, \ldots, d_n\}, \quad d_i = f(p_i, c_i)$$

Among them, $D_s$ represents the standardized asset data set, $d_i$ is a single asset record, $p_i$ is the data item collected by the protocol, and $c_i$ is the data cleaning rule.

4. The method for automatic inventory of host system assets according to claim 1, characterized in that the distributed storage and association analysis step includes:

Use Elasticsearch to store dynamic property data of assets;

Use graph databases (including Neo4j) to store hosts and their dependencies, where the dependency modeling formula is:

$$G = (V, E), \quad V = \{v_1, v_2, \ldots, v_m\}, \quad E = \{e_{ij} \mid v_i \text{ depends on } v_j\}$$

Among them, G represents the host topology graph, V is the set of hosts and service nodes, and E is the dependency relationship between hosts.

5. The method for automatic inventory of host system assets according to claim 1, characterized in that in the interactive visualization step, the following functions are realized through a dynamic dashboard:

Real-time updates of asset status, including host online status and resource utilization (CPU, memory);

Display of alarm information, including abnormal hosts and operational failures;

Trend prediction curve, based on the time series analysis results of the host's historical data.

6. The method for automatic inventory of host system assets according to claim 1, characterized in that the method further comprises an event-driven data update mechanism, specifically:

For cloud environments, update the status of virtual machines and storage devices by monitoring Webhook events of cloud vendor APIs;

For container environments, subscribe to Pod lifecycle events through the Kubernetes API to achieve real-time resource change synchronization.

7. The method for automatic inventory of host system assets according to claim 1, wherein the intelligent analysis module comprises:

The anomaly detection submodule detects unauthorized access devices based on the Isolation Forest algorithm. Specifically:

Where S(x) is the anomaly score of the host, N is the number of random trees, and h(x,t) is the path length of sample x in the t-th tree;

The capacity prediction submodule predicts the host resource utilization trend based on LSTM (Long Short-Term Memory Network).

8. The method for automatic inventory of host system assets according to claim 1, characterized in that the rule for cleaning duplicate data in the data integration is:

For data collected by multiple protocols, duplicates are removed based on unique identifiers (including MAC addresses and UUIDs);

For time series data, earlier records are overwritten by the latest record based on the timestamp.

9. The method for automatic inventory of host system assets according to claim 1, characterized in that the distributed scheduling system used in the data collection step supports load balancing, specifically comprising:

Dynamically allocate collection tasks based on the host's current CPU and network load;

Use message queues (including RabbitMQ) to implement asynchronous monitoring and feedback of task status.

10. The method for automatic inventory of host system assets according to claim 1, wherein the topology visualization view provided by the system supports the following interactive functions:

Click a node to display detailed attribute information of the host;

Drag nodes to rearrange the topology layout;

Multi-dimensional filtering: filter and display by host status, service type or resource utilization.