CN115996334A - A network fault early warning method, device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN115996334A
Authority
CN
China
Prior art keywords
user
network
early warning
device identifier
onu
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111210421.3A
Other languages
Chinese (zh)
Inventor
王俊峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ultrapower Software Co ltd
Original Assignee
Ultrapower Software Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ultrapower Software Co ltd
Priority to CN202111210421.3A
Publication of CN115996334A
Pending legal-status Critical Current

Landscapes

  • Small-Scale Networks (AREA)

Abstract

The application provides a network fault early warning method, device, electronic equipment and storage medium. The method comprises: receiving an interruption alarm message sent by a network device, wherein the interruption alarm message comprises a device identifier of the network device; querying all user states corresponding to the device identifier; judging whether all user states corresponding to the device identifier are offline; and if so, sending early warning information to all users corresponding to the device identifier. The network device is determined to be truly faulty only when the states of all users it carries are offline, and only then is early warning information sent to all users corresponding to the device identifier. This reduces the probability of false warnings caused by sending early warning information while some users can still access the Internet through the network device, and thereby improves the accuracy of sending early warning messages to the corresponding users based on interruption alarm messages.

Description

A network fault early warning method, device, electronic equipment and storage medium

Technical field

The present application relates to the technical fields of network communication, fault early warning and big data, and in particular to a network fault early warning method, device, electronic equipment and storage medium.

Background

At present, when network operators provide services through network devices, faults often occur in the network. For example, device faults are discovered in real time by associating network element alarms with network element information, including network-element-level faults such as an optical line terminal (Optical Line Terminal, OLT) going off the network, a passive optical network (Passive Optical Network, PON) interruption, or an optical network unit (Optical Network Unit, ONU) going offline. These faults are stored in historical record data; after the system finds a fault in these records, it actively pushes a fault early warning message to the corresponding users. In practice, however, network device faults are sometimes misjudged, and frequent false warnings cause considerable trouble and wasted resources for users and operators.

Summary of the invention

The purpose of the embodiments of the present application is to provide a network fault early warning method, device, electronic equipment and storage medium, so as to address the low accuracy of sending early warning messages to the corresponding users based on interruption alarm messages.

An embodiment of the present application provides a network fault early warning method, including: receiving an interruption alarm message sent by a network device, where the interruption alarm message includes a device identifier of the network device; querying all users carried by the network device according to the device identifier, and determining all user states corresponding to the device identifier; judging whether all user states corresponding to the device identifier are offline; and if so, sending early warning information to all users corresponding to the device identifier. In this implementation, the network device is determined to be truly faulty only when the states of all users it carries are offline, and only then is early warning information sent to all users corresponding to the device identifier. This reduces the probability of false warnings caused by sending early warning information while some users can still access the Internet through the network device, and thereby improves the accuracy of sending early warning messages to the corresponding users based on interruption alarm messages.
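The decision rule in the paragraph above can be sketched in a few lines of Python. This is a minimal illustration rather than the applicant's implementation: the status string `"offline"`, the function names, and the choice to treat a device with no known users as not faulty are all assumptions made for the example.

```python
from typing import Dict, Iterable, List

OFFLINE = "offline"  # hypothetical status value; the source does not define an encoding


def device_truly_faulty(user_states: Iterable[str]) -> bool:
    """Return True only if every user carried by the device is offline."""
    states = list(user_states)
    # Assumption: a device with no known users gives no evidence of a fault.
    return bool(states) and all(state == OFFLINE for state in states)


def users_to_warn(user_states: Dict[str, str]) -> List[str]:
    """Map of user id -> state in, sorted list of user ids to warn out."""
    if device_truly_faulty(user_states.values()):
        return sorted(user_states)
    # At least one user is still online, so the alarm is treated as a false positive.
    return []
```

A single online user is enough to suppress the warning for the whole device, which is the property the method relies on to cut false warnings.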

Optionally, in an embodiment of the present application, querying all users carried by the network device according to the device identifier and determining all user states corresponding to the device identifier includes: querying, in a key-value store database, all user identifiers carried by the network device, where the key-value store database stores the association between device identifiers and user identifiers; and querying the user state corresponding to each of the user identifiers to obtain all user states. In this implementation, all user identifiers corresponding to the device identifier, and the user state corresponding to each of them, are queried in the key-value store database. Using the key-value store database as an intermediate storage component for the associated data effectively improves query speed. Furthermore, an expiration time can be set for the data when it is stored, which ensures that expired data does not affect the accuracy of the network fault early warning system.
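To illustrate the expiration behaviour mentioned above, here is a small in-memory stand-in for a key-value store with a per-entry time to live. It is a sketch only: the class `ExpiringStore`, its method names, and the key format are hypothetical, and a real deployment would rely on the expiry mechanism of the chosen key-value database instead.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ExpiringStore:
    """Toy key-value store in which each entry carries its own expiry time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._data: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        # The expiry time is fixed at write time, as described in the text.
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Stale data is dropped so it cannot influence the fault decision.
            del self._data[key]
            return None
        return value
```

With this design a stale user state simply reads as missing, and the caller decides how to treat a missing state.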

Optionally, in an embodiment of the present application, before querying the key-value store database for all user identifiers carried by the network device, the method further includes: building a distributed streaming dataflow engine server cluster; and deploying the key-value store database on that cluster in a distributed manner. In this implementation, deploying the key-value store database in a distributed manner on the distributed streaming dataflow engine server cluster avoids the network problems and program bottlenecks that arise when the cluster communicates with a single server over the network, thereby effectively improving query speed.

Optionally, in an embodiment of the present application, sending early warning information to all users corresponding to the device identifier includes: for each user among all users corresponding to the device identifier, obtaining the actual impact result of that user according to the user's user information; and generating early warning information for each user, then determining, according to each user's actual impact result, whether to send the early warning information to all users corresponding to the device identifier.

Optionally, in an embodiment of the present application, the user information includes: a user offline reason field, a power-off reason field, and an abnormal offline reason field. Obtaining the actual impact result of each user among all users corresponding to the device identifier includes: judging whether the user information of each user satisfies a first preset condition, where the first preset condition includes: the user offline reason field indicates an abnormal offline event, the power-off reason field indicates a non-power-off reason, the abnormal offline reason field has not been backfilled, and, in addition, whether the duration between the time the interruption alarm message was received and the user's last offline time exceeds a preset duration; if so, confirming that the actual impact result is that the user is actually affected; otherwise, confirming that the actual impact result is that the user is not actually affected.
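The first preset condition can be written as a single predicate. The sketch below is one reading of the translated text, not a definitive implementation: the record type `UserInfo`, its field names and value encodings are hypothetical, and the direction of the duration comparison (elapsed time strictly greater than the preset duration) follows the literal wording "exceeds ... if so", which the source does not spell out further.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserInfo:
    offline_reason: str             # hypothetical encoding, e.g. "abnormal" or "normal"
    is_power_off: bool              # True if the power-off reason field indicates power-off
    abnormal_reason: Optional[str]  # None means the field has not been backfilled
    last_offline_at: float          # last offline time, in epoch seconds


def has_actual_impact(info: UserInfo, alarm_received_at: float,
                      preset_seconds: float) -> bool:
    """Evaluate the first preset condition for one user."""
    elapsed = alarm_received_at - info.last_offline_at
    return (
        info.offline_reason == "abnormal"
        and not info.is_power_off
        and info.abnormal_reason is None
        # Assumption: the condition holds when the elapsed time exceeds the preset duration.
        and elapsed > preset_seconds
    )
```

All four clauses must hold together; a user who went offline because of a power cut, for example, is never counted as affected by the network fault.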

Optionally, in an embodiment of the present application, the network device is a passive optical network PON device, and after receiving the interruption alarm message sent by the network device, the method further includes: judging whether the ONU message records corresponding to the device identifier of the PON device, counted within a preset time range, satisfy a second preset condition, where the second preset condition is that the number of ONU message records is greater than or equal to a preset number and the proportion of ONU message records whose alarm type is ONU power-off is greater than or equal to a preset proportion; and if so, changing the alarm type of the ONU message records from ONU fiber cut to ONU power-off. In this implementation, when the ONU message records counted within the preset time range satisfy the second preset condition, their alarm type is changed from ONU fiber cut to ONU power-off. This avoids misjudging the alarm type of the ONU device and effectively improves the accuracy of determining that alarm type.
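A short sketch of the alarm type correction described above. The record layout (a dict with an `alarm_type` key), the two type strings and the function name are assumptions for illustration, and the input is assumed to have already been filtered to the preset time range for one PON device.

```python
from typing import Dict, List

FIBER_CUT = "onu_fiber_cut"  # hypothetical alarm type labels
POWER_OFF = "onu_power_off"


def correct_alarm_types(records: List[Dict[str, str]], min_count: int,
                        min_ratio: float) -> List[Dict[str, str]]:
    """Relabel fiber-cut records as power-off when the second preset condition holds."""
    if not records or len(records) < min_count:
        return records
    power_off = sum(1 for r in records if r["alarm_type"] == POWER_OFF)
    if power_off / len(records) < min_ratio:
        return records
    # Condition met: return relabelled copies, leaving the input records unmodified.
    return [
        {**r, "alarm_type": POWER_OFF} if r["alarm_type"] == FIBER_CUT else r
        for r in records
    ]
```

The intuition is that when most ONUs under one PON port report power-off in the same window, the few fiber-cut reports in that window are more likely mislabelled power-off events than genuine fiber cuts.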

An embodiment of the present application further provides a network fault early warning apparatus, including: an alarm message receiving module, configured to receive an interruption alarm message sent by a network device, where the interruption alarm message includes a device identifier of the network device; a user state query module, configured to query all users carried by the network device according to the device identifier and determine all user states corresponding to the device identifier; a user state judging module, configured to judge whether all user states corresponding to the device identifier are offline; and an early warning message sending module, configured to send early warning information to all users corresponding to the device identifier if all user states corresponding to the device identifier are offline.

Optionally, in an embodiment of the present application, the user state query module includes: a user identifier query module, configured to query, in a key-value store database, all user identifiers carried by the network device, where the key-value store database stores the association between device identifiers and user identifiers; and a user state obtaining module, configured to query the user state corresponding to each of the user identifiers to obtain all user states.

Optionally, in an embodiment of the present application, the user state query module further includes: a service cluster building module, configured to build a distributed streaming dataflow engine server cluster; and a key-value database module, configured to deploy the key-value store database on the distributed streaming dataflow engine server cluster in a distributed manner.

Optionally, in an embodiment of the present application, the early warning message sending module includes: an actual impact result obtaining module, configured to obtain, for each user among all users corresponding to the device identifier, the actual impact result of that user according to the user's user information; and a message generating and sending module, configured to generate early warning information for each user and then determine, according to each user's actual impact result, whether to send the early warning information to all users corresponding to the device identifier.

Optionally, in an embodiment of the present application, the user information includes: a user offline reason field, a power-off reason field, and an abnormal offline reason field. The actual impact result obtaining module includes: an alarm message judging module, configured to judge whether the user information of each user satisfies a first preset condition, where the first preset condition includes: the user offline reason field indicates an abnormal offline event, the power-off reason field indicates a non-power-off reason, the abnormal offline reason field has not been backfilled, and, in addition, whether the duration between the time the interruption alarm message was received and the user's last offline time exceeds a preset duration; and an actual impact confirmation module, configured to confirm that the actual impact result is that the user is actually affected if the user's user information satisfies the first preset condition, and otherwise to confirm that the actual impact result is that the user is not actually affected.

Optionally, in an embodiment of the present application, the network device is a passive optical network PON device, and the network fault early warning apparatus further includes: a message record judging module, configured to judge whether the ONU message records corresponding to the device identifier of the PON device, counted within a preset time range, satisfy a second preset condition, where the second preset condition is that the number of ONU message records is greater than or equal to a preset number and the proportion of ONU message records whose alarm type is ONU power-off is greater than or equal to a preset proportion; and an alarm type modification module, configured to change the alarm type of the ONU message records from ONU fiber cut to ONU power-off if the ONU message records corresponding to the device identifier satisfy the second preset condition.

Optionally, in an embodiment of the present application, the network device includes: a passive optical network PON device, an optical line terminal device, or an optical network unit ONU.

An embodiment of the present application further provides an electronic device, including a processor and a memory, where the memory stores machine-readable instructions executable by the processor, and the machine-readable instructions, when executed by the processor, perform the method described above.

An embodiment of the present application further provides a computer-readable storage medium storing a computer program, where the computer program, when run by a processor, performs the method described above.

Description of drawings

To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present application and should therefore not be regarded as limiting its scope. Those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.

FIG. 1 is a schematic flow diagram of the network fault early warning method provided by an embodiment of the present application;

FIG. 2 is a schematic diagram of the network architecture of the network fault early warning system provided by an embodiment of the present application;

FIG. 3 is a schematic flow diagram of correcting the ONU alarm type provided by an embodiment of the present application;

FIG. 4 is a schematic structural diagram of the network fault early warning apparatus provided by an embodiment of the present application;

FIG. 5 is a schematic structural diagram of the electronic device provided by an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the present application, not all of them. The components of the embodiments, as generally described and illustrated in the drawings, may be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the claimed scope, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art from these embodiments without creative effort fall within the scope of protection of the embodiments of the present application.

Before introducing the network fault early warning method provided by the embodiments of the present application, some concepts involved in the embodiments are introduced first:

A passive optical network (Passive Optical Network, PON) is a type of optical fiber communication network. Its distinguishing feature is that signal processing between the endpoints requires no power supply, much as a household mirror reflects an image without electricity. Only the terminal equipment needs power; the intermediate nodes consist of small, compact optical fiber components.

An optical network unit (Optical Network Unit, ONU) is a terminal device for optical fiber access. It performs optical/electrical and electrical/optical conversion, digital/analog and analog/digital conversion of voice signals, and multiplexing, signaling processing, and maintenance management. It is usually used together with an optical line terminal (Optical Line Terminal, OLT) and can provide users with multiple service interfaces.

An optical line terminal (Optical Line Terminal, OLT), also called an optical link terminal, is the central-office device in a PON deployment and a very important one. A PON consists of one central office node, called the optical line terminal (OLT), and one or more user nodes, called optical network units (ONU) or optical network terminals (Optical Network Terminals, ONT).

A key-value store database is a non-relational database that stores data using a simple key-value method. It stores data as a collection of key-value pairs in which the key serves as a unique identifier. Both keys and values can be anything from simple objects to complex compound objects. Key-value databases are highly partitionable and allow horizontal scaling at scales that other types of databases cannot achieve.

Flink, also known as Apache Flink, is an open source stream processing framework whose core is a distributed streaming dataflow engine written in Java and Scala. Flink executes arbitrary dataflow programs in a data-parallel and pipelined manner, and its pipelined runtime system can execute both batch and stream processing programs.

It should be noted that the network fault early warning method provided by the embodiments of the present application can be executed by an electronic device. Here, an electronic device means a device terminal or a server capable of executing computer programs. Examples of device terminals include a smartphone, a personal computer (personal computer, PC), a tablet computer, a personal digital assistant (personal digital assistant, PDA), or a mobile Internet device (mobile Internet device, MID). Examples of servers include x86 servers and non-x86 servers; non-x86 servers include mainframes, minicomputers, and UNIX servers.

The application scenarios to which the network fault early warning method applies are introduced below. They include, but are not limited to, home broadband scenarios based on the Point-to-Point Protocol over Ethernet (Point-to-Point Protocol Over Ethernet, PPPoE). For example, the method can be used to give early warning of faults in network devices such as OLT, PON, or ONU devices, improving the accuracy of sending early warning messages to the corresponding users based on interruption alarm messages. The method can also enhance a home broadband smart monitoring system: when that system sends early warning messages to users based on interruption alarm messages, the accuracy of those messages can be improved.

In practice, network device faults are sometimes misjudged. For example, an interruption alarm message may be received from a network device even though some of the users carried by that device can still access the Internet through it normally, which shows that the device is still working. In this network fault early warning method, the network device is confirmed to be truly faulty only when all of its users are offline, and only then is an early warning message sent to the corresponding users. Using the method therefore reduces false warnings and improves the accuracy of sending early warning messages to the corresponding users based on interruption alarm messages.

Refer to FIG. 1, a schematic flow diagram of the network fault early warning method provided by an embodiment of the present application. The main idea of the method is to judge whether the states of all users carried by the network device are offline. The network device is determined to be truly faulty only when all user states are offline, and only then is early warning information sent to all users corresponding to the device identifier. This reduces the probability of false warnings caused by sending early warning information while some users can still access the Internet through the network device, and thereby improves the accuracy of sending early warning messages to the corresponding users based on interruption alarm messages. The method may include:

Step S110: the electronic device receives an interruption alarm message sent by a network device, where the interruption alarm message includes the device identifier of the network device.

Refer to FIG. 2, a schematic diagram of the network architecture of the network fault early warning system provided by an embodiment of the present application. The network fault early warning system includes: network devices, a broadband remote access server (Broadband Remote Access Server, BRAS), a home broadband smart monitoring system, a remote authentication dial-in user service (Remote Authentication Dial In User Service, RADIUS) system, an integrated resource system, an SMS gateway, and a centralized fault management system. The home broadband smart monitoring system communicates with the RADIUS system, the integrated resource system, the SMS gateway, and the centralized fault management system, and the BRAS is connected to and communicates with the network devices and the RADIUS system. The network devices may include a passive optical network (PON), an optical line terminal (OLT), an optical network unit (ONU), and the like. A network device communicates with equipment such as a broadband user's personal computer through an optical modem.

An example implementation of step S110: the home broadband smart monitoring system may run on the electronic device. When an interruption alarm occurs on a network device, the interruption alarm message, which may include the device identifier of the network device, can be sent through the BRAS and then the RADIUS system. In practice, if the network path between the network device and the electronic device is different, the interruption alarm message may be sent over that different path instead. The electronic device then receives the interruption alarm message, which includes the device identifier of the network device. In the network fault early warning system above, the PPPoE scenario can also be integrated with the broadband authentication RADIUS system: the electronic device running the home broadband smart monitoring system can collect network big data from the integrated resource system, the centralized fault management system, and the RADIUS system, analyze that data, and monitor RADIUS online users in real time, so that home broadband network faults affecting services are discovered in real time.

After step S110, step S120 is performed: the electronic device queries all users carried by the network device according to the device identifier, and determines all user states corresponding to the device identifier.

可以理解的是,上述的用户可以是RADIUS系统中的用户。上述步骤S120的实施方式有很多种,包括但不限于如下几种:It can be understood that the above-mentioned users may be users in the RADIUS system. There are many ways to implement the above step S120, including but not limited to the following:

In a first implementation, all user states corresponding to the device identifier are queried in a key-value storage database. This implementation may include:

Step S121: query, in the key-value storage database, all user identifiers carried by the network device; the key-value storage database stores the association between device identifiers and user identifiers.

Optionally, the key-value storage database needs to be set up before it is queried. Setting it up may include: building a distributed stream data flow engine server cluster based on Flink, where the cluster may be a High-Availability (HA) cluster; and deploying the key-value storage database in a distributed manner on that cluster.

Step S121 may be implemented, for example, as follows: query all user identifiers corresponding to the device identifier in a key-value storage database such as Redis or Memcached, where the key-value storage database stores the association between device identifiers and user identifiers.

Step S122: query the user state corresponding to each of the user identifiers to obtain all user states.

Step S122 may be implemented, for example, as follows: the key-value storage database may also store the correspondence between user identifiers and user states, so the user state corresponding to each user identifier can be queried in the same Redis, Memcached, or similar key-value storage database to obtain all user states.
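The two lookups in steps S121 and S122 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: plain Python dictionaries stand in for Redis or Memcached, and the names `device_to_users`, `user_status`, `query_user_ids`, and `query_user_states`, the sample identifiers, and the "unknown" fallback are all assumptions introduced here.

```python
# In-memory stand-ins for the key-value storage database (hypothetical layout).
device_to_users = {
    "10.0.0.1": {"user-a", "user-b"},   # key: device identifier, value: set of user IDs
    "10.0.0.2": {"user-c"},
}
user_status = {"user-a": "offline", "user-b": "online", "user-c": "offline"}


def query_user_ids(device_id):
    """Step S121: return every user ID carried by the device."""
    return set(device_to_users.get(device_id, set()))


def query_user_states(device_id):
    """Step S122: map each carried user ID to its recorded state."""
    # Reporting users without a recorded state as "unknown" is an assumption.
    return {uid: user_status.get(uid, "unknown") for uid in query_user_ids(device_id)}
```

With a real key-value store the two dictionaries would presumably become one set-valued key per device and one state key per user; the control flow would stay the same.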

In a second implementation, all user states corresponding to the device identifier are queried in a relational or non-relational database. This implementation may include: querying, in the relational or non-relational database, all user identifiers carried under the device identifier, and then querying, in the same database, the user state corresponding to each of those user identifiers to obtain all user states. Relational databases that can be used include, for example, MySQL, PostgreSQL, Oracle, and SQL Server; non-relational databases that can be used include the Grakn database, the Neo4j graph database, the Hadoop subsystem HBase, MongoDB, CouchDB, and the like.
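For the relational variant, a minimal sketch is shown below. It uses SQLite from the Python standard library purely so the example is self-contained; the source names MySQL, PostgreSQL, Oracle, and SQL Server, not SQLite. The table names `device_user` and `user_state`, their columns, and the sample rows are assumptions.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE device_user (device_id TEXT, user_id TEXT);
CREATE TABLE user_state (user_id TEXT PRIMARY KEY, state TEXT);
INSERT INTO device_user VALUES ('olt-1', 'u1'), ('olt-1', 'u2'), ('olt-2', 'u3');
INSERT INTO user_state VALUES ('u1', 'offline'), ('u2', 'online'), ('u3', 'offline');
""")


def query_states_sql(connection, device_id):
    """Return {user_id: state} for every user carried by the device."""
    rows = connection.execute(
        "SELECT du.user_id, us.state FROM device_user AS du "
        "JOIN user_state AS us ON us.user_id = du.user_id "
        "WHERE du.device_id = ? ORDER BY du.user_id",
        (device_id,),  # parameter binding avoids string-built SQL
    ).fetchall()
    return dict(rows)
```

A single join retrieves the identifiers and the states together, which collapses the two queries described above into one round trip.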

After step S120, step S130 is performed: the electronic device determines whether all user states corresponding to the device identifier are offline.

In practice, different network devices call for different determination methods. The purpose of determining whether all user states carried under the device identifier are offline is to establish whether the network device has really failed. The network device may be an OLT device, a PON device, or an ONU device, so step S130 can be implemented in ways including but not limited to the following:

In a first implementation, if the network device is an OLT device, it is determined whether all user states corresponding to the device identifier of the OLT device are offline. For example, after the interruption alarm message sent by the OLT device is received, the states of all RADIUS users corresponding to the device identifier of the OLT device (for example, the IP address of the OLT device) may be queried synchronously to check whether they are all offline, thereby determining whether the OLT device interruption is a real fault. The states may be queried as follows: first retrieve the user records of all RADIUS users corresponding to the device identifier, where each user record includes information such as the user ID and the user state; then read the user state from each record (that is, the user's current online state in the RADIUS service system). These user records are accumulated from the RADIUS system over time. That is, whenever the electronic device receives a user record sent by the RADIUS system, it parses from that record the network device identifiers associated with the user ID, such as the OLT device identifier (for example, OLT_IP) and/or the PON device identifier (for example, PON_ID), and stores the user ID and the network device identifiers in a Redis cluster, where the key is set to OLT_IP or OLT_IP+PON_ID and the value is a Set of users.
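The accumulation of RADIUS user records into per-device user sets can be sketched as below. This is an illustrative assumption-laden sketch: a `defaultdict(set)` stands in for the Redis cluster, the record field names (`user_id`, `olt_ip`, `pon_id`) are hypothetical, the literal `+` separator is assumed from the "OLT_IP+PON_ID" wording, and writing both keys for every record is a choice made here (the source says the key is one or the other).

```python
from collections import defaultdict


def index_radius_record(index, record):
    """Add one RADIUS user record to the device -> user-set index."""
    user_id = record["user_id"]
    olt_ip = record["olt_ip"]
    index[olt_ip].add(user_id)                     # key: OLT_IP
    pon_id = record.get("pon_id")
    if pon_id is not None:
        index[f"{olt_ip}+{pon_id}"].add(user_id)   # key: OLT_IP+PON_ID
    return index


# Example: three hypothetical records arriving from the RADIUS system.
index = defaultdict(set)
for rec in [
    {"user_id": "u1", "olt_ip": "10.0.0.1", "pon_id": "pon-3"},
    {"user_id": "u2", "olt_ip": "10.0.0.1", "pon_id": "pon-4"},
    {"user_id": "u3", "olt_ip": "10.0.0.1"},
]:
    index_radius_record(index, rec)
```

Keeping both granularities means an OLT-level alarm and a PON-level alarm can each be resolved with a single set lookup.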

In a second implementation, if the network device is a PON device, it is determined whether all user states corresponding to the device identifier of the PON device are offline. For example, after the interruption alarm message sent by the PON device is received, all user states corresponding to the device identifier of the PON device may be queried synchronously to check whether they are all offline, thereby determining whether the PON device interruption is a real fault. In practice, after the early warning message for each user is generated, the users may also be filtered and marked. For example, users whose state is power-off are filtered out and marked as "actually affected, offline", and the user's abnormal offline reason field is set to "PON interruption", and so on.

In a third implementation, if the network device is an ONU device, it is determined whether all user states corresponding to the device identifier of the ONU device are offline. The specific implementation for ONU devices is described in detail after step S140 below.

After step S130, step S140 is performed: if all user states corresponding to the device identifier are offline, the electronic device sends early warning information to all users corresponding to the device identifier.

Step S140 may be implemented, for example, as follows: if all user states corresponding to the device identifier are offline, the early warning information is sent to all users corresponding to the device identifier over the Hypertext Transfer Protocol (HTTP) or Hypertext Transfer Protocol Secure (HTTPS).
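The decision in steps S130 and S140 reduces to an all-offline check followed by a fan-out, sketched below. The function names, the message wording, and the injected `send` callable (which stands in for the HTTP or HTTPS delivery and performs no network I/O here) are assumptions; treating a device with no known users as "not a confirmed fault" is also an assumption the source does not address.

```python
def all_offline(states):
    """Step S130: True only when at least one user exists and none is online."""
    # An empty user set is treated as "not confirmed" (assumption).
    return bool(states) and all(state == "offline" for state in states.values())


def handle_alarm(device_id, states, send):
    """Step S140: notify every carried user only if all of them are offline."""
    if not all_offline(states):
        return []                      # some user is still online: no warning
    notified = sorted(states)          # deterministic order for the example
    for user_id in notified:
        send(user_id, f"Network device {device_id} appears to be interrupted.")
    return notified
```

Passing `send` in as a parameter keeps the decision logic testable without a gateway; in a deployment it would presumably wrap the HTTP(S) call toward the SMS gateway.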

While the electronic device is sending early warning information to all users corresponding to the device identifier, it may determine, for each of those users, whether the user is actually affected. Whether a user is actually affected can be judged by whether the user's information satisfies a first preset condition. For example, it is determined whether each user's user information satisfies the first preset condition, which includes: the user offline reason field indicates an abnormal offline; the power-off reason field indicates a non-power-off reason; the abnormal offline reason field has not been backfilled (that is, filled in again); and the interval between the time the interruption alarm message was received and the user's last offline time is checked against a preset duration. If the user's information satisfies the first preset condition, the actual impact result is that the user is actually affected; otherwise, the actual impact result is that the user is not actually affected.
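The first preset condition can be sketched as a single predicate. The `UserInfo` fields, the string value `"abnormal"`, and the 300-second default are hypothetical. In particular, the source only says the interval is compared with a preset duration; this sketch assumes the user counts as affected when the interval does not exceed it, which is one possible reading rather than a confirmed one.

```python
from dataclasses import dataclass


@dataclass
class UserInfo:
    offline_reason: str               # user offline reason field, e.g. "abnormal"
    power_off_reason: bool            # True if the power-off reason field indicates power loss
    abnormal_reason_backfilled: bool  # True if the abnormal offline reason field was backfilled
    last_offline_ts: float            # last offline time, seconds since epoch


def is_actually_affected(info, alarm_received_ts, max_gap_s=300.0):
    """Evaluate the first preset condition for one user."""
    # Assumption: the gap must NOT exceed the preset duration.
    gap = abs(alarm_received_ts - info.last_offline_ts)
    return (
        info.offline_reason == "abnormal"
        and not info.power_off_reason
        and not info.abnormal_reason_backfilled
        and gap <= max_gap_s
    )
```

If the intended reading is the opposite one, only the final comparison changes.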

Early warning information is generated for each user from that user's user information; for example, each user's information is filled into an early warning information template to produce that user's early warning information. Whether to send the early warning information to all users corresponding to the device identifier is then determined from each user's actual impact result. For example, if the user is actually affected, the user's abnormal offline fault reason field may be set to device interruption and the early warning information then sent to the user; in practice, one may also choose not to send the early warning information to that user. If the user is not actually affected, it is sufficient to record this early warning information in the database log.
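The template filling and the send-or-log decision can be sketched as follows. The template text, the user dictionary keys, and the `send` and `log` parameters are all hypothetical; a plain list stands in for the database log.

```python
from string import Template

# Hypothetical early warning information template.
WARNING_TEMPLATE = Template(
    "Dear $name, device $device_id serving your line is interrupted."
)


def dispatch(users, device_id, affected, send, log):
    """Generate one message per user, then send it or only record it."""
    for user in users:
        message = WARNING_TEMPLATE.substitute(name=user["name"], device_id=device_id)
        if affected[user["id"]]:
            # Actually affected: record the cause, then notify the user.
            user["abnormal_offline_reason"] = "device interruption"
            send(user["id"], message)
        else:
            # Not actually affected: keep the message in the log only.
            log.append((user["id"], message))
```

Here `affected` maps each user ID to that user's actual impact result, so the per-user judgment and the dispatch stay decoupled.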

In the implementation above, it is determined whether all user states corresponding to the device identifier are offline, and the network device is considered to have really failed only when all of them are offline; only then is early warning information sent to all users corresponding to the device identifier. This effectively reduces the probability of false warnings caused by sending early warning information while some users can still access the Internet through the network device, and thereby improves the accuracy of sending early warning messages to the corresponding users based on interruption alarm messages.

Please refer to Figure 3, a schematic flowchart of correcting the ONU alarm type provided by an embodiment of the present application. Optionally, the alarm types of the interruption alarm message of the optical network unit (ONU) may include ONU fiber break and ONU power-off. After the interruption alarm message sent by the network device is received in step S110, the ONU alarm type may also be corrected, which may include:

Step S210: determine whether the ONU message records counted within a preset time range for the device identifier of the PON device satisfy a second preset condition. The second preset condition is that the number of ONU message records is greater than or equal to a preset number, and the proportion of ONU message records whose alarm type is ONU power-off is greater than or equal to a preset proportion.

Step S210 may be implemented, for example, as follows: first retrieve the ONU message records corresponding to the device identifier of the PON device, where each ONU message record includes the ONU device identifier, the IP address of the OLT device corresponding to the ONU device (OLT_IP), and the port number of the PON device (PON_PORT); then count the ONU message records corresponding to each OLT device within the preset time range (that is, the specified time period); finally, determine whether the counted ONU message records corresponding to the device identifier satisfy the second preset condition, namely that the number of ONU message records is greater than or equal to the preset number and the proportion of records whose alarm type is ONU power-off is greater than or equal to the preset proportion. A concrete example of the second preset condition: the number of ONU message records is greater than or equal to 3, and the proportion of records whose alarm type is ONU power-off is greater than or equal to 66%.

Step S220: if the ONU message records counted within the preset time range for the device identifier of the PON device satisfy the second preset condition, change the alarm type of the ONU message records from ONU fiber break to ONU power-off.

Step S220 may be implemented, for example, as follows: if the ONU message records counted within the preset time range for the device identifier of the PON device satisfy the second preset condition (for example, the number of ONU message records is greater than or equal to 3, and the proportion whose alarm type is ONU power-off is greater than or equal to 66%), first filter out the ONU message records whose alarm type is ONU fiber break; then mark the generated early warning message records as "misjudged"; finally, change the alarm type of those ONU message records from ONU fiber break to ONU power-off.
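Steps S210 and S220 can be sketched as one function over the records already collected for a PON device within the time window. The record layout, the alarm-type strings `"ONU_POWER_OFF"` and `"ONU_FIBER_BREAK"`, and the `warning_mark` field are hypothetical; the thresholds 3 and 66% are the example values given in the text.

```python
def correct_onu_alarm_types(records, min_count=3, min_ratio=0.66):
    """Relabel fiber-break records as power-off when power-off alarms dominate."""
    total = len(records)
    if total == 0 or total < min_count:
        return records                 # too few records: condition not met
    power_off = sum(1 for r in records if r["alarm_type"] == "ONU_POWER_OFF")
    if power_off / total < min_ratio:
        return records                 # power-off share too low: leave as is
    corrected = []
    for r in records:
        if r["alarm_type"] == "ONU_FIBER_BREAK":
            # Copy the record, flag its warning as misjudged, fix the type.
            r = {**r, "alarm_type": "ONU_POWER_OFF", "warning_mark": "misjudged"}
        corrected.append(r)
    return corrected
```

For example, with two power-off records and one fiber-break record the count is 3 and the power-off share is about 66.7%, so the fiber-break record is relabeled; with two power-off records out of four, nothing changes.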

Please refer to Figure 4, a schematic structural diagram of the network fault early warning device provided by an embodiment of the present application. An embodiment of the present application provides a network fault early warning device 300, including:

An alarm message receiving module 310, configured to receive an interruption alarm message sent by a network device, where the interruption alarm message includes the device identifier of the network device.

A user state query module 320, configured to query all users carried by the network device according to the device identifier and determine all user states corresponding to the device identifier.

A user state determination module 330, configured to determine whether all user states corresponding to the device identifier are offline.

An early warning message sending module 340, configured to send early warning information to all users corresponding to the device identifier if all user states corresponding to the device identifier are offline.

Optionally, in an embodiment of the present application, the user state query module includes:

A user identifier query module, configured to query, in the key-value storage database, all user identifiers carried by the network device, where the key-value storage database stores the association between device identifiers and user identifiers.

A user state obtaining module, configured to query the user state corresponding to each of the user identifiers to obtain all user states.

Optionally, in an embodiment of the present application, the user state query module further includes:

A service cluster building module, configured to build a distributed stream data flow engine server cluster.

A key-value database module, configured to deploy the key-value storage database in a distributed manner on the distributed stream data flow engine server cluster.

Optionally, in an embodiment of the present application, the early warning message sending module includes:

An actual impact result obtaining module, configured to obtain, for each user among all users corresponding to the device identifier, that user's actual impact result according to that user's user information;

A message generating and sending module, configured to generate early warning information for each user and then determine, according to each user's actual impact result, whether to send the early warning information to all users corresponding to the device identifier.

Optionally, in an embodiment of the present application, the user information includes a user offline reason field, a power-off reason field, and an abnormal offline reason field; the actual impact result obtaining module includes:

An alarm message judging module, configured to determine whether each user's user information satisfies the first preset condition, which includes: the user offline reason field indicates an abnormal offline; the power-off reason field indicates a non-power-off reason; the abnormal offline reason field has not been backfilled; and the interval between the time the interruption alarm message was received and the user's last offline time is checked against a preset duration.

An actual impact confirmation module, configured to confirm that the actual impact result is that the user is actually affected if the user's information satisfies the first preset condition, and otherwise to confirm that the actual impact result is that the user is not actually affected.

Optionally, in an embodiment of the present application, the network device is a passive optical network (PON) device, and the network fault early warning device further includes:

A message record judging module, configured to determine whether the ONU message records counted within a preset time range for the device identifier of the PON device satisfy the second preset condition, where the second preset condition is that the number of ONU message records is greater than or equal to a preset number and the proportion of ONU message records whose alarm type is ONU power-off is greater than or equal to a preset proportion.

An alarm type modification module, configured to change the alarm type of the ONU message records from ONU fiber break to ONU power-off if the ONU message records corresponding to the device identifier satisfy the second preset condition.

Optionally, in an embodiment of the present application, the network device includes a passive optical network (PON) device, an optical line terminal device, or an optical network unit (ONU).

It should be understood that this device corresponds to the network fault early warning method embodiments above and can perform the steps involved in those method embodiments. For the specific functions of the device, refer to the description above; detailed description is omitted here where appropriate to avoid repetition. The device includes at least one software function module that can be stored in memory in the form of software or firmware, or built into the operating system (OS) of the device.

Please refer to Figure 5, a schematic structural diagram of the electronic device provided by an embodiment of the present application. An electronic device 400 provided by an embodiment of the present application includes a processor 410 and a memory 420. The memory 420 stores machine-readable instructions executable by the processor 410, and the method above is performed when the machine-readable instructions are executed by the processor 410.

An embodiment of the present application further provides a computer-readable storage medium 430 on which a computer program is stored; the method above is performed when the computer program is run by the processor 410.

The computer-readable storage medium 430 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.

In the several embodiments provided by the present application, it should be understood that the disclosed devices and methods may also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the accompanying drawings show possible architectures, functions, and operations of devices, methods, and computer program products according to multiple embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved.

In addition, the functional modules of the embodiments of the present application may be integrated together to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.

In this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations.

The above description covers only optional implementations of the embodiments of the present application, but the scope of protection of the embodiments is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the embodiments of the present application shall fall within the scope of protection of the embodiments of the present application.

Claims (10)

1. A network fault early warning method, characterized by comprising: receiving an interruption alarm message sent by a network device, the interruption alarm message including a device identifier of the network device; querying all users carried by the network device according to the device identifier, and determining all user states corresponding to the device identifier; determining whether all user states corresponding to the device identifier are offline; and if so, sending early warning information to all users corresponding to the device identifier.

2. The method according to claim 1, characterized in that querying all users carried by the network device according to the device identifier and determining all user states corresponding to the device identifier comprises: querying, in a key-value storage database, all user identifiers carried by the network device, the key-value storage database storing the association between device identifiers and user identifiers; and querying the user state corresponding to each of the user identifiers to obtain all user states.

3. The method according to claim 2, characterized in that, before querying in the key-value storage database all user identifiers carried by the network device, the method further comprises: building a distributed stream data flow engine server cluster; and deploying the key-value storage database in a distributed manner on the distributed stream data flow engine server cluster.

4. The method according to claim 1, characterized in that sending early warning information to all users corresponding to the device identifier comprises: for each user among all users corresponding to the device identifier, obtaining that user's actual impact result according to that user's user information; and generating early warning information for each user, and then determining, according to each user's actual impact result, whether to send the early warning information to all users corresponding to the device identifier.

5. The method according to claim 4, characterized in that the user information includes a user offline reason field, a power-off reason field, and an abnormal offline reason field; and obtaining each user's actual impact result according to that user's user information comprises: determining whether each user's user information satisfies a first preset condition, the first preset condition including: the user offline reason field indicates an abnormal offline, the power-off reason field indicates a non-power-off reason, the abnormal offline reason field has not been backfilled, and the interval between the time the interruption alarm message was received and the user's last offline time is checked against a preset duration; and if so, confirming that the actual impact result is that the user is actually affected, and otherwise confirming that the actual impact result is that the user is not actually affected.

6. The method according to any one of claims 1 to 5, characterized in that the network device is a passive optical network (PON) device, and after receiving the interruption alarm message sent by the network device, the method further comprises: determining whether the ONU message records counted within a preset time range for the device identifier of the PON device satisfy a second preset condition, the second preset condition being that the number of ONU message records is greater than or equal to a preset number and the proportion of ONU message records whose alarm type is ONU power-off is greater than or equal to a preset proportion; and if so, changing the alarm type of the ONU message records from ONU fiber break to ONU power-off.

7. The method according to any one of claims 1 to 5, characterized in that the network device includes a passive optical network (PON) device, an optical line terminal device, or an optical network unit (ONU).

8. A network fault early warning device, characterized by comprising: an alarm message receiving module, configured to receive an interruption alarm message sent by a network device, the interruption alarm message including a device identifier of the network device; a user state query module, configured to query all users carried by the network device according to the device identifier and determine all user states corresponding to the device identifier; a user state determination module, configured to determine whether all user states corresponding to the device identifier are offline; and an early warning message sending module, configured to send early warning information to all users corresponding to the device identifier if all user states corresponding to the device identifier are offline.

9. An electronic device, characterized by comprising a processor and a memory, the memory storing machine-readable instructions executable by the processor, wherein the method according to any one of claims 1 to 7 is performed when the machine-readable instructions are executed by the processor.

10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the method according to any one of claims 1 to 7 is performed when the computer program is run by a processor.
CN202111210421.3A 2021-10-18 2021-10-18 A network fault early warning method, device, electronic equipment and storage medium Pending CN115996334A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111210421.3A CN115996334A (en) 2021-10-18 2021-10-18 A network fault early warning method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111210421.3A CN115996334A (en) 2021-10-18 2021-10-18 A network fault early warning method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115996334A true CN115996334A (en) 2023-04-21

Family

ID=85992766

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111210421.3A Pending CN115996334A (en) 2021-10-18 2021-10-18 A network fault early warning method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115996334A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040001479A1 (en) * 2002-07-01 2004-01-01 Pounds Gregory E. Systems and methods for voice and data communications including a network drop and insert interface for an external data routing resource
CN103905240A (en) * 2012-12-28 2014-07-02 中国电信股份有限公司 Method and system for active network service fault reminding and processing
CN105099763A (en) * 2015-06-29 2015-11-25 小米科技有限责任公司 Method and device for reminding lost connection of equipment
CN110149227A (en) * 2019-05-16 2019-08-20 平安科技(深圳)有限公司 The method and device of network alarm
CN110912775A (en) * 2019-11-26 2020-03-24 中盈优创资讯科技有限公司 Internet of things enterprise network fault monitoring method and device
CN113365164A (en) * 2021-05-26 2021-09-07 中盈优创资讯科技有限公司 Active identification two-stage light splitting method and device based on big data analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黄印君: "On the management of information and communication network faults under centralized monitoring" (试论集中监控下信息通信网络故障的管理), 信息系统工程, no. 09, 20 September 2015 (2015-09-20) *

Similar Documents

Publication Publication Date Title
CN106973093B (en) A kind of service switch method and device
US20230267326A1 (en) Machine Learning Model Management Method and Apparatus, and System
US20160294886A1 (en) Registration Method and System for Common Service Entity
CN111565133B (en) Private line switching method and device, electronic equipment and computer readable storage medium
CN111130912B (en) Abnormal location method, server and storage medium for content distribution network
CN109391661B (en) Blockchain networking method and system for IoT terminal
CN112636979B (en) Cluster alarm method and related device
CN113453213A (en) Authentication data synchronization method and device
CN111679950A (en) Interface-level dynamic data sampling method and device
CN114205213B (en) Fault pushing method and device, storage medium and electronic equipment
WO2021233322A1 (en) Synchronization method, apparatus and device for recording data, and storage medium
US20240048598A1 (en) Identifying an active administration function (admf) in a lawful interception deployment that utilizes a plurality of admfs
CN113810238A (en) Network monitoring method, electronic device and storage medium
CN105323102A (en) Link polling method, link polling device and link polling system
CN111130821A (en) Power failure alarm method, processing method and device
US11709725B1 (en) Methods, systems, and computer readable media for health checking involving common application programming interface framework
CN115348161A (en) Log alarm information generation method and device, electronic equipment and storage medium
JP2022510687A (en) Systems and methods for determining and reporting node malfunctions
US9788223B2 (en) Processing customer experience events from a plurality of source systems
CN115996334A (en) A network fault early warning method, device, electronic equipment and storage medium
EP3139536A1 (en) Alarm reporting method and device
CN117557211A (en) Intelligent financial business processing method, platform and medium based on flow automation
CN113259185B (en) Network management agent and network element management platform
CN117196650A (en) Method and device for preventing goods from being fleed
NL2028390B1 (en) A method, a system and a computer program product for monitoring an industrial ethernet protocol type network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination