
CN107992398A - The monitoring method and monitoring system of a kind of operation system - Google Patents


Info

Publication number
CN107992398A
Authority
CN
China
Prior art keywords
monitoring
data
operation system
monitored
system data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711404987.3A
Other languages
Chinese (zh)
Other versions
CN107992398B (en)
Inventor
张现双
薛菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pleasant Sunny Technology Development (beijing) Co Ltd
Original Assignee
Pleasant Sunny Technology Development (beijing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pleasant Sunny Technology Development (beijing) Co Ltd
Priority to CN201711404987.3A
Publication of CN107992398A
Application granted
Publication of CN107992398B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452 Performance evaluation by statistical analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3495 Performance evaluation by tracing or monitoring for systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present invention provides a monitoring method for a business system. The method includes: collecting log data from the business system; preprocessing the collected data by filtering and standardizing it; extracting, from the preprocessed business system data, the data that comes from a specific IP address and/or a specific host and/or a specific system, and standardizing and verifying it; determining, according to the keywords preset for the monitored item of the corresponding monitored service, whether a reprocessed business system log entry is an anomaly unit; generating a monitoring anomaly alarm according to monitoring rules; and selecting a corresponding monitoring strategy to send the alarm information through one or more media. The method achieves zero intrusion into the monitored system, greatly reduces the cost of modifying the monitored system, and delivers alarms in real time with a high arrival rate. Embodiments of the present invention additionally provide a monitoring system, an electronic device, and a computer-readable storage medium.

Description

A monitoring method and monitoring system for a business system
Technical field
The present invention relates to the field of computers, and more particularly to a monitoring method for a business system, a monitoring system, an electronic device, and a computer-readable storage medium.
Background
With the rapid development of a company's business, function iteration is usually carried out on top of the existing business system in order to capture the market quickly. In this process, engineering work usually cannot keep up with building real-time monitoring for each branch of the business system, so infrastructure such as a monitoring platform is missing, which constrains further business growth. To make up for the missing monitoring system, existing solutions typically add monitoring of middleware, machines, disks, CPU, and so on after the fact. These monitoring approaches are highly intrusive to the monitored system and require substantial modification of it, which adds cost. For example, an assessment of one loan transaction system found that modifying its more than 300 subsystems would take roughly 2,400 person-days. That is a huge investment; the quality of the modification cannot be guaranteed, and neither can business continuity, because a shutdown is required.
Monitoring products currently available in the industry are designed and implemented from the perspective of application performance monitoring (APM), such as CAT, the distributed monitoring product open-sourced by Dianping. Such a system is too intrusive to the monitored system and causes some performance loss.
However, as the reliability and availability of software and hardware have improved greatly, application performance is relatively stable and failures are unlikely. Taking a loan transaction system as an example, the system failures detected each year by application performance monitoring (such as memory overflow, deadlock, excessive CPU usage, and memory usage warnings) amount to less than 5% of the business anomalies of the business system (such as failed user withdrawals, failed user transactions, and a low SMS arrival rate). Monitoring and alarming on the business anomalies of the business system is therefore the top priority.
On the other hand, if every anomaly in the monitored business systems generates an alarm without any control, the alarm system comes under enormous pressure. For example, some monitored anomalies are not system failures at all, such as business anomalies caused by user operation errors; alarming on every one of them wastes the time of back-end maintenance staff. It is therefore necessary to provide a flexibly configurable monitoring and alarm method that lets maintenance staff receive alarm reminders anytime and anywhere through multiple delivery channels.
The above description is not admitted to be prior art merely because it is included in the background section.
Summary of the invention
To solve the technical problems that the prior art cannot monitor the business anomalies of a system in real time, that application performance monitoring is highly intrusive to the monitored system and causes a large performance loss, and that alarms are too frequent and available in only a single mode, the main object of the present invention is to provide a monitoring method for a business system, a monitoring system, an electronic device, and a computer-readable storage medium.
To achieve the above object, the present invention provides a monitoring method for a business system, for monitoring multiple business systems at the same time, including:
Collecting and preprocessing business system data, wherein the business system data includes business system logs;
Obtaining the preprocessed business system data and reprocessing it;
Performing monitored item analysis based on the reprocessed business system data, and determining anomaly units;
Generating a monitoring anomaly alarm according to monitoring rules;
Sending alarm information using a corresponding monitoring strategy.
Optionally, the business system data further includes application performance monitoring data.
Optionally, preprocessing the business system data includes filtering and standardizing the business system data.
Optionally, preprocessing the business system data further includes performing distribution control on the business system data.
Optionally, the reprocessing extracts, from the preprocessed business system data, the data that comes from a specific IP address and/or a specific host and/or a specific system.
Optionally, the reprocessing further includes standardizing and verifying the preprocessed business system data.
Optionally, the monitored item analysis determines, according to the keywords preset for the monitored item of the corresponding monitored service, whether a reprocessed business system log entry is an anomaly unit.
Optionally, the monitoring rules include system-level anomaly merging and/or application-level anomaly merging, wherein system-level anomaly merging defines a count threshold for monitoring anomaly alarms, and application-level anomaly merging lets the user customize the priority with which monitoring anomalies are merged.
Optionally, the monitoring strategies include any one or more of a node monitoring strategy, a node threshold monitoring strategy, a node threshold ratio monitoring strategy, a fixed-point data threshold monitoring strategy, a standard anomaly monitoring strategy, and a standard APM monitoring strategy.
Optionally, the alarm information is sent through one or more of email, SMS, and WeChat.
In addition, to achieve the above object, the present invention also provides a monitoring system for monitoring multiple business systems at the same time, characterized by including:
A data acquisition and preprocessing module, for collecting and preprocessing business system data, wherein the business system data includes business system logs;
A monitoring module, including: a reprocessing unit, for obtaining the preprocessed business system data and reprocessing it; a judging unit, for performing monitored item analysis based on the reprocessed business system data and determining anomaly units; and an alarm information generation unit, for generating a monitoring anomaly alarm according to monitoring rules;
An alarm information sending module, for sending alarm information using a corresponding monitoring strategy;
A monitoring configuration module, for configuring through a visual interface any one or more of: monitored items, monitoring strategies, predefined alarm modes, watch duty, system configuration, and alarm information queries.
Optionally, the business system data further includes application performance monitoring data.
Optionally, preprocessing the business system data includes filtering and standardizing the business system data.
Optionally, preprocessing the business system data further includes performing distribution control on the business system data.
Optionally, the reprocessing extracts, from the preprocessed business system data, the data that comes from a specific IP address and/or a specific host and/or a specific system.
Optionally, the reprocessing further includes standardizing and verifying the preprocessed business system data.
Optionally, the monitored item analysis determines, according to the keywords preset for the monitored item of the corresponding monitored service, whether a reprocessed business system log entry is an anomaly unit.
Optionally, the monitoring rules include system-level anomaly merging and/or application-level anomaly merging, wherein system-level anomaly merging defines a count threshold for monitoring anomaly alarms, and application-level anomaly merging lets the user customize the priority with which monitoring anomalies are merged.
Optionally, the monitoring strategies include any one or more of a node monitoring strategy, a node threshold monitoring strategy, a node threshold ratio monitoring strategy, a fixed-point data threshold monitoring strategy, a standard anomaly monitoring strategy, and a standard APM monitoring strategy.
Optionally, the alarm information can be sent through one or more of email, SMS, and WeChat.
In addition, to achieve the above object, the present invention also provides an electronic device, including:
A memory, for storing a computer program;
A processor, for executing the computer program stored in the memory; when the computer program is executed, the steps of the monitoring method described above are performed.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the steps of the monitoring method described above are performed.
In the technical solution provided by the present invention, the log data of the business system is collected; the collected data is preprocessed by filtering and standardization; the data that comes from a specific IP address and/or a specific host and/or a specific system is extracted from the preprocessed business system data and is standardized and verified; whether a reprocessed business system log entry is an anomaly unit is then determined according to the keywords preset for the monitored item of the corresponding monitored service; a monitoring anomaly alarm is generated according to monitoring rules; and the monitoring module selects a corresponding monitoring strategy to send the alarm information through one or more media.
Compared with the prior art, the present invention collects log data from IT business systems in order to monitor in real time the business anomaly data, which has a higher probability of occurring. By presetting keywords for the monitored items of the monitored services and determining whether a reprocessed business system log entry is an anomaly unit, it achieves zero intrusion into the monitored system and zero modification of the application. The overall service system is highly scalable, and the modification cost of the monitored system is greatly reduced. Multiple selectable monitoring strategies and multiple flexibly configurable alarm modes are also provided, achieving technical effects such as real-time operation, low cost, and a high arrival rate.
Brief description of the drawings
In the accompanying drawings, some embodiments of the present invention are shown by way of example and not by way of limitation, wherein:
Fig. 1 schematically shows a structural diagram of a monitoring system according to an embodiment of the present invention;
Fig. 2 schematically shows a flow chart of access layer data acquisition and preprocessing according to an embodiment of the present invention;
Fig. 3 schematically shows a work flow diagram of the monitoring module according to an embodiment of the present invention;
Fig. 4 schematically shows an entity relationship diagram of monitored items and monitored services according to an embodiment of the present invention;
Fig. 5 schematically shows the internal structure of the data acquisition and preprocessing module according to an embodiment of the present invention;
Fig. 6 schematically shows a structural diagram of the monitoring module according to an embodiment of the present invention;
Fig. 7 schematically shows an overall model diagram of monitoring and alarming according to an embodiment of the present invention;
Fig. 8 schematically shows a user visual interface of the monitoring configuration module according to an embodiment of the present invention;
Fig. 9 schematically shows a user visual interface of the monitoring configuration module according to an embodiment of the present invention;
Fig. 10 schematically shows a structural diagram of an electronic device according to an embodiment of the present invention.
Detailed description
The principle and spirit of the present invention are described below with reference to some specific embodiments. It should be understood that these embodiments are given only so that those skilled in the art can better understand and then implement the present invention, and they do not limit the scope of the present invention in any way.
The present invention proposes a monitoring method for a business system, a monitoring system, an electronic device, and a computer-readable storage medium. The present invention can be implemented in the following forms: entirely hardware, entirely software (including firmware, resident software, microcode, etc.), or a combination of hardware and software.
Fig. 1 is a structural diagram of the monitoring system of an embodiment of the present invention. As shown in Fig. 1, the data acquisition and preprocessing module 101 collects business system log data 1011 and, in the access layer 1013, preprocesses the collected data by filtering and standardizing it. The reprocessing unit 1021 in the monitoring module 102 extracts, from the preprocessed business system data, the data that comes from a specific IP address and/or a specific host and/or a specific system, and standardizes and verifies it. The judging unit 1022 then determines, according to the keywords preset for the monitored item of the corresponding monitored service, whether a reprocessed business system log entry is an anomaly unit. The alarm generation unit 1023 generates a monitoring anomaly alarm according to monitoring rules, and the alarm information sending module 103 selects a corresponding monitoring strategy to send the alarm information through one or more media.
The technical solution provided by the embodiments of the present invention uses an open-source stream computing platform as its tool. Combining stream computing technology, it takes the logs of each business system as the monitored object and, through multiple monitoring strategies and alarm strategies, supported by measures such as distributed synchronization, distributed locks, stream computing time windows, and pipelining, analyzes and computes the relevant monitoring points in the logs in real time in order to monitor and alarm.
One embodiment of the present invention provides a monitoring method for a business system, for monitoring multiple business systems at the same time, including:
Collecting and preprocessing business system data, wherein the business system data includes business system logs;
Obtaining the preprocessed business system data and reprocessing it;
Performing monitored item analysis based on the reprocessed business system data, and determining anomaly units;
Generating a monitoring anomaly alarm according to monitoring rules;
Sending alarm information using a corresponding monitoring strategy.
Further, the collected business system data includes business system logs and application performance monitoring (APM) data 1012. Collecting logs from a business system is performed as follows:
Based on the open-source data collection tool logstash (the log collection component), its data collector (agent) collects the logs of the accessed system and pushes them to the unified log access layer. When pushing logs, logstash must carry the access authentication information provided by the monitoring system: the communication protocol header carries ARGUS_ID, the unique ID that the monitoring system assigns to the accessed system. When logs are accessed, the monitoring system verifies whether the carried ID is consistent with the host address (host) configured in the system. If the verification passes, the log packet goes on to filtering; otherwise it is discarded.
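The admission check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the registry contents, the function name `admit`, and the representation of the header as a dictionary are hypothetical, and the text does not say how the monitoring system stores its host configuration.

```python
# Hypothetical registry: ARGUS_ID -> host address configured when the
# accessed system was registered with the monitoring system.
REGISTERED_HOSTS = {
    "argus-001": "10.0.0.11",
    "argus-002": "10.0.0.12",
}


def admit(headers: dict, source_host: str) -> bool:
    """Return True if the log packet may proceed to filtering.

    The packet is admitted only when the ARGUS_ID carried in the header
    is registered and matches the host the packet came from; otherwise
    the caller should discard the packet.
    """
    argus_id = headers.get("ARGUS_ID")
    if argus_id is None:
        return False
    return REGISTERED_HOSTS.get(argus_id) == source_host
```

A packet with a valid ID sent from a different host is rejected, which is the mismatch case the text says should be discarded.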
Further, as shown in Fig. 2, collecting and preprocessing business system data includes filtering and standardizing the business system data. Fig. 2 is the flow chart of access layer data acquisition and preprocessing. Log preprocessing is performed in the following steps:
S201: The monitoring system provides a unified HTTP access service address. Multiple HTTP service nodes are load balanced (lb) by a reverse proxy server (nginx). The log collection component (logstash) pushes log information to this address, and the reverse proxy server (nginx) forwards it to the access layer. The ARGUS_ID verification and interception is performed in the reverse proxy server (nginx) to reduce the traffic that downstream modules must process;
S202: Whitelist rules are set for access verification. The whitelist is built from the IP addresses of the accessed servers. Because ARGUS_ID is transmitted in plaintext, a service that intercepts an ARGUS_ID could otherwise access the server under a forged identity; the whitelist prevents this;
If the IP address of the accessed server is not in the whitelist, step S203 is performed: the request is refused directly;
If the IP address of the accessed server is in the whitelist, the verification passes and step S204 is performed;
S204: Filter rules are configured. After the access layer obtains log data, it filters the data, for example dropping debug logs or mybatis logs, to minimize the volume of logs flowing into the back-end system. The specific filter rules are determined by the specific logs, for example the following categories:
S205: Logs are standardized. Based on the structure of the actual logs, a unified model is used for storage so that the various pieces of information contained in the log text can be processed, such as log level, timestamp, calling class, package name, and other basic information. The model is defined as follows:
Because every accessed system prints logs according to different rules, the parsing rules for fields such as LogId are defined flexibly to suit each case: a field is populated if it can be parsed and left empty if it cannot.
S206: The log entities in the unified model are distributed, for example to the corresponding processing queue, to be handled by the downstream system, and the state of the queue is monitored;
If the queue is full, step S207 is performed: an alarm is sent;
S208: The access service is degraded: packets are dropped and service is refused, or downstream processing resources are rescheduled, to protect the monitoring system from being flooded;
S210: Log data processing ends;
If the queue is not full, step S209 is performed: the log data is distributed and enqueued for processing;
S211: Log data processing ends.
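The flow from S202 to S209 can be sketched as one function. This is a minimal illustration under stated assumptions, not the patent's implementation: the whitelist contents, the two drop patterns, the log line layout assumed by `LINE_RE`, the `LogId=` convention, and the `LogRecord` fields are all hypothetical, since the filter categories and the unified model are not reproduced in the text. A bounded in-process `queue.Queue` stands in for the processing queue.

```python
import queue
import re
from dataclasses import dataclass
from typing import Optional

WHITELIST = {"10.0.0.11"}  # hypothetical whitelisted source IPs (S202)
DROP_PATTERNS = [re.compile(r"\bDEBUG\b"), re.compile(r"mybatis", re.I)]  # S204
# Assumed layout: "<date> <time> <LEVEL> <class> - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<level>[A-Z]+) (?P<cls>\S+) - (?P<msg>.*)$"
)


@dataclass
class LogRecord:
    """Hypothetical unified model; fields that cannot be parsed stay empty."""
    timestamp: str
    level: str
    cls: str
    message: str
    log_id: Optional[str] = None


def preprocess(source_ip: str, line: str, out_queue: "queue.Queue") -> str:
    """Run one log line through whitelist, filter, standardize, distribute."""
    if source_ip not in WHITELIST:
        return "rejected"                      # S203: refuse directly
    if any(p.search(line) for p in DROP_PATTERNS):
        return "filtered"                      # S204: drop noisy logs
    m = LINE_RE.match(line)                    # S205: standardize
    if m is None:
        record = LogRecord("", "", "", line)
    else:
        id_match = re.search(r"LogId=(\w+)", m.group("msg"))
        record = LogRecord(
            m.group("ts"), m.group("level"), m.group("cls"), m.group("msg"),
            id_match.group(1) if id_match else None,
        )
    try:
        out_queue.put_nowait(record)           # S209: enqueue
    except queue.Full:
        return "degraded"                      # S207/S208: alarm and degrade
    return "queued"
```

Returning a status string keeps the sketch easy to test; a real access layer would raise the queue-full alarm and apply the degradation policy at the point marked `degraded`.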
Further, the distribution control of log data in step S206 specifically includes:
After a log is standardized, it is pushed to a log queue for subsequent analysis and processing;
The access layer obtains the server configuration of all cache servers (redis) from the monitoring configuration system and initializes it into nodes of the distributed application coordination service (zookeeper);
After a log is enqueued, the queue is promptly registered under the corresponding cache server (redis) node, and the data extractor (spout) is notified to extract the log stream from that queue; the data extractor (spout) of the monitoring system obtains the queue list;
Each accessed system is partitioned by host address (HOST): each HOST uses one queue, and multiple cache servers (redis) are used for sharding;
With ARGUS_ID as the identifying key, a consistent hashing algorithm determines which shard a log enters;
When a redis node goes down, the assignment is immediately readjusted dynamically using the hashing algorithm.
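Shard selection by consistent hashing can be sketched as follows. This is an illustration only: the text does not specify the hash function or the ring layout, so the use of MD5, the virtual-node count `replicas`, and the class and shard names are assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring mapping an ARGUS_ID to a shard name."""

    def __init__(self, nodes, replicas=100):
        self._replicas = replicas   # virtual nodes per shard (assumed value)
        self._keys = []             # sorted ring positions
        self._owner = {}            # ring position -> shard name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node: str) -> None:
        for i in range(self._replicas):
            pos = self._hash(f"{node}#{i}")
            if pos not in self._owner:
                self._owner[pos] = node
                bisect.insort(self._keys, pos)

    def remove(self, node: str) -> None:
        """Drop a shard, e.g. after its redis node goes down."""
        for i in range(self._replicas):
            pos = self._hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._keys.remove(pos)

    def node_for(self, argus_id: str) -> str:
        if not self._keys:
            raise LookupError("no shards available")
        idx = bisect.bisect(self._keys, self._hash(argus_id)) % len(self._keys)
        return self._owner[self._keys[idx]]
```

The property that matters for the failover case is that removing one shard only remaps the IDs that were on that shard; every other ARGUS_ID keeps its queue.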
Further, Fig. 3 is the work flow diagram of the monitoring module. As shown in Fig. 3, after the logs have been cleaned, processed, and standardized by the access layer, the preprocessed business system data is obtained and reprocessed, including step S301:
Multiple data extractors (spouts) obtain all queues of all shards distributed across the cache servers;
On initialization, a data extractor (spout) obtains the shard list and the queue list under each shard from the distributed application coordination service (zookeeper), caches them in local memory (cache), and registers a listener on the node in the distributed application coordination service (zookeeper);
If the node does not change, each subsequent extraction obtains the queue list from the local cache (cache), extracts the logs, and emits them;
When the registered node changes, for example because of access layer registration or removal or a maintenance action triggered from the monitoring configuration module, the data extractor actively obtains the latest queue list and extracts the log stream from it.
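The caching behaviour of the data extractor can be sketched without a real coordination service. In this illustration the class name, the `fetch_queue_lists` callable (a stand-in for reading the shard and queue lists from zookeeper), and the `on_node_changed` callback (a stand-in for the registered listener firing) are all hypothetical.

```python
class QueueListCache:
    """Local cache of shard -> queue names, refreshed only on change."""

    def __init__(self, fetch_queue_lists):
        # fetch_queue_lists: hypothetical callable that reads the current
        # shard/queue lists from the coordination service.
        self._fetch = fetch_queue_lists
        self._cached = None

    def on_node_changed(self) -> None:
        """Listener callback: invalidate so the next read refetches."""
        self._cached = None

    def queues(self) -> dict:
        """Return the cached lists, fetching only when the cache is empty."""
        if self._cached is None:
            self._cached = self._fetch()
        return self._cached
```

Between change notifications every extraction round reads the local copy, so the coordination service is contacted once at start-up and once per change.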
Further, the reprocessing extracts, from the preprocessed business system data, the data that comes from a specific IP address and/or a specific host and/or a specific system, and specifically includes step S302:
The monitoring configuration is loaded: before the log stream is processed, the target monitored items are configured and runtime rules are generated for the system's analysis;
Monitored items are filtered: the identification and filtering of monitored objects are managed centrally through the monitoring configuration information. A target system is an object monitored by this system; the relevant information collection service runs on it, and it is uniquely identified by the IP address or domain name of the machine hosting the service. Whenever a new monitored system is connected, it is registered in this system and a unique identifier is generated; this identifier must be consistent with the one used by the information collection service.
Further, the monitored item analysis determines, according to the keywords preset for the monitored item of the corresponding monitored service, whether a reprocessed business system log entry is an anomaly unit. An anomaly unit must still pass the double filtering of the monitoring rules and the alarm rules before it is decided whether to generate alarm information. This specifically includes:
Step S303: The target logs to alarm on are identified by defining monitored items according to preset keywords. For example, the keyword "loan disbursement failed" appearing in a log is one monitored item;
If the keyword "loan disbursement failed" appears, step S304 is performed: the business system is considered to have had one business anomaly, which is defined as an anomaly unit. Based on information in the log packet such as ARGUS_ID and HOST, the corresponding configured monitored items are obtained and checked by keyword lookup or regular-expression matching; an anomaly unit is generated, its message body is assembled, and it is emitted;
The message format of the anomaly unit is as follows:
If there is no business anomaly, step S305 is performed and processing ends.
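Steps S303 to S305 can be sketched as a lookup followed by regular-expression matching. This is an illustration under stated assumptions: the configuration table, the English keyword pattern, and the dictionary returned for each anomaly unit are hypothetical, because the actual message format of the anomaly unit is not reproduced in the text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoredItem:
    name: str      # monitored item identifier
    service: str   # the single monitored service this item belongs to
    pattern: str   # preset keyword, written as a regular expression


# Hypothetical configuration: (ARGUS_ID, HOST) -> configured monitored items.
ITEMS_BY_SOURCE = {
    ("argus-001", "10.0.0.11"): [
        MonitoredItem("loan_failed", "loan-service", r"loan disbursement failed"),
    ],
}


def find_anomaly_units(argus_id: str, host: str, message: str) -> list:
    """Return one anomaly unit per monitored item whose keyword matches."""
    units = []
    for item in ITEMS_BY_SOURCE.get((argus_id, host), []):
        if re.search(item.pattern, message):
            units.append({
                "argus_id": argus_id,
                "host": host,
                "service": item.service,
                "item": item.name,
                "message": message,
            })
    return units  # an empty list corresponds to S305: no business anomaly
```

Keying the configuration by source keeps the per-log work proportional to the items configured for that one system rather than to all monitored items.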
Further, when it is configured, each monitored item must correspond to a monitored service. The relationship between the two entities is shown in Fig. 4: each monitored item belongs to exactly one monitored service, and each monitored service can have multiple monitored items.
Further, step S306 is performed: the corresponding monitoring anomaly merge rule is selected to determine the number of monitoring anomalies generated. The monitoring rules include system-level anomaly merging and/or application-level anomaly merging, wherein system-level anomaly merging defines a count threshold for monitoring anomaly alarms, and application-level anomaly merging lets the user customize the priority with which monitoring anomalies are merged, in order to generate the monitoring anomaly alarm. This specifically includes:
The system-level anomaly merge rule is applied automatically to all accessed systems. It sets, for the whole monitoring system, an upper threshold on the number of monitoring anomalies to alarm on. For example, suppose a monitored item of an application system raises more than 1,000 alarms within 10 seconds. In this case the details of each individual alarm hardly matter; what matters is the fact that a large number of alarms occurred. It is therefore more reasonable to merge them before sending: the downstream alarm system, which would otherwise have to handle 1,000 alarm messages, now needs to send only one, which greatly reduces resource usage;
The application-level anomaly merge rule is a custom merge rule that an application provides when configuring its monitoring strategy; its threshold must be lower than that of the system-level anomaly merge strategy;
In both cases anomalies are merged per host address (HOST), and the generated monitoring anomaly objects are placed on the message queue (MQ).
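The merge rule can be sketched as a count per host, monitored item, and time window. This is a minimal sketch assuming fixed (tumbling) windows and anomaly units represented as dictionaries with `host`, `item`, and `ts` (seconds) keys; the function name and the shape of the output are hypothetical. The defaults mirror the example in the text: more than 1,000 anomalies in 10 seconds collapse into one alarm.

```python
from collections import defaultdict


def merge_anomalies(units, window_seconds=10, threshold=1000):
    """Collapse bursts of anomaly units into a single merged alarm.

    Units are grouped by (host, item, window). A group larger than the
    threshold yields one merged alarm carrying the count; smaller groups
    pass through as individual alarms.
    """
    buckets = defaultdict(list)
    for unit in units:
        window = int(unit["ts"] // window_seconds)
        buckets[(unit["host"], unit["item"], window)].append(unit)

    alarms = []
    for (host, item, window), group in sorted(buckets.items()):
        if len(group) > threshold:
            alarms.append({
                "host": host,
                "item": item,
                "window_start": window * window_seconds,
                "count": len(group),
                "merged": True,
            })
        else:
            for _ in group:
                alarms.append({
                    "host": host,
                    "item": item,
                    "window_start": window * window_seconds,
                    "count": 1,
                    "merged": False,
                })
    return alarms
```

An application-level rule could reuse the same function with a smaller `threshold`, consistent with the requirement that the application threshold be lower than the system-level one.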
Further, step S308 is performed: according to the selected monitoring strategy, the alarm generation module processes the anomalies and generates alarm information. The monitoring strategies include any one or more of a node monitoring strategy, a node threshold monitoring strategy, a node threshold ratio monitoring strategy, a fixed-point data threshold monitoring strategy, a standard anomaly monitoring strategy, and a standard APM monitoring strategy. Multiple monitoring strategies are configured and selected according to the needs of the relevant business scenario and system monitoring scenario, specifically including:
Node monitoring strategy: this strategy suits scenarios in which an alarm unit is generated as soon as the expected node keyword appears in a monitored log. For example, in system log monitoring, the keyword "transaction failed" appearing once counts as one alarm; combined forms such as "transaction failed and bank card limit exceeded" or "transaction failed and payment password incorrect" are also possible, and regular-expression matching is supported. This kind of monitoring strategy can be used for alarming and also for counting, for example counting the number of certain operations or requests. Combined with time window definitions, it can output frequency values, such as the page views (pv) of login requests or the queries per second (qps) of transaction requests;
Node B threshold monitors:The number that certain class monitoring strategies occurs in some time window compared with the threshold value of configuration, Reach some condition to alarm;Alarm for example, being less than 100w in the registration amount half an hour of certain channel user;Alternatively, certain user 5 Buying behavior is less than i.e. 10 scenes such as alarm within it;
Node B threshold ratio monitors:The number that certain class monitoring type occurs occupies the ratio occurred in all types and presets Threshold value compare, reach some condition and alarm;For example, user's registration, which can all produce, sends short message behavior, if at 10 minutes Interior, system generates 1000 times and sends short message request, and only 900 times send successfully, then it is 90% to send success rates, such as Fruit is alarmed less than 95%;Alternatively, interior when 12 is small, system coprocessing 1000 than transaction request, and only 900 than return into Work(, then it is 90% to pay success rate, and the scenes such as alarm are produced if less than 95%;
Fixed-point data threshold monitoring: specific data in a monitoring data source is monitored. For example, in log monitoring, the system returns data when it responds to a front-end request; suppose it returns data in JSON format, as in the following table:
For the JSON data in the table above, an alarm is generated if the value of minMobileAmt in the JSON object whose productDetailId is "e5a05b6d03ee45589f5704ba7210e3dd" is greater than 1000;
The system parses different types of data automatically. In configuration, the user therefore specifies an anchor value and states which key that anchor value belongs to, such as productDetailId. Using these two pieces of data, the system locates the data object, i.e. the JSON Object, and then reads the value of minMobileAmt in that Object. Whatever the data format, the application front end parses it according to some rule, and the data written to the log follows the same rule, so the monitoring system parses the data according to that rule as well. The common JSON and key-value representations are both supported;
Standard exception monitoring: this strategy is intended to monitor standard exceptions in the log source, so the user does not need to type in the exception and only needs to select it, for example NullPointerException, ClassCastException and so on. A list of exceptions can be maintained and divided by programming language;
Standard APM monitoring: this strategy is intended to monitor items such as the Java Virtual Machine (JVM), database runtime parameters and Redis instance status. For example, middleware can be configured to print GC logs periodically, and an alarm is raised when Full GC appears in the log; or JMX periodically prints the parameter state of the current JVM instance, certain parameters in the log are monitored, and their values are compared with thresholds to raise alarms; or, for Redis, the output of the info command can be printed to the log periodically and certain parameter values in the log monitored.
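As a hedged illustration of three of the strategies listed above (node, node threshold ratio and fixed-point data threshold), the following sketch may help. The function names, the log line contents and the shape of the JSON payload used to exercise it are hypothetical; only the keys productDetailId and minMobileAmt, the 95% ratio and the threshold 1000 come from the text.

```python
import json
import re


def count_node_hits(log_lines, pattern):
    """Node strategy: count log lines matching a keyword or regular expression."""
    regex = re.compile(pattern)
    return sum(1 for line in log_lines if regex.search(line))


def ratio_below(successes, total, min_ratio):
    """Node threshold ratio strategy: True when successes/total is below min_ratio."""
    if total == 0:
        return False  # assumption: with no traffic there is nothing to judge
    return successes / total < min_ratio


def find_object(data, anchor_key, anchor_value):
    """Depth-first search for the first JSON object whose anchor key has the anchor value."""
    if isinstance(data, dict):
        if data.get(anchor_key) == anchor_value:
            return data
        children = list(data.values())
    elif isinstance(data, list):
        children = data
    else:
        return None
    for child in children:
        found = find_object(child, anchor_key, anchor_value)
        if found is not None:
            return found
    return None


def fixed_point_exceeds(json_text, anchor_key, anchor_value, target_key, threshold):
    """Fixed-point data threshold strategy: compare one located value with a threshold."""
    obj = find_object(json.loads(json_text), anchor_key, anchor_value)
    if obj is None or target_key not in obj:
        return False
    return obj[target_key] > threshold  # assumption: the target value is numeric
```

For instance, counting 1000 request lines and 900 success lines and calling `ratio_below(900, 1000, 0.95)` reports that an alarm is due, which mirrors the SMS example in the text.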
Further, step S309 is performed to send the alarm information. The alarm information is sent through one or more notification media among email, SMS and WeChat.
Further, generated monitoring exceptions should not be turned directly into alarms and pushed to the relevant people without processing. In an extreme case, such as an outage of a server on which some business depends, monitoring exceptions can flood in. If they are not processed, the alarm system comes under heavy pressure, which in turn drives up the load on the SMS and mail systems and causes an avalanche effect. Different alarm strategies therefore need to be selected according to the needs of the application. For example, for the "transaction failed" monitoring exception, it can be defined that an SMS is sent only after 3 occurrences within 10 s. It can also be defined that at most 2 alarms are sent per minute and at most 100 per day, and that after every 2 alarms a further reminder is sent after 900 s reporting the total number of alarms that occurred in that period. Although alarm messages are merged, every alarm is still recorded in the data store: if 300 alarms occur within 900 s, only one SMS is sent, but the backend still records 300 alarm records. This is step S310.
Based on the definition of alarm strategies and the problems they are meant to solve, several basic alarm strategies are provided:
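The rate limits described above can be illustrated with the following minimal sketch. The limits of 2 per minute and 100 per day are taken from the example in the text; the class `AlarmThrottle`, its in-memory `records` list standing in for the data store, and the `summary` method are assumptions made for illustration only.

```python
class AlarmThrottle:
    """Limit notifications per minute and per day while recording every alarm."""

    def __init__(self, per_minute=2, per_day=100):
        self.per_minute = per_minute
        self.per_day = per_day
        self.sent_at = []    # timestamps (seconds) of notifications actually sent
        self.records = []    # every alarm, sent or not (stand-in for the data store)
        self.suppressed = 0  # alarms withheld since the last summary

    def on_alarm(self, ts, message):
        """Record the alarm; return True if a notification should be sent now."""
        self.records.append((ts, message))
        last_minute = sum(1 for t in self.sent_at if ts - t < 60)
        last_day = sum(1 for t in self.sent_at if ts - t < 86400)
        if last_minute >= self.per_minute or last_day >= self.per_day:
            self.suppressed += 1
            return False
        self.sent_at.append(ts)
        return True

    def summary(self):
        """Return and reset the number of withheld alarms, for a follow-up reminder."""
        count, self.suppressed = self.suppressed, 0
        return count
```

A caller would invoke `summary()` when the reminder interval (900 s in the example) elapses and include the returned count in the reminder message; every alarm remains available in `records` regardless of whether a notification was sent.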
Further, the embodiment of the present invention also supports alarm shielding, which turns off the sending of alarms that are expected. The shielding strategies are as follows:
Time dimension: a continuous period, for example no alarms are sent for 1 continuous hour; a fixed period, for example no alarms from 0:00 to 6:00 every day; no alarms on holidays; no alarms on weekends;
Shielding granularity: for example, a certain monitored item produces no alarms, or a specific application service produces no alarms. Shielding everything is not supported; the coarsest granularity is the application level;
Alarm object: specific alarm recipients are shielded.
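The three shielding dimensions above might be combined as in the following sketch. All names (`ShieldPolicy`, `is_shielded` and the field names) are hypothetical, and the sketch assumes that the daily quiet period does not cross midnight. In keeping with the text, there is deliberately no option that shields everything.

```python
from dataclasses import dataclass, field
from datetime import date, datetime, time
from typing import Optional, Set, Tuple


@dataclass
class ShieldPolicy:
    mute_until: Optional[datetime] = None            # continuous period, e.g. now + 1 hour
    quiet_hours: Optional[Tuple[time, time]] = None  # fixed daily period, e.g. 0:00 to 6:00
    mute_weekends: bool = False
    holidays: Set[date] = field(default_factory=set)
    muted_items: Set[Tuple[str, str]] = field(default_factory=set)  # (application, item)
    muted_apps: Set[str] = field(default_factory=set)  # coarsest granularity: application
    muted_recipients: Set[str] = field(default_factory=set)


def is_shielded(policy, when, app, item, recipient):
    """Return True if an alarm at time `when` should not be sent to `recipient`."""
    # Time dimension
    if policy.mute_until is not None and when < policy.mute_until:
        return True
    if policy.quiet_hours is not None:
        start, end = policy.quiet_hours  # assumption: start < end within one day
        if start <= when.time() < end:
            return True
    if policy.mute_weekends and when.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    if when.date() in policy.holidays:
        return True
    # Shielding granularity
    if app in policy.muted_apps or (app, item) in policy.muted_items:
        return True
    # Alarm object
    return recipient in policy.muted_recipients
```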
In addition, an embodiment of the invention also provides a monitoring system. As shown in Fig. 1, the monitoring system of the embodiment consists of four modules, namely data acquisition and preprocessing module 101, monitoring module 102, alarm information sending module 103 and monitoring configuration module 104. Its overall architecture is as follows:
Data acquisition and preprocessing module 101, for collecting and preprocessing business system data, wherein the business system data includes business system log data 1011;
Monitoring module 102, including: reprocessing unit 1021, for obtaining the preprocessed business system data and reprocessing it; judging unit 1022, for performing monitored-item analysis on the reprocessed business system data and identifying anomaly units; alarm information generation unit 1023, for generating monitoring exception alarms according to monitoring rules;
Alarm information sending module 103, for sending alarm information using the corresponding monitoring strategy;
Monitoring configuration module 104, for configuring, through a visual interface, any one or more of monitored items, monitoring strategies, predetermined alarm modes, on-duty personnel, system configuration and alarm information queries.
Further, the business system data also includes application performance monitoring (APM) data 1012.
Further, Fig. 5 is a schematic diagram of the internal structure of the data acquisition and preprocessing module 101 of an embodiment of the invention. Because online logs are analyzed in real time, the data volume can be very large; feeding the data directly into the monitoring system would put the system under heavy load. The access layer 1013 (buffer) therefore buffers the data before it enters the monitoring system and also allows certain data-drop policies, such as discarding logs produced by packages other than com.creditease.service*. Preprocessing the business system data includes filtering and standardizing the business system data, specifically:
Preliminary filtering and cleaning of logs, such as filtering out meaningless text; rule-based configuration of the filtering is supported;
Standardization of logs according to the log entity model predefined by the system.
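A minimal sketch of the filtering and standardization steps is given below. The package prefix `com.creditease.service` comes from the example in the text; the raw log line layout, the regular expression and the `LogEntity` fields are assumptions, since the embodiment does not disclose the predefined log entity model.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical raw layout: "<date> <time> <LEVEL> <logger> - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<level>[A-Z]+) (?P<logger>\S+) - (?P<message>.*)$"
)


@dataclass
class LogEntity:
    """Stand-in for the system's predefined log entity model."""
    host: str
    timestamp: str
    level: str
    logger: str
    message: str


def preprocess(raw_line: str, host: str,
               allowed_prefix: str = "com.creditease.service") -> Optional[LogEntity]:
    """Filter one raw log line and normalize it; return None if it is dropped."""
    line = raw_line.strip()
    if not line:
        return None                      # meaningless text: blank line
    match = LINE_RE.match(line)
    if match is None:
        return None                      # does not fit the expected layout
    if not match["logger"].startswith(allowed_prefix):
        return None                      # drop policy: log from another package
    return LogEntity(host=host,
                     timestamp=match["ts"],
                     level=match["level"],
                     logger=match["logger"],
                     message=match["message"].strip())
```

In a real deployment the filter rules would presumably be loaded from configuration rather than hard-coded, in line with the rule-based filtering mentioned above.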
Further, preprocessing the business system data also includes distribution control of the business system data, specifically:
Distributing the log entities of the unified model, for example to the corresponding processing queues, for processing by downstream systems;
Monitoring the state of the queues; if a queue is full, an alarm is sent and the access service is degraded;
Applying packet-drop policies and denial of service, or scheduling the resources of the downstream processing systems, to protect the monitoring system from being flooded.
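The distribution control described above can be sketched with bounded in-process queues. This is only an illustration under stated assumptions: the class `Distributor`, its counters and the alert text are hypothetical, and Python's standard `queue.Queue` stands in for whatever queue the embodiment actually uses.

```python
import queue


class Distributor:
    """Route log entities to per-target bounded queues and degrade when one is full."""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self.queues = {}       # target name -> bounded queue
        self.dropped = 0       # entities discarded by the drop policy
        self.alerts = []       # alarms raised about full queues
        self.degraded = set()  # targets currently in degraded mode

    def dispatch(self, target, entity):
        """Return True if the entity was queued, False if it was dropped."""
        q = self.queues.get(target)
        if q is None:
            q = self.queues[target] = queue.Queue(maxsize=self.capacity)
        try:
            q.put_nowait(entity)
        except queue.Full:
            self.dropped += 1
            if target not in self.degraded:  # alert once per episode, not per drop
                self.degraded.add(target)
                self.alerts.append("queue '%s' is full, degrading access" % target)
            return False
        self.degraded.discard(target)        # queue has room again
        return True
```

Alerting once per full-queue episode rather than once per dropped entity is a design choice of this sketch, made so that the protection mechanism does not itself produce an alarm flood.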
Further, Fig. 6 is a structural diagram of the monitoring module of an embodiment of the invention. The core of the monitoring system of the embodiment is monitoring module 102. According to the preset monitored items, it analyzes the collected logs in real time; for the monitored items that are triggered, it applies the monitoring strategies to decide whether to generate an alarm, providing an extensible and flexibly configurable service. Generated alarm information is persisted on the one hand and pushed to the alarm system on the other, which sends it to the predetermined on-duty personnel.
Further, the reprocessing filters out, from the preprocessed business system data, the data coming from a specific IP address and/or a specific host and/or a specific system, specifically:
The target system is the object monitored by this system, and the information collection service runs on the target system; the target is uniquely identified by the service together with the IP address or domain name of the machine on which it runs;
Whenever a new monitored system is connected, it is registered in this system and a unique identifier is generated; this identifier must be consistent with the one used by the information collection service.
Further, the reprocessing also includes standardizing and validating the preprocessed business system data.
Further, the monitored-item analysis judges, according to the keywords preset for the monitored items of the corresponding monitored service, whether a reprocessed business system log is an anomaly unit, specifically:
Monitored items are defined by preset keywords and identify the target logs to be alarmed. For example, if the keyword "loan disbursement failed" appears in a log, the business system is considered to have had one business exception, which is defined as an anomaly unit; "loan disbursement failed" is then a monitored item;
Each monitored item belongs to exactly one monitored service, and each monitored service can have multiple monitored items.
Monitored items have the following basic attributes:
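The relationship between monitored services, monitored items and anomaly units described above may be sketched as follows. The names `MonitoredItem` and `match_items` and the field layout are hypothetical, since the attribute table of the monitored item is not reproduced here; the keywords are the examples from the text.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoredItem:
    name: str      # display name of the monitored item
    service: str   # the single monitored service this item belongs to
    pattern: str   # preset keyword, or a regular expression


def match_items(items, service, log_message):
    """Return the names of the service's monitored items hit by this log message.

    Each hit corresponds to one anomaly unit in the terminology of the text.
    """
    return [item.name for item in items
            if item.service == service and re.search(item.pattern, log_message)]
```

For example, an item with the pattern `loan disbursement failed` registered for a hypothetical service `loan` is hit by the log message `order 42: loan disbursement failed`, and the same message checked against another service hits nothing.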
Further, the monitoring rules include system-level exception merging and/or application-level exception merging. System-level exception merging defines a count threshold for monitoring exception alarms; application-level exception merging lets the user customize the priority of monitoring exception merging, in order to generate monitoring exception alarms.
Further, the monitoring strategies are any one or more of a node monitoring strategy, a node threshold monitoring strategy, a node threshold ratio monitoring strategy, a fixed-point data threshold monitoring strategy, a standard exception monitoring strategy and a standard APM monitoring strategy.
Further, an anomaly unit that directly hits a monitored item does not directly generate a monitoring exception; the monitoring exception is generated through a strategy. For example, the "transaction failed" monitored item triggers an anomaly unit, and a monitoring exception is generated only if the anomaly unit occurs three times within 10 s. This allows business exceptions that occur with low probability to be defined with more flexible, fine-grained strategies;
Further, a monitoring exception is generated only when the anomaly units triggered by a monitored item satisfy the predetermined monitoring strategy. The logical relationship between monitored items, anomaly units, monitoring strategies and monitoring exceptions is shown in Fig. 7, which is the overall monitoring and alarm model:
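The "three times within 10 s" example can be illustrated with a sliding-window counter. The class name `WindowCountStrategy` is hypothetical, and two details are assumptions of this sketch rather than statements of the embodiment: the window boundary is treated as inclusive, and the counter is reset after a monitoring exception is generated.

```python
from collections import deque


class WindowCountStrategy:
    """Generate a monitoring exception when `count` anomaly units fall within `window` seconds."""

    def __init__(self, count=3, window=10.0):
        self.count = count
        self.window = window
        self.events = deque()  # timestamps of recent anomaly units, oldest first

    def on_anomaly_unit(self, ts):
        """Register one anomaly unit at time ts; return True if an exception is generated."""
        self.events.append(ts)
        # Discard anomaly units that are older than the window.
        while self.events and ts - self.events[0] > self.window:
            self.events.popleft()
        if len(self.events) >= self.count:
            self.events.clear()  # assumption: start counting afresh after firing
            return True
        return False
```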
Further, alarm information sending module 103 sends alarm information using the corresponding monitoring strategy, specifically:
Monitoring module 102 and alarm information sending module 103 are decoupled by a distributed double-ended queue, and a distributed lock is used for synchronization;
Each monitoring strategy of monitoring module 102 corresponds to one queue;
The alarm information generation unit 1023 of the monitoring module places alarm information on the queue, and the alarm information sending module maintains a thread pool;
Alarm information sending module 103 obtains the queue list in real time and, whenever a new queue is produced, dynamically starts a processing thread to handle its alarm information;
If a queue is idle for a long time, its thread is terminated to save resources;
The alarm information generated by the alarm information generation unit 1023 of monitoring module 102 is received and, using the predetermined monitoring strategy, alarms are sent to the on-duty personnel; multiple alarm modes are supported;
Alarm information sending module 103 is provided with a retry mechanism.
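The queue-per-strategy arrangement, dynamic worker threads, idle termination and retry described above can be sketched in a single process as follows. This is not the embodiment's implementation: in-process `queue.Queue` objects and a `threading.Lock` stand in for the distributed double-ended queue and distributed lock, and the class `AlarmSender`, its parameters and the `send` callback are hypothetical.

```python
import queue
import threading


class AlarmSender:
    """One queue and one worker thread per monitoring strategy, with retry and idle exit."""

    def __init__(self, send, max_retries=3, idle_timeout=1.0):
        self.send = send                  # callable(strategy, alarm); may raise on failure
        self.max_retries = max_retries
        self.idle_timeout = idle_timeout  # seconds a worker waits before terminating
        self.queues = {}                  # strategy -> queue of pending alarms
        self.workers = {}                 # strategy -> live worker thread
        self.failed = []                  # alarms that exhausted all retries
        self.lock = threading.Lock()

    def submit(self, strategy, alarm):
        with self.lock:
            q = self.queues.get(strategy)
            if q is None:
                q = self.queues[strategy] = queue.Queue()
            q.put(alarm)
            if strategy not in self.workers:  # new or idle queue: start a worker
                worker = threading.Thread(target=self._run, args=(strategy, q), daemon=True)
                self.workers[strategy] = worker
                worker.start()

    def _run(self, strategy, q):
        while True:
            try:
                alarm = q.get(timeout=self.idle_timeout)
            except queue.Empty:
                with self.lock:
                    if q.empty():             # still idle: terminate to save resources
                        del self.workers[strategy]
                        return
                continue
            try:
                self._send_with_retry(strategy, alarm)
            finally:
                q.task_done()

    def _send_with_retry(self, strategy, alarm):
        for _ in range(self.max_retries):
            try:
                self.send(strategy, alarm)
                return
            except Exception:
                continue                      # transient failure: try again
        self.failed.append((strategy, alarm))

    def join(self):
        """Block until every submitted alarm has been processed."""
        with self.lock:
            queues = list(self.queues.values())
        for q in queues:
            q.join()
```

Deciding under the lock whether a worker may exit, and enqueueing under the same lock, is intended to avoid the case where an alarm is submitted just as the queue's worker terminates.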
Further, the alarm information can be sent through one or more of SMS 1031, email 1032 and WeChat 1033.
Further, Fig. 8 and Fig. 9 are schematic diagrams of the user visual interface of the monitoring configuration module of an embodiment of the invention. Monitoring configuration module 104 is used to configure, through the visual interface, any one or more of monitored items 1041, monitoring strategies 1042, alarm queries 1043, on-duty personnel 1044, alarm modes 1045 and system configuration such as statistical analysis 1046, specifically:
Monitoring configuration module 104 serves as the management and configuration system of the whole platform and also monitors certain parameters in order to track the running state of the whole platform;
Through this module, users configure and select the corresponding functions and parameters according to their own needs, including the following configuration and selection:
Configuration of monitored items 1041 and monitoring strategies 1042;
Alarm query 1043: an alarm web page for viewing alarm details, reached through a short URL redirect that is attached to alarm SMS messages and emails;
Configuration of on-duty personnel 1044, etc.;
Configuration of alarm modes 1045, including alarm configuration and alarm strategy configuration;
System parameter configuration, and monitoring of the running state of the queues and of Storm;
Maintenance of the queue list, so that new monitored applications can be loaded and connected at runtime.
Further, the monitoring configuration module is also used for application-level statistical analysis 1046, specifically:
QPS and TPS statistics of the system, and real-time PV and UV statistics;
Access path and response time statistics;
Monitoring of the access traces of specific users, request tracing, etc.;
Other business application scenarios involving log data.
Further, the monitoring system also includes storage module 105, which persists the generated alarm information for querying and statistical analysis. If alarms are merged, for example a monitored item generates 10 alarms and the alarm system merges the 10 alarms into one alarm SMS, 10 alarm records are still stored.
In addition, an embodiment of the invention also provides an electronic device 40. As shown in Fig. 10, the electronic device 40 of the embodiment can include, but is not limited to: one or more processors 401, memory 402, a bus 403 connecting the different components (including memory 402 and processor 401), and an I/O interface 405 for data transfer with external devices 404 (such as a keyboard, pointing device or display). The electronic device 40 of the embodiment can be a PC (personal computer), a tablet computer, a portable computer or another display device with a processor.
Further, processor 401 runs the computer program stored in memory 402 and thereby performs various functional applications and data processing, for example executing the instructions that implement each step of the above method embodiment. Specifically, processor 401 can execute the computer program stored in memory 402, and when the computer program is executed, the following instructions are run: collecting and preprocessing business system data, wherein the business system data includes business system logs;
Obtaining the preprocessed business system data and reprocessing it;
Performing monitored-item analysis on the reprocessed business system data and identifying anomaly units;
Generating monitoring exception alarms according to monitoring rules;
Sending alarm information using the corresponding monitoring strategy.
Further, memory 402 can include computer system readable media in the form of volatile memory, for example random access memory (RAM) 4021 and/or cache memory 4022. Monitoring device 40 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, ROM 4023 can be used to read and write non-removable, non-volatile magnetic media (not shown in Fig. 10, commonly called a "hard disk drive"). Although not shown in Fig. 10, a magnetic disk drive for reading and writing a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disk drive for reading and writing a removable non-volatile optical disk (such as a CD-ROM, DVD-ROM or other optical media) can be provided. In these cases, each drive can be connected to bus 403 through one or more data media interfaces.
Further, system memory 402 can include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the invention. A program/utility 4025 having a set of (at least one) program modules 4024 can be stored in, for example, system memory 402. Such program modules 4024 include, but are not limited to, an operating system, one or more application programs, other program modules and program data, and each or some combination of these examples may include an implementation of a network environment. Program modules 4024 generally perform the functions and/or methods of the embodiments described in the invention.
As shown in Fig. 10, network adapter 406 communicates with the other modules of monitoring device 40, such as processor 401, through bus 403.
In addition, an embodiment of the invention also provides a computer-readable storage medium on which a computer program is stored. When the program is executed by a processor, the steps described in the above method embodiment can be implemented, for example:
Collecting and preprocessing business system data, wherein the business system data includes business system logs;
Obtaining the preprocessed business system data and reprocessing it;
Performing monitored-item analysis on the reprocessed business system data and identifying anomaly units;
Generating monitoring exception alarms according to monitoring rules;
Sending alarm information using the corresponding monitoring strategy.
Further, the computer-readable storage medium can be ROM, random access memory (RAM), CD-ROM, magnetic tape, a floppy disk, an optical data storage device, etc.
In addition, although the operations of the method of the invention are described in the drawings in a particular order, this does not require or imply that the operations must be performed in that order, or that all of the illustrated operations must be performed to achieve the desired result. Additionally or alternatively, some steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Although the spirit and principles of the invention have been described with reference to the above embodiments, it should be understood that the invention is not limited to the disclosed embodiments, and the division into aspects does not mean that the features in those aspects cannot be combined to advantage; the division is made only for convenience of presentation. The invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (22)

  1. A monitoring method for a business system, for monitoring multiple business systems simultaneously, characterized by comprising:
    Collecting and preprocessing business system data, wherein the business system data includes business system logs;
    Obtaining the preprocessed business system data and reprocessing it;
    Performing monitored-item analysis on the reprocessed business system data and identifying anomaly units;
    Generating monitoring exception alarms according to monitoring rules;
    Sending alarm information using the corresponding monitoring strategy.
  2. The monitoring method according to claim 1, characterized in that the business system data further includes application performance monitoring data.
  3. The monitoring method according to claim 1, characterized in that preprocessing the business system data includes filtering and standardizing the business system data.
  4. The monitoring method according to claim 3, characterized in that preprocessing the business system data further includes distribution control of the business system data.
  5. The monitoring method according to any one of claims 1-4, characterized in that the reprocessing filters out, from the preprocessed business system data, the data coming from a specific IP address and/or a specific host and/or a specific system.
  6. The monitoring method according to claim 5, characterized in that the reprocessing further includes standardizing and validating the preprocessed business system data.
  7. The monitoring method according to any one of claims 1-4, wherein the monitored-item analysis includes judging, according to the keywords preset for the monitored items of the corresponding monitored service, whether a reprocessed business system log is the anomaly unit.
  8. The monitoring method according to any one of claims 1-4, wherein the monitoring rules include system-level exception merging and/or application-level exception merging, the system-level exception merging defines a count threshold for monitoring exception alarms, and the application-level exception merging lets the user customize the priority of monitoring exception merging.
  9. The monitoring method according to any one of claims 1-4, characterized in that the monitoring strategies include any one or more of a node monitoring strategy, a node threshold monitoring strategy, a node threshold ratio monitoring strategy, a fixed-point data threshold monitoring strategy, a standard exception monitoring strategy and a standard application performance monitoring data monitoring strategy.
  10. The monitoring method according to claim 1, characterized in that the alarm information is sent through one or more media among email, SMS and WeChat.
  11. A monitoring system, for monitoring multiple business systems simultaneously, characterized by comprising:
    A data acquisition and preprocessing module, for collecting and preprocessing business system data, wherein the business system data includes business system logs;
    A monitoring module, including: a reprocessing unit, for obtaining the preprocessed business system data and reprocessing it; a judging unit, for performing monitored-item analysis on the reprocessed business system data and identifying anomaly units; an alarm information generation unit, for generating monitoring exception alarms according to monitoring rules;
    An alarm information sending module, for sending alarm information using the corresponding monitoring strategy;
    A monitoring configuration module, for configuring, through a visual interface, any one or more of monitored items, monitoring strategies, predetermined alarm modes, on-duty personnel, system configuration and alarm information queries.
  12. The monitoring system according to claim 11, characterized in that the business system data further includes application performance monitoring data.
  13. The monitoring system according to claim 11, characterized in that preprocessing the business system data includes filtering and standardizing the business system data.
  14. The monitoring system according to claim 13, characterized in that preprocessing the business system data further includes distribution control of the business system data.
  15. The monitoring system according to any one of claims 11-14, characterized in that the reprocessing filters out, from the preprocessed business system data, the data coming from a specific IP address and/or a specific host and/or a specific system.
  16. The monitoring system according to claim 15, characterized in that the reprocessing further includes standardizing and validating the preprocessed business system data.
  17. The monitoring system according to any one of claims 11-14, characterized in that the monitored-item analysis includes judging, according to the keywords preset for the monitored items of the corresponding monitored service, whether a reprocessed business system log is the anomaly unit.
  18. The monitoring system according to any one of claims 11-14, characterized in that the monitoring rules include system-level exception merging and/or application-level exception merging, wherein the system-level exception merging defines a count threshold for monitoring exception alarms, and the application-level exception merging lets the user customize the priority of monitoring exception merging.
  19. The monitoring system according to any one of claims 11-14, characterized in that the monitoring strategies include any one or more of a node monitoring strategy, a node threshold monitoring strategy, a node threshold ratio monitoring strategy, a fixed-point data threshold monitoring strategy, a standard exception monitoring strategy and a standard application performance monitoring data monitoring strategy.
  20. The monitoring system according to claim 11, characterized in that the alarm information is sent through one or more media among email, SMS and WeChat.
  21. An electronic device, comprising:
    A memory, for storing a computer program;
    A processor, for executing the computer program stored in the memory, wherein when the computer program is executed, the steps of the monitoring method according to any one of claims 1-10 are performed.
  22. A computer-readable storage medium on which a computer program is stored, wherein when the computer program is executed by a processor, the steps of the monitoring method according to any one of claims 1-10 are performed.
CN201711404987.3A 2017-12-22 2017-12-22 Monitoring method and monitoring system of service system Active CN107992398B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711404987.3A CN107992398B (en) 2017-12-22 2017-12-22 Monitoring method and monitoring system of service system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711404987.3A CN107992398B (en) 2017-12-22 2017-12-22 Monitoring method and monitoring system of service system

Publications (2)

Publication Number Publication Date
CN107992398A true CN107992398A (en) 2018-05-04
CN107992398B CN107992398B (en) 2021-04-27

Family

ID=62041439

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711404987.3A Active CN107992398B (en) 2017-12-22 2017-12-22 Monitoring method and monitoring system of service system

Country Status (1)

Country Link
CN (1) CN107992398B (en)

Cited By (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108845910A (en) * 2018-05-31 2018-11-20 康键信息技术(深圳)有限公司 Monitoring method, device and the storage medium of extensive micro services system
CN108959054A (en) * 2018-07-16 2018-12-07 郑州云海信息技术有限公司 A kind of log processing method and device based on Logstash
CN108984376A (en) * 2018-05-31 2018-12-11 阿里巴巴集团控股有限公司 A kind of system anomaly detection method, device and equipment
CN109413063A (en) * 2018-10-23 2019-03-01 中国平安人寿保险股份有限公司 A kind of white list updating method based on big data, device and electronic equipment
CN109491873A (en) * 2018-11-05 2019-03-19 网易无尾熊(杭州)科技有限公司 It caches monitoring method, medium, device and calculates equipment
CN109582533A (en) * 2018-10-31 2019-04-05 深圳壹账通智能科技有限公司 Data analysing method, device, electronic equipment and storage medium
CN109739724A (en) * 2018-12-29 2019-05-10 携程旅游网络技术(上海)有限公司 Data monitoring method, system, electronic equipment and storage medium
CN109766323A (en) * 2018-12-17 2019-05-17 深圳壹账通智能科技有限公司 The full link monitoring method, apparatus of loan transaction, computer equipment and storage medium
CN109871380A (en) * 2019-01-14 2019-06-11 深圳市东信时代信息技术有限公司 A kind of crowd's packet application method and system based on Redis
CN109960635A (en) * 2019-04-18 2019-07-02 江苏满运软件科技有限公司 The monitoring of real-time computing platform and alarm method, system, equipment and storage medium
CN110072251A (en) * 2019-04-23 2019-07-30 贵阳朗玛通信科技有限公司 A kind of method and device of analysis user communication behavior and management user
CN110138634A (en) * 2019-04-03 2019-08-16 福建天泉教育科技有限公司 Key data monitoring method and terminal
CN110147405A (en) * 2019-04-15 2019-08-20 平安普惠企业管理有限公司 A kind of monitoring method based on pedestal O&M, device, equipment and storage medium
CN110335142A (en) * 2019-06-28 2019-10-15 苏宁消费金融有限公司 A kind of intelligence flow based on syndicated loan is transferred to civilian work the realization system of business
CN110598199A (en) * 2018-06-12 2019-12-20 百度在线网络技术(北京)有限公司 Data stream processing method and device, computer equipment and storage medium
CN110688173A (en) * 2019-09-29 2020-01-14 北京金山安全软件有限公司 A method, device and electronic device for positioning components in a cross-platform interface framework
WO2020015098A1 (en) * 2018-07-18 2020-01-23 平安科技(深圳)有限公司 Data management method, terminal device, and medium
CN110795315A (en) * 2018-08-01 2020-02-14 北京京东尚科信息技术有限公司 Method and apparatus for monitoring services
CN110808993A (en) * 2019-11-08 2020-02-18 京东数字科技控股有限公司 Data transmission control method, device, computer system and medium
CN110858170A (en) * 2018-08-23 2020-03-03 阿里巴巴集团控股有限公司 Sandbox component, data abnormity monitoring method, equipment and storage medium
CN110928716A (en) * 2019-10-24 2020-03-27 江苏苏宁物流有限公司 Scheduling task exception handling method and device
CN111274095A (en) * 2020-02-24 2020-06-12 深圳前海微众银行股份有限公司 Log data processing method, apparatus, device and computer-readable storage medium
CN111290927A (en) * 2018-12-06 2020-06-16 北京京东尚科信息技术有限公司 Data monitoring method and device
CN111324374A (en) * 2018-12-17 2020-06-23 顺丰科技有限公司 Application registration method and device based on application performance management system
WO2020147264A1 (en) * 2019-01-14 2020-07-23 平安科技(深圳)有限公司 Method, apparatus and device for monitoring multi-system log data, and readable storage medium
CN111447202A (en) * 2020-03-24 2020-07-24 江苏易安联网络技术有限公司 Visual arrangement system of security policy
CN111694715A (en) * 2020-04-14 2020-09-22 新华三大数据技术有限公司 Abnormity warning method, device, equipment and machine readable storage medium
CN111770002A (en) * 2020-06-12 2020-10-13 南京领行科技股份有限公司 Test data forwarding control method and device, readable storage medium and electronic equipment
CN111782481A (en) * 2020-07-15 2020-10-16 五色石(杭州)数据技术有限公司 Universal data interface monitoring system and monitoring method
CN111796984A (en) * 2020-06-29 2020-10-20 平安国际智慧城市科技股份有限公司 Data monitoring method and device, computer equipment and storage medium
CN111815449A (en) * 2020-07-13 2020-10-23 上证所信息网络有限公司 Flow calculation-based anomaly detection method and system for multi-host market quotation system
CN111858250A (en) * 2020-07-22 2020-10-30 平安证券股份有限公司 Monitoring problem checking method and device, computer equipment and storage medium
CN112039936A (en) * 2019-06-03 2020-12-04 杭州海康威视系统技术有限公司 Data transmission method, first data processing equipment and monitoring system
CN112187511A (en) * 2020-08-28 2021-01-05 苏州浪潮智能科技有限公司 Method, system, equipment and medium for positioning micro-service fusing abnormity
CN112256516A (en) * 2019-07-22 2021-01-22 广州酷旅旅行社有限公司 Data analysis processing method for hotel direct connection system
WO2021012481A1 (en) * 2019-07-23 2021-01-28 平安科技(深圳)有限公司 System performance monitoring method and apparatus, device, and storage medium
CN112333188A (en) * 2020-11-04 2021-02-05 成都中科大旗软件股份有限公司 Data service interface, log supervision method, system, storage medium and terminal
CN112492632A (en) * 2020-11-09 2021-03-12 厦门亿联网络技术股份有限公司 Anomaly monitoring method and system based on roaming system
CN112685256A (en) * 2020-12-30 2021-04-20 上海掌门科技有限公司 Server monitoring method, device and medium
CN112799903A (en) * 2019-11-14 2021-05-14 北京沃东天骏信息技术有限公司 Method and device for evaluating health state of business system
CN113064797A (en) * 2021-04-22 2021-07-02 平安国际智慧城市科技股份有限公司 Front-end monitoring method and device, electronic equipment and storage medium
CN113076333A (en) * 2021-04-25 2021-07-06 中国建设银行股份有限公司 Service abnormity monitoring method and device
CN113138898A (en) * 2021-04-26 2021-07-20 上海淇玥信息技术有限公司 Method and device for identifying and early warning business system abnormity and electronic equipment
CN113297039A (en) * 2020-07-24 2021-08-24 阿里巴巴集团控股有限公司 Data monitoring method and device
CN113485891A (en) * 2021-06-30 2021-10-08 中国民航信息网络股份有限公司 Service log monitoring method and device, storage medium and electronic equipment
CN113656243A (en) * 2021-07-28 2021-11-16 连连(杭州)信息技术有限公司 Service index monitoring method, device, equipment and storage medium
CN113760640A (en) * 2020-11-13 2021-12-07 北京沃东天骏信息技术有限公司 Monitoring log processing method, apparatus, device and storage medium
CN113760641A (en) * 2021-01-08 2021-12-07 北京沃东天骏信息技术有限公司 Service monitoring method, apparatus, computer system, and computer-readable storage medium
CN113837868A (en) * 2021-09-30 2021-12-24 重庆富民银行股份有限公司 Customer group layering system and method
CN113934599A (en) * 2021-10-20 2022-01-14 北京京东振世信息技术有限公司 Information processing method, device, electronic equipment and computer readable medium
CN114091698A (en) * 2021-11-15 2022-02-25 广东电网有限责任公司 Three-dimensional visualization method, system, equipment and computer medium for business data
CN114253795A (en) * 2021-11-12 2022-03-29 国家电网有限公司客户服务中心 Real-time service monitoring system suitable for mass data
CN114328387A (en) * 2021-12-29 2022-04-12 北京合思信息技术有限公司 Data management method and device
CN114615169A (en) * 2020-12-03 2022-06-10 腾讯科技(深圳)有限公司 Path monitoring method, device and computer readable storage medium
CN114637656A (en) * 2022-05-13 2022-06-17 飞狐信息技术(天津)有限公司 Redis-based monitoring method, device, storage medium and equipment
CN115774641A (en) * 2021-09-06 2023-03-10 中移(苏州)软件技术有限公司 Monitoring method, device, equipment and storage medium
CN115941443A (en) * 2022-12-06 2023-04-07 天翼电子商务有限公司 A message queue-based service exception alarm method and system
CN116755408A (en) * 2023-06-29 2023-09-15 江苏西格数据科技有限公司 An MES production monitoring and early warning method and system
CN116895046A (en) * 2023-07-21 2023-10-17 北京亿宇嘉隆科技有限公司 Abnormal operation and maintenance data processing method based on virtualization
TWI836263B (en) * 2021-09-02 2024-03-21 大陸商鼎捷軟件股份有限公司 Interface control system and method
CN118331827A (en) * 2024-05-14 2024-07-12 浪潮(山东)农业互联网有限公司 Log-based back-end service monitoring and alarming method, equipment and medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150317231A1 (en) * 2010-03-31 2015-11-05 Cloudera, Inc. Collecting and aggregating log data with fault tolerance
CN105183609A (en) * 2015-09-16 2015-12-23 焦点科技股份有限公司 Real-time monitoring system and method applied to software system
CN105847083A (en) * 2016-03-22 2016-08-10 乐视云计算有限公司 Business centralized monitoring method and system
CN106294066A (en) * 2016-08-01 2017-01-04 北京百度网讯科技有限公司 Alert data processing method and device
CN106407077A (en) * 2016-09-21 2017-02-15 广州华多网络科技有限公司 A real-time alarm method and system
CN106776780A (en) * 2016-11-23 2017-05-31 中国科学院软件研究所 Data exchange and sharing method and system in a cloud environment
CN106897614A (en) * 2017-02-27 2017-06-27 郑州云海信息技术有限公司 Active defense log alarm method on SSR

Cited By (87)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108845910A (en) * 2018-05-31 2018-11-20 康键信息技术(深圳)有限公司 Monitoring method, device and storage medium of large-scale micro-service system
CN108984376A (en) * 2018-05-31 2018-12-11 阿里巴巴集团控股有限公司 System anomaly detection method, device and equipment
CN108845910B (en) * 2018-05-31 2024-02-06 康键信息技术(深圳)有限公司 Monitoring method, device and storage medium of large-scale micro-service system
CN108984376B (en) * 2018-05-31 2021-11-19 创新先进技术有限公司 System anomaly detection method, device and equipment
CN110598199B (en) * 2018-06-12 2023-07-25 百度在线网络技术(北京)有限公司 Data stream processing method, device, computer equipment and storage medium
CN110598199A (en) * 2018-06-12 2019-12-20 百度在线网络技术(北京)有限公司 Data stream processing method and device, computer equipment and storage medium
CN108959054A (en) * 2018-07-16 2018-12-07 郑州云海信息技术有限公司 Log processing method and device based on Logstash
WO2020015098A1 (en) * 2018-07-18 2020-01-23 平安科技(深圳)有限公司 Data management method, terminal device, and medium
CN110795315A (en) * 2018-08-01 2020-02-14 北京京东尚科信息技术有限公司 Method and apparatus for monitoring services
CN110858170A (en) * 2018-08-23 2020-03-03 阿里巴巴集团控股有限公司 Sandbox component, data abnormity monitoring method, equipment and storage medium
CN110858170B (en) * 2018-08-23 2023-06-30 阿里巴巴集团控股有限公司 Data anomaly monitoring method and device
CN109413063B (en) * 2018-10-23 2022-01-18 中国平安人寿保险股份有限公司 White list updating method and device based on big data and electronic equipment
CN109413063A (en) * 2018-10-23 2019-03-01 中国平安人寿保险股份有限公司 White list updating method and device based on big data and electronic equipment
CN109582533A (en) * 2018-10-31 2019-04-05 深圳壹账通智能科技有限公司 Data analysing method, device, electronic equipment and storage medium
CN109491873A (en) * 2018-11-05 2019-03-19 网易无尾熊(杭州)科技有限公司 Cache monitoring method, medium, device and computing equipment
CN111290927A (en) * 2018-12-06 2020-06-16 北京京东尚科信息技术有限公司 Data monitoring method and device
CN111324374A (en) * 2018-12-17 2020-06-23 顺丰科技有限公司 Application registration method and device based on application performance management system
CN109766323A (en) * 2018-12-17 2019-05-17 深圳壹账通智能科技有限公司 Full-link monitoring method and apparatus for loan transactions, computer equipment and storage medium
CN111324374B (en) * 2018-12-17 2024-04-16 顺丰科技有限公司 Application registration method and device based on application performance management system
CN109739724A (en) * 2018-12-29 2019-05-10 携程旅游网络技术(上海)有限公司 Data monitoring method, system, electronic equipment and storage medium
CN109739724B (en) * 2018-12-29 2022-07-08 携程旅游网络技术(上海)有限公司 Data monitoring method, system, electronic device and storage medium
CN109871380A (en) * 2019-01-14 2019-06-11 深圳市东信时代信息技术有限公司 Crowd pack application method and system based on Redis
CN109871380B (en) * 2019-01-14 2022-11-11 深圳市东信时代信息技术有限公司 Crowd pack application method and system based on Redis
WO2020147264A1 (en) * 2019-01-14 2020-07-23 平安科技(深圳)有限公司 Method, apparatus and device for monitoring multi-system log data, and readable storage medium
CN110138634A (en) * 2019-04-03 2019-08-16 福建天泉教育科技有限公司 Key data monitoring method and terminal
CN110147405A (en) * 2019-04-15 2019-08-20 平安普惠企业管理有限公司 Monitoring method, device, equipment and storage medium based on pedestal O&M
CN109960635A (en) * 2019-04-18 2019-07-02 江苏满运软件科技有限公司 Monitoring and alarm method, system, equipment and storage medium for real-time computing platform
CN110072251A (en) * 2019-04-23 2019-07-30 贵阳朗玛通信科技有限公司 Method and device for analyzing user communication behavior and managing users
CN112039936B (en) * 2019-06-03 2023-07-14 杭州海康威视系统技术有限公司 Data transmission method, first data processing equipment and monitoring system
CN112039936A (en) * 2019-06-03 2020-12-04 杭州海康威视系统技术有限公司 Data transmission method, first data processing equipment and monitoring system
CN110335142B (en) * 2019-06-28 2022-04-08 苏宁消费金融有限公司 System for realizing intelligent transfer service based on joint loan
CN110335142A (en) * 2019-06-28 2019-10-15 苏宁消费金融有限公司 System for realizing intelligent transfer service based on joint loan
CN112256516A (en) * 2019-07-22 2021-01-22 广州酷旅旅行社有限公司 Data analysis processing method for hotel direct connection system
WO2021012481A1 (en) * 2019-07-23 2021-01-28 平安科技(深圳)有限公司 System performance monitoring method and apparatus, device, and storage medium
CN110688173B (en) * 2019-09-29 2023-01-20 北京金山安全软件有限公司 Positioning method and device of components in cross-platform interface framework and electronic equipment
CN110688173A (en) * 2019-09-29 2020-01-14 北京金山安全软件有限公司 A method, device and electronic device for positioning components in a cross-platform interface framework
CN110928716B (en) * 2019-10-24 2022-09-06 江苏苏宁物流有限公司 Scheduling task exception handling method and device
CN110928716A (en) * 2019-10-24 2020-03-27 江苏苏宁物流有限公司 Scheduling task exception handling method and device
CN110808993A (en) * 2019-11-08 2020-02-18 京东数字科技控股有限公司 Data transmission control method, device, computer system and medium
CN112799903A (en) * 2019-11-14 2021-05-14 北京沃东天骏信息技术有限公司 Method and device for evaluating health state of business system
CN111274095B (en) * 2020-02-24 2023-01-24 深圳前海微众银行股份有限公司 Log data processing method, device, equipment and computer readable storage medium
CN111274095A (en) * 2020-02-24 2020-06-12 深圳前海微众银行股份有限公司 Log data processing method, apparatus, device and computer-readable storage medium
CN111447202A (en) * 2020-03-24 2020-07-24 江苏易安联网络技术有限公司 Visual arrangement system of security policy
CN111447202B (en) * 2020-03-24 2021-03-30 江苏易安联网络技术有限公司 Visual arrangement system of security policy
CN111694715A (en) * 2020-04-14 2020-09-22 新华三大数据技术有限公司 Abnormity warning method, device, equipment and machine readable storage medium
CN111770002A (en) * 2020-06-12 2020-10-13 南京领行科技股份有限公司 Test data forwarding control method and device, readable storage medium and electronic equipment
CN111796984A (en) * 2020-06-29 2020-10-20 平安国际智慧城市科技股份有限公司 Data monitoring method and device, computer equipment and storage medium
CN111815449A (en) * 2020-07-13 2020-10-23 上证所信息网络有限公司 Flow calculation-based anomaly detection method and system for multi-host market quotation system
CN111815449B (en) * 2020-07-13 2023-12-19 上证所信息网络有限公司 Abnormality detection method and system of multi-host quotation system based on stream computing
CN111782481B (en) * 2020-07-15 2023-11-07 五色石(杭州)数据技术有限公司 Universal data interface monitoring system and monitoring method
CN111782481A (en) * 2020-07-15 2020-10-16 五色石(杭州)数据技术有限公司 Universal data interface monitoring system and monitoring method
CN111858250B (en) * 2020-07-22 2022-11-01 平安证券股份有限公司 Monitoring problem checking method and device, computer equipment and storage medium
CN111858250A (en) * 2020-07-22 2020-10-30 平安证券股份有限公司 Monitoring problem checking method and device, computer equipment and storage medium
CN113297039A (en) * 2020-07-24 2021-08-24 阿里巴巴集团控股有限公司 Data monitoring method and device
US12149411B2 (en) 2020-08-28 2024-11-19 Inspur Suzhou Intelligent Technology Co., Ltd. Method and system for locating micro-service fusing anomaly, device and medium
CN112187511A (en) * 2020-08-28 2021-01-05 苏州浪潮智能科技有限公司 Method, system, equipment and medium for positioning micro-service fusing abnormity
CN112333188B (en) * 2020-11-04 2022-11-18 成都中科大旗软件股份有限公司 Data service interface, log supervision method, system, storage medium and terminal
CN112333188A (en) * 2020-11-04 2021-02-05 成都中科大旗软件股份有限公司 Data service interface, log supervision method, system, storage medium and terminal
CN112492632A (en) * 2020-11-09 2021-03-12 厦门亿联网络技术股份有限公司 Anomaly monitoring method and system based on roaming system
CN112492632B (en) * 2020-11-09 2023-02-17 厦门亿联网络技术股份有限公司 Anomaly monitoring method and system based on roaming system
CN113760640A (en) * 2020-11-13 2021-12-07 北京沃东天骏信息技术有限公司 Monitoring log processing method, apparatus, device and storage medium
CN114615169A (en) * 2020-12-03 2022-06-10 腾讯科技(深圳)有限公司 Path monitoring method, device and computer readable storage medium
CN114615169B (en) * 2020-12-03 2023-10-20 腾讯科技(深圳)有限公司 Path monitoring method, path monitoring device and computer readable storage medium
CN112685256A (en) * 2020-12-30 2021-04-20 上海掌门科技有限公司 Server monitoring method, device and medium
CN113760641A (en) * 2021-01-08 2021-12-07 北京沃东天骏信息技术有限公司 Service monitoring method, apparatus, computer system, and computer-readable storage medium
CN113760641B (en) * 2021-01-08 2024-12-06 北京沃东天骏信息技术有限公司 Business monitoring method, device, computer system and computer readable storage medium
CN113064797A (en) * 2021-04-22 2021-07-02 平安国际智慧城市科技股份有限公司 Front-end monitoring method and device, electronic equipment and storage medium
CN113076333A (en) * 2021-04-25 2021-07-06 中国建设银行股份有限公司 Service abnormity monitoring method and device
CN113138898A (en) * 2021-04-26 2021-07-20 上海淇玥信息技术有限公司 Method and device for identifying and early warning business system abnormity and electronic equipment
CN113485891A (en) * 2021-06-30 2021-10-08 中国民航信息网络股份有限公司 Service log monitoring method and device, storage medium and electronic equipment
CN113656243B (en) * 2021-07-28 2024-05-31 连连(杭州)信息技术有限公司 Service index monitoring method, device, equipment and storage medium
CN113656243A (en) * 2021-07-28 2021-11-16 连连(杭州)信息技术有限公司 Service index monitoring method, device, equipment and storage medium
TWI836263B (en) * 2021-09-02 2024-03-21 大陸商鼎捷軟件股份有限公司 Interface control system and method
CN115774641A (en) * 2021-09-06 2023-03-10 中移(苏州)软件技术有限公司 Monitoring method, device, equipment and storage medium
CN113837868A (en) * 2021-09-30 2021-12-24 重庆富民银行股份有限公司 Customer group layering system and method
CN113934599A (en) * 2021-10-20 2022-01-14 北京京东振世信息技术有限公司 Information processing method, device, electronic equipment and computer readable medium
CN114253795A (en) * 2021-11-12 2022-03-29 国家电网有限公司客户服务中心 Real-time service monitoring system suitable for mass data
CN114091698A (en) * 2021-11-15 2022-02-25 广东电网有限责任公司 Three-dimensional visualization method, system, equipment and computer medium for business data
CN114328387A (en) * 2021-12-29 2022-04-12 北京合思信息技术有限公司 Data management method and device
CN114637656A (en) * 2022-05-13 2022-06-17 飞狐信息技术(天津)有限公司 Redis-based monitoring method, device, storage medium and equipment
CN114637656B (en) * 2022-05-13 2022-09-20 飞狐信息技术(天津)有限公司 Redis-based monitoring method and device, storage medium and equipment
CN115941443A (en) * 2022-12-06 2023-04-07 天翼电子商务有限公司 A message queue-based service exception alarm method and system
CN116755408A (en) * 2023-06-29 2023-09-15 江苏西格数据科技有限公司 An MES production monitoring and early warning method and system
CN116895046A (en) * 2023-07-21 2023-10-17 北京亿宇嘉隆科技有限公司 Abnormal operation and maintenance data processing method based on virtualization
CN116895046B (en) * 2023-07-21 2024-05-07 北京亿宇嘉隆科技有限公司 Abnormal operation and maintenance data processing method based on virtualization
CN118331827A (en) * 2024-05-14 2024-07-12 浪潮(山东)农业互联网有限公司 Log-based back-end service monitoring and alarming method, equipment and medium
CN118331827B (en) * 2024-05-14 2025-01-24 浪潮(山东)农业互联网有限公司 A log-based backend service monitoring and alarm method, device and medium

Also Published As

Publication number Publication date
CN107992398B (en) 2021-04-27

Similar Documents

Publication Publication Date Title
CN107992398A (en) The monitoring method and monitoring system of a kind of operation system
EP2411927B1 (en) Monitoring of distributed applications
US8892960B2 (en) System and method for determining causes of performance problems within middleware systems
US8589343B2 (en) Systems and methods for digital file change monitoring
US20120311562A1 (en) Extendable event processing
CN110175451A (en) Safety monitoring method and system based on electric power cloud
CN107273267A (en) Log analysis method based on elastic components
CN107229556A (en) Log Analysis System based on elastic components
US9600523B2 (en) Efficient data collection mechanism in middleware runtime environment
CN101321084A (en) Method and apparatus for generating configuration rules for computing entities within a computing environment using association rule mining
Tang et al. Optimizing system monitoring configurations for non-actionable alerts
CN112596975A (en) Method, system, equipment and storage medium for monitoring network equipment
CN111092786A (en) Network equipment safety authentication service reliability enhancing system
CN112350854B (en) Flow fault positioning method, device, equipment and storage medium
CN108259202A (en) CA monitoring and early-warning method and CA monitoring and early-warning system
CN111669295A (en) Service management method and device
WO2025103085A9 (en) Log processing method and system, log management platform, and electronic device
CN111526109B (en) Method and device for automatically detecting running state of web threat recognition defense system
CN112969172A (en) Communication flow control method based on cloud mobile phone
Liu et al. A framework for database auditing
CN113760669A (en) Alarm method and device for problem data, electronic device, and storage medium
CN120104569A (en) A data archiving processing method and system based on shutdown system
Meng et al. Driftinsight: detecting anomalous behaviors in large-scale cloud platform
CN117455698A (en) Self-adaptive operation and maintenance monitoring method, system and medium for financial system
CN108289035B (en) Method and system for visually displaying running states of network and business system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant