HK1071956B - Video tripwire
- Publication number: HK1071956B
- Application number: HK05105007.5A
- Authority: HK (Hong Kong)
Description
Technical Field
The present invention relates to a monitoring system. More particularly, the present invention relates to a video-based monitoring system that implements a virtual tripwire.
Background
In its original form, a tripwire is an arrangement in which a wire, string or the like is stretched across a road, triggering some response if someone or something happens to trip over the wire or otherwise pull it. For example, such a response may be to detonate a mine, raise an alarm, or record an event (e.g., trigger a counter, camera, etc.). Today, tripwires are often implemented as beams of light (e.g., laser, infrared, or visible light); when someone or something breaks the beam, a response is triggered.
An example of a conventional tripwire utilizing a light beam is schematically illustrated in FIG. 1. The light source generates a light beam that is transmitted across the roadway to the receiver. If the beam is interrupted, i.e., if the receiver no longer receives the beam, a response is triggered, as discussed above.
Conventional tripwires are advantageous in that they are at least conceptually simple to use, and they require minimal human intervention once installed.
Conventional tripwires have a number of drawbacks, however. For example, they cannot discriminate between triggering objects of interest and objects of no interest. As an example, one may be interested in how many people, rather than dogs, walk down a road; yet either a person or a dog would trigger the tripwire. It is also problematic that a group of people walking together triggers the tripwire only once, rather than once for each person.
Additionally, conventional tripwire arrangements typically involve the installation of dedicated equipment. For example, in the laser tripwire example, a laser source and a laser detector must be installed across the roadway of interest. Furthermore, such dedicated equipment may be difficult to install inconspicuously.
Furthermore, conventional tripwires do not afford a high degree of flexibility. A conventional tripwire typically detects only that someone or something has passed through it, without regard to the direction of crossing. Moreover, because conventional tripwires extend only in straight lines, the regions across which they may be set up are limited.
Conventional video surveillance systems are also in common use today. They are prevalent, for example, in stores, banks, and many other establishments. A video surveillance system typically involves the use of one or more video cameras, and the video output from the camera or cameras is either recorded for later review, monitored by a human observer, or both. Such a video surveillance system is shown in FIG. 2, in which a camera 1 is aimed at the road. The camera 1 generates a video signal that is transmitted over a communication medium, shown here as a cable 2. The cable 2 feeds one or both of a visual display device 3 and a recording device 4.
In contrast to conventional tripwires, video surveillance systems can differentiate between people and animals (i.e., between objects of interest and objects of no interest) and can differentiate individuals within a group of people walking together. Video surveillance systems also provide flexibility over tripwires in terms of the shape of the regions they can monitor. Further, because video surveillance systems are already very widely used, there is often no need to install additional equipment. However, video surveillance systems also suffer some drawbacks.
Perhaps the most significant drawback of conventional video surveillance systems is that they require a high degree of human intervention in order to extract information from the video generated: someone must either watch the video as it is generated or review stored video.
Examples of prior-art video-based surveillance systems can be found in U.S. Patent Nos. 6,097,429 and 6,091,771 to Seeley et al. (hereinafter collectively referred to as "Seeley et al."). Seeley et al. is directed to video security systems, including taking snapshots when an intrusion is detected. Seeley et al. addresses some of the problems relating to false alarms and the need to detect certain intruders but not others; image discrimination techniques and object recognition techniques are used for this purpose. However, as described below, there are many differences between Seeley et al. and the present invention. Among the most severe shortcomings of Seeley et al. is a lack of disclosure as to how detection and recognition are performed; what is disclosed in this area differs substantially from what is presented for the present invention.
Another example of a video- and other-sensor-based monitoring system is disclosed in U.S. Patent Nos. 5,696,503 and 5,801,943 to Nasburg (hereinafter collectively referred to as "Nasburg"). Nasburg deals with the tracking of vehicles using multiple sensors, including video sensors. A "fingerprint" is developed for a vehicle to be tracked and is then used to detect the individual vehicle. Although Nasburg mentions the concept of a video tripwire, there is no disclosure of how to implement such a video tripwire. Nasburg further differs from the present invention in that it focuses exclusively on detecting and tracking vehicles. In contrast, the present invention, as disclosed and claimed below, is directed to detecting arbitrary moving objects, both rigid (e.g., vehicles) and non-rigid (e.g., people).
Disclosure of Invention
In view of the above, it would be advantageous, and it is an object of the present invention, to provide a surveillance system that combines the advantages of tripwires with those of video surveillance systems.
The present invention implements a video tripwire system in which a virtual tripwire having an arbitrary shape is placed in a digital video using computer-based video processing techniques. The virtual tripwire is then monitored, again using computer-based video processing techniques. As a result of the monitoring, statistics can be compiled, intrusions detected, events logged, responses triggered, and the like. For example, in one embodiment of the invention, an event of a person crossing a virtual tripwire in one direction may trigger the capture of a snapshot of the person for future identification.
Thus, according to the present invention, there is provided a video tripwire system comprising: a sensing device for generating a video output; and a computer system including a user interface for performing calibration of the sensing device and collecting and processing data according to video output received from the sensing device, the user interface including input means and output means, wherein the computer system displays the processed data and allows a user to input at least one virtual tripwire using the user interface, the virtual tripwire including at least one line spanning at least a portion of an image corresponding to the video output, and determining from the video output whether the virtual tripwire has been traversed.
In addition, according to the present invention, there is also provided a method of implementing a video tripwire system, comprising the steps of: calibrating the sensing device to determine parameters of the sensing device used by the system; initializing a system comprising inputting at least one virtual tripwire, the inputting at least one virtual tripwire comprising: receiving an input from a user through a user interface defining the at least one virtual tripwire, wherein the at least one virtual tripwire includes at least one line spanning at least a portion of an image corresponding to the input from the sensing device; obtaining data from a sensing device; analyzing data obtained from the sensing device to determine whether at least one virtual tripwire has been traversed; and triggering a response to traversing the virtual tripwire.
The system of the present invention may be implemented using existing video equipment in conjunction with computer equipment. It thus has the advantage of not requiring the extensive installation of monitoring equipment. The system of the invention may be embodied, in part, in the form of a computer-readable medium containing software implementing various steps of the corresponding method, or in the form of a computer system, which may include a computer network, executing such software.
Furthermore, the system of the present invention may be used in conjunction with imaging devices other than traditional video, including thermal imaging systems or infrared cameras.
One embodiment of the present invention includes a method for implementing a video tripwire system, the method comprising the steps of: if no sensing device is present, then installing a sensing device (which may be a camera or other such device); calibrating the sensing device; establishing a boundary as a virtual tripwire; and collecting the data.
Further objects and advantages will become apparent from a consideration of the description, drawings and examples.
Definitions
In describing the present invention, the following definitions may apply throughout (including above).
"computer" refers to any device that is capable of accepting a structured input, processing the structured input according to specified rules, and producing processed results as output. Examples of the computer include: computers, general purpose computers, supercomputers, mainframes, ultraminiature computers, minicomputers, workstations, minicomputers, servers, interactive televisions, hybrid combinations of computers and interactive televisions, and application-specific hardware and/or software for simulating a computer. A computer may have a single processor or multiple processors that may operate in a parallel manner and/or not. A computer also refers to two or more computers connected together via a network for transmitting or receiving information between the computers. Examples of such computers include: a distributed computer system for processing information by computers connected by a network.
"computer-readable medium" refers to any storage device for storing data accessible by a computer. Examples of computer readable media include: a magnetic hard disk; a floppy disk; optical disks such as CD-ROMs or DVDs; a magnetic tape; a memory chip; and carrier waves carrying computer-readable electronic data such as those used in transmitting and receiving electronic mail or in accessing networks.
"software" refers to a specified rule calculated for operation. Examples of software include: software, code segments, instructions, computer programs, and program control logic.
"computer system" refers to a system having a computer, wherein the computer includes a computer-readable medium embodying software to operate the computer.
"network" refers to a number of computers and related devices connected by a communications infrastructure. A network involves fixed connections, such as cables, or temporary connections, such as those made through telephone or other communication links. Examples of networks include: an internetwork, such as the Internet, an intranet, a Local Area Network (LAN), a Wide Area Network (WAN), and combinations of networks, such as the Internet and an intranet.
"video" refers to a moving image represented in analog or digital form. Examples of videos include: television, movies, image sequences from a camera or other viewer, and computer-generated image sequences. These may be obtained from a field feed, a storage device, an IEEE 1394-based interface, a video digitizer, a computer graphics engine, or a network connection.
"video processing" refers to any operation on a video, including, for example, compression or editing.
A "frame" refers to a particular image or other discrete unit within a video.
Drawings
The invention will be better understood from a reading of the following detailed description, taken in conjunction with the accompanying drawings, in which like reference numerals identify like components:
FIG. 1 shows a prior art tripwire system;
FIG. 2 illustrates a prior art video surveillance system;
FIG. 3 illustrates a video tripwire system according to an embodiment of the present invention;
FIG. 4 shows a block diagram of an embodiment of an analysis system according to an embodiment of the invention;
FIG. 5 shows a flow chart for describing a method according to an embodiment of the invention;
FIG. 6 shows a flow chart for describing a first embodiment of the calibration step shown in FIG. 5;
FIG. 7 shows a flow chart for describing a second embodiment of the calibration step shown in FIG. 5;
FIG. 8 shows a flow chart for describing a third embodiment of the calibration step shown in FIG. 5;
FIG. 9 illustrates an exemplary embodiment of the histogramming step shown in FIG. 8;
FIG. 10 shows a flow chart for describing an exemplary embodiment of the segmentation step shown in FIGS. 7 and 8;
FIG. 11 shows a flow chart depicting an exemplary embodiment of the step of detecting tripwire traversal;
FIGS. 12 and 13 illustrate "screen shots" used to describe an exemplary embodiment of a report format; and
FIG. 14 shows a flow chart for describing a typical application of the present invention.
Detailed Description
In describing the preferred embodiments of the present invention illustrated in the drawings, specific terminology is employed for the sake of clarity. However, the invention is not intended to be limited to the specific terminology so selected. It is to be understood that each specific element includes all technical equivalents which operate in a similar manner to accomplish a similar purpose. Each reference cited herein is incorporated by reference as if each reference were individually incorporated by reference.
Furthermore, the embodiments discussed below are generally described in terms of detecting people. However, the invention should not be construed as being limited to the detection of people. On the contrary, the video tripwire system of the embodiments discussed below can be used to detect objects of all kinds, animate or inanimate. Examples include vehicles, animals, plant growth (e.g., a system that detects when it is time to trim hedges), falling objects (e.g., a system that detects when a recyclable can is dropped into a garbage chute), and microscopic entities (e.g., a system that detects when a microbe has permeated a cell wall).
FIG. 3 shows an overview of an embodiment of a video tripwire system. As in FIG. 2, the sensing device 1 is aimed at the road and generates an output signal. The sensing device 1 may be a video camera, as discussed in connection with FIG. 2; however, it may also be any other type of sensor that produces a video-type output, for example, a thermal-based, sound-based (e.g., sonogram), or infrared-based device. The output of the sensing device 1 is transmitted over the communication medium 2. The communication medium 2 may be a cable, for example; however, it may be any other communication medium, for example, RF (radio frequency), a network (e.g., the Internet), or light waves. If communication over the communication medium 2 requires modulation, coding, compression, or other communication-related signal processing, means for performing such signal processing are provided either as part of the sensing device 1 or as separate means (not shown) coupled to the sensing device 1. The communication medium 2 carries the output signal from the sensing device 1 to the analysis system 5. The analysis system 5 receives input from and sends output to the user interface 6. The user interface 6 may include, for example, a monitor, a mouse, a keyboard, a touch screen, a printer, or other input/output devices. Using the user interface 6, a user is able to provide inputs to the system, including inputs required for initialization (including the creation of a virtual tripwire, as will be described below) and commands to the analysis system 5. The user interface 6 may also include an alarm or other alerting device; it may further include or be connected to means for implementing any other response to a triggering event, as discussed above. Typically, the user interface 6 will also include a display device such as the monitoring device 3 of FIG. 2.
The analysis system 5 performs the analysis tasks, including the processing necessary to implement the video tripwire. An embodiment of the analysis system 5 is shown in greater detail in FIG. 4, which shows the analysis system 5 of FIG. 3 coupled to the communication medium 2 and to the user interface 6. In FIG. 4, the analysis system 5 is shown comprising a receiver 51, a computer system 52, and a memory 53. The receiver 51 receives the output signal of the sensing device 1 from the communication medium 2. If the signal has been modulated, encoded, etc., the receiver 51 includes means for performing demodulation, decoding, etc. Furthermore, if the signal received from the communication medium 2 is in analog form, the receiver 51 includes means for converting the analog signal into a digital signal suitable for processing by the computer system 52. The receiver 51 may be implemented as a separate module, as shown, or it may be incorporated into the computer system 52 in an alternative embodiment. Also, if it is not necessary to perform any signal processing prior to sending the signal from the communication medium 2 to the computer system 52, the receiver 51 may be omitted altogether.
The computer system 52 is provided with a memory 53, which may be external to the computer system 52, as shown, or incorporated into it, or a combination of both. The memory 53 includes all memory resources required by the analysis system 5 and may also include one or more recording devices for storing signals received from the communication medium 2.
In another embodiment of the invention, the sensing device 1 may be implemented in the form of more than one sensing device monitoring the same location. In this case, the data output by each sensing device may be combined before transmitting the data via the communication medium 2, or the outputs of all sensing devices may be transmitted to the analysis system 5 and processed there.
In a further embodiment of the invention, the sensing device 1 may comprise a number of sensing devices monitoring different locations and sending their data to a single analysis system 5. In this way, a single system can be used for the monitoring of multiple sites.
The processing performed by the components shown in FIGS. 3 and 4 will become clearer from the following description of the method of the invention.
FIG. 5 shows an overview flowchart of an embodiment of the method of the present invention. If a sensing device 1 has not yet been installed, one must be installed 71. In many cases, however, such a device may already exist. For example, most banks already use video surveillance systems, so there is no need to install new cameras. In a preferred embodiment of this system, the sensing device (or devices) is installed so as to be stationary. Ideally, it is installed in a "natural" orientation (i.e., the top of the image corresponds to the top of the real world).
Once the sensing device 1 has been installed, it must be calibrated with the analysis system 5. In general, system calibration may be performed either explicitly, whereby the system is told (or automatically determines) the necessary calibration parameters of the sensing device 1, or implicitly, whereby the system is told (or automatically determines) the size of the object in question at different locations in the field of view of the sensing device 1. The purpose of calibration is to provide scale information, i.e., so that the system knows how large a person or other object in question should appear in different image regions. This information is particularly important for the data analysis step 74. Calibration may be performed in one of three ways, or by a combination of two or more of them: manual calibration, semi-automatic segmentation calibration, and fully automatic calibration. Flowcharts of embodiments of these methods are shown in FIGS. 6, 7, and 8, respectively.
FIG. 6 shows a flowchart of an embodiment of the manual calibration method, which involves explicit calibration as discussed above. A user enters parameters 721 relating to the sensing device 1 through the user interface 6. These parameters may include, for example, the focal length of the sensing device 1, its height from the ground, and its angle with respect to the ground. The analysis system 5 then generates visual feedback 722; for example, the system may superimpose a figure of a person (or other object in question) on an actual video frame. The visual feedback is displayed to the user on the user interface 6. This visual feedback provides scale information (e.g., the size of a person or other object in question relative to its surroundings), which helps to verify that the calibration is correct. The user then decides whether the appearance of the visual feedback is acceptable or whether the parameters need to be adjusted 723. If it is acceptable, the process ends; otherwise, the process loops back for the entry 721 of new parameters.
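The scale feedback of step 722 can be illustrated with simple projective geometry. The following is a minimal sketch, assuming a pinhole camera model; the function name, parameter names, and default values are illustrative assumptions, not taken from the patent.

```python
import math

# Minimal sketch: expected on-screen height of a person under a pinhole
# camera model. All names and default values are illustrative assumptions.

def apparent_height_px(dist_m, cam_height_m=3.0, tilt_deg=20.0,
                       person_m=1.75, focal_px=800.0):
    """Approximate height, in pixels, of a person standing dist_m metres
    from a camera mounted cam_height_m metres up and tilted tilt_deg
    below the horizontal."""
    tilt = math.radians(tilt_deg)

    def row(y_m):
        # Image row (measured down from the principal point) where a point
        # y_m metres above the ground projects.
        return focal_px * math.tan(math.atan2(cam_height_m - y_m, dist_m) - tilt)

    return row(0.0) - row(person_m)  # feet row minus head row

if __name__ == "__main__":
    for d in (5, 10, 20):
        print(f"{d:2d} m away -> ~{apparent_height_px(d):.0f} px tall")
```

A figure of this height could then be superimposed at the corresponding image location to give the user the feedback described above.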
An embodiment of a semi-automatic segmentation calibration method, which utilizes implicit calibration and may also involve at least some degree of explicit calibration, is shown in FIG. 7 (see the discussion below). In this embodiment, a person walks (or some other object in question moves; the following discussion will refer to a person, but it should be understood that the same applies to other types of objects in question) through the field of view of the sensing device 1 (step 721A). This enables the system to determine the expected size of an average person in different regions of the image. During calibration, the person walking through should be the only moving object in the field of view. The system then segments out the moving person 722A. The sizes of the person in different regions of the image are then used to calibrate (i.e., to determine the parameters as discussed above) 723A. As in manual calibration, visual feedback is provided 724A, and the user then assesses whether the appearance of the image is acceptable 725A. If it is not, the user may adjust the parameters 726A, or, alternatively, the calibration may be redone entirely, with the process returning to step 721A (dashed arrow). Which of these options is taken may be made selectable by the user. If the appearance is acceptable, on the other hand, the process is complete.
An embodiment of a fully automatic calibration method, which involves implicit calibration, is shown in FIG. 8. First, information (video information is shown in FIG. 8) is gathered 721B by the sensing device 1 over an extended period of time, say several hours to a few days. After the data has been gathered, objects are segmented 722B for analysis. Histograms are then generated 723B for the objects in various regions of the image. This step is illustrated in further detail in FIG. 9.
FIG. 9 shows histogramming step 723B embodied as a two-step process, although the invention is not limited to this process. In step 1, the system determines "unobtrusive" image regions, i.e., regions in which there are too many confusing objects to track objects reliably. As a result, only objects that can be tracked with high confidence are used; in one embodiment of the invention, only these objects are stored. In step 2, the system uses only the remaining image regions and forms histograms of the objects detected in those regions. As indicated in step 2, and as shown in FIG. 8, the system then uses the histograms to determine the average size of a person 724B in each region of the image. This information is then used to calibrate the system 725B. This latter process is carried out similarly to step 723A of FIG. 7.
Step 724B, determining the average size of a person in an image region, is carried out only if a sufficient number of objects have been registered in the given region to yield a meaningful determination. The number of detections needed for a meaningful histogram may be determined empirically and may depend, for example, on the amount and type of activity to which the tripwire will be exposed. For such regions, peaks are detected in the histogram. The highest peak in each image region, i.e., the most frequent occurrence, is assumed to be a single person. Once this information has been determined, calibration 725B is carried out successfully, and the system is able to signal that it is ready for actual operation.
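The following is a minimal sketch of how steps 723B-724B might look, assuming detections arrive as (region, height-in-pixels) pairs; the names and the threshold value are illustrative assumptions, not the patent's.

```python
import numpy as np

MIN_SAMPLES = 50  # assumed empirical threshold for a meaningful histogram

def average_person_size(detections, bins=32):
    """Map region id -> modal object height in that region; the highest
    histogram peak is taken to correspond to a single person (step 724B)."""
    by_region = {}
    for region, height_px in detections:
        by_region.setdefault(region, []).append(height_px)
    sizes = {}
    for region, heights in by_region.items():
        if len(heights) < MIN_SAMPLES:  # too few objects registered here
            continue
        counts, edges = np.histogram(heights, bins=bins)
        peak = counts.argmax()
        sizes[region] = 0.5 * (edges[peak] + edges[peak + 1])  # bin centre
    return sizes
```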
Typically, the process shown in FIG. 8 is carried out without human intervention. However, the user may provide one or more time windows during which most objects are expected to be individual people, to reduce the problem of trying to distinguish groups of people. Such a time window may be imposed in the information-gathering step 721B or in subsequent processing steps.
Each of the automated calibration methods (semi-automatic and fully automatic) requires the segmentation of images into foreground objects and background (see steps 722A and 722B of FIGS. 7 and 8, respectively). An embodiment of this process is shown in FIG. 10. This exemplary embodiment consists of three steps: pixel-level background modeling 7221; foreground detection and tracking 7222; and object analysis 7223.
The purpose of pixel-level background modeling 7221 is to maintain an accurate representation of the image background and to differentiate background (BG) pixels from foreground (FG) pixels. In an exemplary embodiment, this step implements the process disclosed in commonly assigned U.S. patent application No. 09/815,385, entitled "Video Segmentation Using Statistical Pixel Modeling," filed March 23, 2001, which is hereby incorporated by reference in its entirety. The general idea of the exemplary embodiment is that a history of all pixels, including pixel values and their statistics, is maintained over several frames. A stable, unchanging pixel is treated as background. If the statistics of a pixel change significantly, it is considered foreground. If the pixel later stabilizes again, it reverts to being considered a background pixel. This approach serves to mitigate sensor noise and to automatically address changes to the background (e.g., in a store, when a person removes an item from a shelf, the shelf is instantaneously treated as foreground but reverts to background once the scene re-stabilizes).
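A minimal sketch of this idea follows, using an exponentially weighted mean and variance per pixel; the thresholds and update rates are illustrative assumptions and are not taken from the referenced application.

```python
import numpy as np

class BackgroundModel:
    """Per-pixel running statistics; statistically unstable pixels are
    labeled foreground (step 7221). Parameter values are illustrative."""

    def __init__(self, first_frame, alpha=0.02, k=3.0):
        self.mean = first_frame.astype(np.float64)
        self.var = np.full(self.mean.shape, 25.0)  # assumed initial variance
        self.alpha, self.k = alpha, k

    def update(self, frame):
        f = frame.astype(np.float64)
        diff = f - self.mean
        fg = np.abs(diff) > self.k * np.sqrt(self.var)  # unstable -> foreground
        # Adapt quickly where the pixel looks like background; adapt slowly
        # where it is foreground, so a re-stabilized pixel (e.g., the shelf
        # in the example above) is gradually absorbed back into the background.
        a = np.where(fg, 0.1 * self.alpha, self.alpha)
        self.mean += a * diff
        self.var += a * (diff ** 2 - self.var)
        return fg  # boolean foreground mask
```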
The purpose of foreground detection and tracking 7222 is to gather the foreground pixels into foreground objects and to track those objects over a number of frames so as to ensure spatio-temporal consistency. This step takes as input the set of pixels determined to be foreground pixels, along with their statistical properties, from the pixel-level background modeling 7221. In an exemplary embodiment, the foreground pixels are spatially merged into larger foreground objects using techniques known in the art, e.g., simple morphology and connected-component detection, and these objects are tracked over several frames using correlation methods to obtain reliable size information. Exemplary tracking techniques are discussed in commonly assigned, co-pending U.S. patent application No. 09/694,712, entitled "Interactive Video Manipulation," filed October 24, 2000, the disclosure of which is incorporated herein by reference. See also Wren, C.R. et al., "Pfinder: Real-Time Tracking of the Human Body," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 19, pp. 780-784, 1997; Grimson, W.E.L. et al., "Using Adaptive Tracking to Classify and Monitor Activities in a Site," CVPR, pp. 22-29, June 1998; and Olson, T.J. and Brill, F.Z., "Moving Object Detection and Event Recognition Algorithms for Smart Cameras," IUW, pp. 159-175, May 1997. Each of these references is incorporated herein by reference in its entirety.
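As a minimal sketch of the "simple morphology and connected-component detection" mentioned above, assuming SciPy is available; the parameter values are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def foreground_blobs(fg_mask, min_area=50):
    """Merge foreground pixels into objects (step 7222) and return one
    bounding box (row0, row1, col0, col1) per sufficiently large object."""
    clean = ndimage.binary_opening(fg_mask, structure=np.ones((3, 3)))  # drop speckle
    clean = ndimage.binary_closing(clean, structure=np.ones((3, 3)))    # fill small holes
    labels, count = ndimage.label(clean)  # connected-component labeling
    boxes = []
    for sl in ndimage.find_objects(labels):
        box_area = (sl[0].stop - sl[0].start) * (sl[1].stop - sl[1].start)
        if box_area >= min_area:          # ignore tiny fragments
            boxes.append((sl[0].start, sl[0].stop, sl[1].start, sl[1].stop))
    return boxes
```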
The third step, object analysis 7223, serves a number of functions. Object analysis 7223 may be used to separate and count objects; to distinguish between objects in question and "confusers" (e.g., a shopping cart); to determine an object's direction of motion; and to account for occlusions of objects. In illustrative embodiments, determinations about an object are made based on one or more of: the size of the object; the internal motion of the object; the number of head-like protrusions (e.g., if the objects in question are people); and face detection (again, e.g., if the objects in question are people). Techniques for performing such functions are known in the art, and examples of such techniques are described in Allmen, M. and Dyer, C., "Long-Range Spatiotemporal Motion Understanding Using Spatiotemporal Flow Curves," IEEE CVPR, Lahaina, Maui, Hawaii, pp. 303-309, 1991; Gavrila, D.M., "The Visual Analysis of Human Movement: A Survey," CVIU, Vol. 73, No. 1, pp. 82-98, January 1999; Collins, Lipton, et al., "A System for Video Surveillance and Monitoring: VSAM Final Report," Robotics Institute, Carnegie-Mellon University, Tech. Rept. No. CMU-RI-TR-00-12, May 2000; Lipton, A.J. et al., "Moving Target Classification and Tracking from Real-Time Video," 1998 DARPA IUW, Nov. 20-23, 1998; and Haering, N. et al., "Visual Event Detection," Video Computing Series, M. Shah, Ed., 2001. Each of these references is incorporated herein by reference in its entirety.
Returning now to FIG. 5, the calibration step 72 is followed by a step of initializing the system 73. This step permits a user to enter various parameters relating to how the system will gather, respond to, and report data. First, the user may superimpose one or more lines of interest on the image; these lines will serve as one or more tripwires. The lines may have any orientation and may be placed almost anywhere in the image; the exception is that a line may not occur too close to the image boundary, because an object (e.g., a person) crossing the line must be at least partially visible on both sides of the line for detection to occur. In the illustrated embodiment, the tripwire is assumed to be on the ground in the image; that is, detection occurs when the bottom portion of an object (e.g., a person's legs) crosses the line. In a more general embodiment, the user may set a height above the ground for each line.
Other parameters that may be initialized include: a time interval of active detection; a direction of crossing each line as a criterion for event detection (e.g., to determine when a person enters an area, as opposed to when a person either enters or exits); and the sensitivity of the detection.
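For concreteness, the per-tripwire parameters entered during initialization 73 might be grouped as follows; this is a minimal sketch, and all field names are illustrative assumptions rather than the patent's.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Tripwire:
    a: Tuple[int, int]            # line endpoints in image coordinates
    b: Tuple[int, int]
    height_m: float = 0.0         # height above ground (0 = on the ground)
    direction: str = "any"        # "any", or one specific crossing direction
    active_hours: Tuple[int, int] = (0, 24)  # time interval of active detection
    sensitivity: float = 0.5      # detection sensitivity
```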
Another function of initialization 73 is for the user to select various recording options. These options determine what data is collected and may include, but are not limited to:
recording only when a person (or, in general, the object in question) passes through;
recording only when two or more people pass through;
recording all passes;
recording only when there is high confidence in the detection of a pass-through;
recording only the detection statistics;
take a "snapshot" or create an entire video around the detected event.
This means that a still image is created from a "snapshot," which may simply be a particular video (or other sensing device) frame, or may be generated independently.
After initialization 73, the system operates to gather and analyze data 74. If the user has entered a time window, the system begins processing when it is within this time window. When it detects a tripwire event (of a particular type, if one has been specified by the user), the event is recorded together with accompanying information; the types of accompanying information will become apparent below in the discussion of data reporting. In the context of some applications, a tripwire event may trigger an alarm or other response 76 (e.g., taking a snapshot).
An embodiment of an exemplary technique for performing analysis and detecting tripwire events is shown in FIG. 11. First, foreground objects are determined from the video using object segmentation 740. Object segmentation 740 may comprise, for example, steps 7221, 7222, and 7223 shown in FIG. 10 and discussed above. The location 741 of a foreground object is then tested to determine whether it overlaps 742 a line representing a tripwire. As described above, in the exemplary embodiment, in which tripwire lines are assumed to be on the ground, an object is determined to have crossed a tripwire if the bottom portion of the object overlaps the tripwire line. If it is determined that no overlap has occurred, there is no tripwire event 743. If there is an overlap, then, if only crossings in a specified direction are considered tripwire events, the direction of the crossing is tested 744, and crossings that do not occur in the specified direction are not considered tripwire events 745. If crossings in either direction represent tripwire events, the process skips the test of step 744. If step 744 has been performed and returns a positive result, or if step 744 is not performed, one or more additional queries 746 may also be performed. Such queries might, for example, determine a particular characteristic of the objects in question (e.g., cars, trucks, blue cars, blue station wagons, cars smaller than a certain size, etc.) or a particular object (e.g., a particular face, a license plate number, etc.). If such queries 746 return a positive result, or if no such queries are made, the process determines that a tripwire event 747 has occurred. Of course, if such queries 746 are made and return a negative result, it is determined that no tripwire event has occurred.
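For a tripwire on the ground, the overlap test of steps 741-742 reduces to asking whether the bottom of the tracked object crossed the line between two consecutive frames. A minimal 2-D geometry sketch, with illustrative names, follows.

```python
def side(p, a, b):
    """Cross-product sign: which side of the directed line a->b point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def crossed(prev_pt, curr_pt, wire_a, wire_b):
    """True if the bottom-centre path prev_pt->curr_pt properly intersects
    the tripwire segment wire_a->wire_b (endpoint grazes are ignored)."""
    return (side(prev_pt, wire_a, wire_b) * side(curr_pt, wire_a, wire_b) < 0
            and side(wire_a, prev_pt, curr_pt) * side(wire_b, prev_pt, curr_pt) < 0)
```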
Several methods may be used to implement the determination of crossing direction 744. As a first example, it may be implemented by applying optical flow methods to objects detected as crossing the tripwire; the use of optical flow methods could also serve to obviate the need for object segmentation. As a second example, trajectory information from object tracking (in step 7222 of FIG. 10) may be used. As a third example, it may be implemented by setting up secondary (dummy) tripwires on either side of each actual tripwire entered by the user and determining in which order the secondary tripwires are crossed when the actual tripwire is crossed.
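For the second example, a minimal sketch: once crossed() above has fired, the sign of the same cross product applied to the pre-crossing point tells which side of the wire the object came from. The direction labels are illustrative.

```python
def side(p, a, b):
    """Cross-product sign relative to the directed line a->b (as above)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def crossing_direction(prev_pt, wire_a, wire_b):
    """Label a detected crossing by which side of the wire (oriented a->b)
    the object occupied on the frame before the crossing."""
    return "side-1-to-side-2" if side(prev_pt, wire_a, wire_b) > 0 else "side-2-to-side-1"
```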
Calibration 72 is of particular importance in executing step 74, especially if only a particular type of object is of interest. For example, if people are the objects in question, calibration 72 permits step 74 to distinguish between, say, people and objects that are either smaller (e.g., cats and mice) or larger (e.g., groups of people and cars) than people.
Once data has been gathered, it can be reported to a user. In an exemplary embodiment of the invention, a user may query the system for results using a graphical user interface (GUI). In this embodiment, summary information and/or detailed data on one or more individual detections may be displayed. Summary information may include one or more of the following: the number of detections, the number of people (or other objects in question) detected, the number of multi-person (multi-object) detections (i.e., when multiple people (or other objects in question) cross simultaneously), the number of people (objects) crossing in each direction, any or all of the preceding within a user-selected time window, and one or more time histograms of any or all of the preceding. Details on a single detection may include one or more of the following: the time of detection, the direction of crossing, the number of people (objects) crossing, the size of the object(s) crossing, and one or more snapshots or videos taken around the time of detection.
FIGS. 12 and 13 show sample screen shots of an illustrative reporting display in an exemplary embodiment. FIG. 12 shows summary information 121 about crossings of a tripwire 124 spanning a corridor. In this particular illustration, the screen shows live video 123 of the area including the tripwire 124. Also included is a caption giving the period during which monitoring has occurred (i.e., the time window) and during which crossing events were recorded. The summary information 121 includes the numbers of crossings and their directions. In this case, the user has further specified that particular crossing times and dates 122 should be displayed.
FIG. 13 shows individual information about particular crossing events; these crossing events correspond to the particular crossing times and dates 122 shown in FIG. 12. In the display of FIG. 13, the user has chosen to display a snapshot of each crossing event, along with its time and date. In particular, the snapshots 131 and 132 correspond to crossing events in the area shown in the video 123 of FIG. 12. In a further embodiment, the user may click on a snapshot or a button associated with a snapshot to view the corresponding video taken around the time of the crossing event.
An example of an application of the video tripwire of the present invention is "tailgating" detection. Tailgating describes an event in which a certain number of people (often one person) are permitted to enter an area (or the like), and one or more other people try to follow closely in order to gain entry as well. FIG. 14 depicts a flow chart of a method for implementing a tailgating detection system. In this embodiment, it is assumed that a video surveillance camera is installed in a position capable of recording entries through an entrance, such as a door or a turnstile. Furthermore, as mentioned above, the camera must be calibrated. The system is activated by the detection that a person is entering, or is about to enter, through the entrance 141. This may be accomplished in any number of ways: for example, one may have to insert money, enter a code on a keypad, or swipe a card through a card reader, or the system may use a video-based detection method to visually detect the opening of the entrance (this would have the advantage of not requiring an interface with external equipment (card reader, keypad, etc.), which may make the system easier to install and implement in some environments). When an entry is detected, monitoring begins 142. During this monitoring, the system detects objects moving through the entrance and analyzes them to determine how many people have entered. This may involve face detection, as described above, if the camera is positioned so that it can record faces. The system then determines whether the number of people entering is acceptable 143. In the illustrated embodiment, only one person is permitted at a time; in a more general embodiment, however, this could be any selected number. If one person (the permitted number) enters, no response is necessary 144. On the other hand, if more than one person (more than the permitted number) enters, a response is triggered 145. Such a response may include, for example, sounding an alarm, taking a snapshot, or recording video of the area around the entrance. An additional advantage of a system using either of the latter two responses is that, in the case of a system using a card reader, it may furnish useful evidence for tracking down a person using a stolen card.
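The control flow of FIG. 14 can be summarized in a short sketch. The helper names below (entry triggers, a people counter supplied by the video analysis, a response callback) are illustrative assumptions, not components named in the patent.

```python
ALLOWED = 1  # people permitted per entry event (the threshold of step 143)

def monitor_entries(entry_events, count_people, respond):
    """entry_events: iterable of entry triggers (card swipe, door opening...).
    count_people: callable that watches the entrance (step 142) and returns
    how many people came through. respond: alarm/snapshot/video (step 145)."""
    for event in entry_events:       # step 141: entry detected
        n = count_people(event)      # step 142: monitor and count
        if n > ALLOWED:              # step 143: more than the permitted number?
            respond(event, n)        # step 145: trigger a response
        # otherwise step 144: no response necessary
```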
The embodiments shown and discussed in this specification are intended only to teach those skilled in the art the best way known to the inventors to make and use the invention. Nothing in this specification should be taken as a limitation on the scope of the invention. As will be recognized by those skilled in the art in the light of the above teachings, the above-described embodiments of the invention may be modified and varied, and elements added or omitted, without departing from the invention. It is, therefore, to be understood that within the scope of the appended claims and equivalents thereto, the invention may be practiced otherwise than as specifically described.
Claims (35)
1. A video tripwire system comprising:
a sensing device for generating a video output; and
a computer system including a user interface for performing calibration of the sensing device and collecting and processing data according to video output received from the sensing device, the user interface including input means and output means, wherein the computer system displays the processed data and the computer system allows a user to input at least one virtual tripwire using the user interface, the virtual tripwire including at least one line spanning at least a portion of an image corresponding to the video output and determining from the video output whether the virtual tripwire has been traversed.
2. The video tripwire system of claim 1, further comprising:
means for transmitting video output from the sensing device;
a communication medium on which the video output is transmitted by the means for transmitting; and
means for receiving a video output from a communication medium.
3. The video tripwire system of claim 2, wherein the communication medium is a cable.
4. The video tripwire system of claim 2, wherein the communication medium comprises a communication network.
5. The video tripwire system of claim 1, wherein the output means comprises at least one of means for providing a visual alarm and means for providing an audible alarm.
6. The video tripwire system of claim 1, wherein the output device comprises a visual display device.
7. The video tripwire system of claim 6, wherein the visual display device is capable of displaying at least one of a video, one or more snapshots, and alphanumeric information.
8. The video tripwire system of claim 1, further comprising:
at least one storage device for storing at least one of video data and alphanumeric data.
9. The video tripwire system of claim 1, wherein the sensing device comprises at least one of a camera, an infrared camera, an ultrasonic device, and a thermal imaging device.
10. The video tripwire system of claim 1, further comprising:
at least one additional sensing device for generating a video output,
wherein the computer system also receives and processes video output of at least one additional sensing device.
11. A method of implementing a video tripwire system, comprising the steps of:
calibrating the sensing device to determine parameters of the sensing device used by the system;
initializing a system comprising inputting at least one virtual tripwire, the inputting at least one virtual tripwire comprising: receiving an input from a user through a user interface defining the at least one virtual tripwire, wherein the at least one virtual tripwire includes at least one line spanning at least a portion of an image corresponding to the input from the sensing device;
obtaining data from a sensing device;
analyzing data obtained from the sensing device to determine whether at least one virtual tripwire has been traversed; and
triggering a response to traversing the virtual tripwire.
12. The method of claim 11, wherein said calibrating step comprises the steps of:
manually inputting parameters by a user;
generating visual feedback to a user; and
if the display of the visual feedback is not acceptable to the user, the user is allowed to re-enter the parameters.
13. The method of claim 11, wherein said calibrating step comprises the steps of:
moving a person through a field of view of a sensing device;
segmenting the moving person;
determining a parameter using the size of the person in different regions of the field of view;
providing visual feedback to a user; and
if the display of the visual feedback is not acceptable to the user, the parameter is allowed to be adjusted.
14. The method of claim 13, wherein said step of allowing adjustment comprises the step of allowing a user to manually adjust the parameters.
15. The method of claim 13, wherein said step of allowing adjustment includes the step of restarting the calibration step.
16. The method of claim 13, wherein said allowing step comprises the step of allowing a user to select between manually adjusting parameters or restarting the calibration step.
17. The method of claim 13, wherein said step of segmenting comprises the steps of:
performing pixel level background modeling;
performing foreground detection and tracking; and
the foreground object is analyzed.
18. The method of claim 11, wherein said calibrating step comprises the steps of:
collecting video information over a period of time using the sensing device;
segmenting an object from the video information;
analyzing the segmented object to determine an average size of the person within different regions of the video image corresponding to the field of view of the sensing device; and
the parameters are determined by the average size of the person in the different areas.
19. The method of claim 18, wherein said step of segmenting comprises the steps of:
performing pixel level background modeling;
performing foreground detection and tracking; and
the foreground object is analyzed.
20. The method of claim 18, wherein said analyzing step comprises the steps of:
determining unobtrusive regions of the video image; and
constructing a histogram of foreground objects detected in the unobtrusive regions of the video image.
21. The method of claim 20, wherein: the determination of the average size of the person within a particular area of the video image is only made if the number of foreground objects detected in said area exceeds a predetermined number.
22. The method of claim 20, wherein the highest peak in the histogram is taken to correspond to a single person.
23. The method of claim 18, further comprising the steps of:
one or more time windows to be used for calibration are input by the user.
24. The method of claim 11, wherein the step of initializing the system further comprises the step of selecting at least one recording option.
25. The method of claim 11, wherein said analyzing step comprises the step of determining whether the detected object overlaps at least one virtual tripwire.
26. The method of claim 25, wherein said calibrating step comprises the step of performing object segmentation; and
wherein the detected object is detected according to the step of performing object segmentation.
27. The method of claim 26, wherein said performing object segmentation step comprises the steps of:
performing pixel level background modeling;
performing foreground detection and tracking; and
the foreground object is analyzed.
28. The method of claim 25, wherein said analyzing step further comprises the steps of:
performing object segmentation,
wherein the detected object is detected according to the step of performing object segmentation.
29. The method of claim 26, wherein said performing object segmentation step comprises the steps of:
performing pixel level background modeling;
performing foreground detection and tracking; and
the foreground object is analyzed.
30. The method of claim 25, wherein said analyzing step further comprises the steps of:
if the step of determining whether the detected object overlaps the at least one virtual tripwire returns a positive result, determining whether the direction of traversal matches a direction of traversal of the user input.
31. The method of claim 30, wherein said analyzing step further comprises the steps of:
if the step of determining whether the direction of traversal matches the user-entered direction of traversal returns a positive result, at least one additional query is made regarding the characteristics of the detected object.
32. The method of claim 25, wherein said analyzing step further comprises the steps of:
if the step of determining whether the detected object overlaps the at least one virtual tripwire returns a positive result, at least one additional query is made regarding characteristics of the detected object.
33. The method of claim 11, wherein the step of triggering a response comprises at least one of activating an audible alarm, activating a visual alarm, taking a snapshot, and recording a video.
34. The method of claim 11, further comprising:
performing tailgating detection, the tailgating detection comprising the steps of:
detecting that a person is entering the area in question;
in response to said detecting step, initiating monitoring of the area in question;
determining whether the number of persons entering the area in question is greater than an allowable number; and
triggering a response if the determining step returns a positive result.
35. The video tripwire system of claim 1, wherein the computer system comprises application specific hardware adapted to perform the calibration, the collection, and the processing.
Applications Claiming Priority (3)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
US09/972,039 (US6696945B1) | 2001-10-09 | 2001-10-09 | Video tripwire
US09/972,039 | 2001-10-09 | |
PCT/US2002/032091 (WO2003032622A2) | 2001-10-09 | 2002-10-09 | Video tripwire
Publications (2)

Publication Number | Publication Date
---|---
HK1071956A1 | 2005-08-05
HK1071956B | 2008-05-16