
GB2528194A - Object tracking in an image sequence - Google Patents


Info

Publication number
GB2528194A
Authority
GB
United Kingdom
Prior art keywords
image
frame
bitmap
frames
order combined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
GB1516527.7A
Other versions
GB2528194B (en)
GB201516527D0 (en)
Inventor
Donald Milne
Raymond William Hynds
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MICROPACK ENGINEERING Ltd
Original Assignee
MICROPACK ENGINEERING Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MICROPACK ENGINEERING Ltd
Priority to GB1516527.7A
Publication of GB201516527D0
Publication of GB2528194A
Application granted
Publication of GB2528194B
Legal status: Active (current)
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/97Determining parameters from multiple pictures
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • FMECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F23COMBUSTION APPARATUS; COMBUSTION PROCESSES
    • F23NREGULATING OR CONTROLLING COMBUSTION
    • F23N2229/00Flame sensors
    • F23N2229/20Camera viewing
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00Indexing scheme for image data processing or generation, in general
    • G06T2200/28Indexing scheme for image data processing or generation, in general involving image processing hardware
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • GPHYSICS
    • G08SIGNALLING
    • G08BSIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B17/00Fire alarms; Alarms responsive to explosion
    • G08B17/12Actuation by presence of radiation or particles, e.g. of infrared radiation or of ions
    • G08B17/125Actuation by presence of radiation or particles, e.g. of infrared radiation or of ions by using a video camera to detect fire or smoke

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

A method, an apparatus and a computer program product are provided for tracking an object in an image sequence comprising a plurality of image frames. A value of an image attribute is determined for a plurality of image blocks of a respective image frame of the image sequence. A characteristic value is assigned to each image block, based on its attribute value, to generate an image frame bitmap. A first order combined bitmap frame is generated by performing a first logical combination of a plurality, N, of image frame bitmaps spanning a first time period. A plurality, M, of those first order combined bitmap frames are logically combined to generate a second order combined bitmap frame. The object to be tracked is then identified using the second order combined bitmap frame.

Description

OBJECT TRACKING IN AN IMAGE SEQUENCE
FIELD OF THE INVENTION
The present invention relates to an apparatus, method and computer program for object tracking in an image sequence.
BACKGROUND OF THE INVENTION
The reliable tracking of objects across a field of view in an image sequence is a fundamental problem in machine vision which has been only partially solved. Previously known approaches to object tracking such as optical flow or Fourier phase correlation are typically very processor intensive and require large amounts of random access memory to support the computations.
Optical flow is a pattern of apparent motion of an object in a visual scene caused by the relative motion between an observer (e.g. the camera) and the scene itself. Optical flow methods attempt to calculate motion between two image frames taken at different times at every voxel position. A voxel represents a value on a regular grid in three-dimensional space. The optical flow methods are differential methods based on partial derivatives with respect to spatial and temporal coordinates.
Fourier phase correlation is an image processing technique that performs discrete Fourier transforms on successive image fields. The phase differences between the transforms are then subjected to an inverse transform, which directly reveals peaks at positions corresponding to motion between the fields.
The Fourier transforms are complex and thus computationally intensive.
The demands of known methods for object tracking in terms of processing power and Random Access Memory (RAM) mean that it would be difficult to implement these object tracking methods in embedded devices such as fire detectors, which are typically very simple devices similar in complexity to a lux meter. Furthermore, the optical flow and Fourier phase correlation methods can be unreliable and have been shown not to be particularly robust with respect to image disturbances such as a human being walking between a camera and an object being tracked.
Thus there is a requirement for a less processor and memory intensive and more robust method of object tracking in a sequence of images.
BRIEF SUMMARY OF THE DISCLOSURE
According to a first aspect, the present invention provides an apparatus for tracking an object in an image sequence comprising a plurality of image frames, the apparatus comprising: an input interface for receiving the image sequence from a video capture sensor; memory for storing at least a subset of the plurality of image frames; a hardware processor having circuitry configured to: determine a value of an image attribute for a plurality of image blocks of a respective image frame of the image sequence; generate an image frame bitmap for the respective image frame by assigning a characteristic value to each of the plurality of image blocks based upon a comparison of the image attribute with a range; generate a first order combined bitmap frame by performing a first order logical combination of the characteristic values of spatially corresponding image blocks of a plurality, N, of image frame bitmaps spanning a first time period in the image sequence; generate a second order combined bitmap frame by performing a second order logical combination of values of spatially corresponding image blocks of a plurality, M, of the first order combined bitmap frames; and identify the object to be tracked using the second order combined bitmap frame.
It will be appreciated that the image block may comprise groups of two or more pixels, but in some embodiments the image block comprises a single pixel.
In some embodiments, the range used for comparison of the image attribute value comprises a single predetermined threshold.
In some embodiments, at least one of the first logical combination and the second logical combination comprises a logical OR operation.
In some embodiments, successive second order combined bitmap frames are generated using a sliding time window comprising M first order combined bitmap frames such that the second order combined bitmap frame spans M*N image frames.
In some embodiments the hardware processor comprises an X-bit register file for loading operands for performing the first order logical combination and the second order logical combination, and generation of the first order combined bitmap frame comprises packing the characteristic values corresponding to a plurality of different image blocks of a given image frame bitmap into an X-bit packed input operand for loading into a register of the register file and performing the first logical combination using packed input operands for corresponding image blocks of different ones of the N image frame bitmaps.
In some embodiments generation of the second order combined bitmap frame comprises packing X values corresponding to a plurality of different image blocks of a given first order combined bitmap frame into an X-bit packed input operand and performing the second logical combination by loading the X-bit packed input operands for corresponding image blocks of different ones of the M temporally contiguous first order combined bitmap frames into different registers of the register file. In these embodiments, X is a non-zero integer.
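By way of illustration only, the packed-operand scheme described above might be sketched as follows in C. This is a minimal sketch under assumed parameters (X = 32) rather than the disclosed firmware; the function names and the byte-per-block input format are hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Pack one characteristic bit per image block into 32-bit operands (X = 32),
 * so that a single logical instruction can combine 32 blocks at once.
 * 'bits' holds one 0/1 characteristic value per image block (hypothetical). */
static void pack_bitmap(const uint8_t *bits, size_t n_blocks, uint32_t *packed)
{
    memset(packed, 0, ((n_blocks + 31) / 32) * sizeof(uint32_t));
    for (size_t i = 0; i < n_blocks; i++)
        if (bits[i])
            packed[i / 32] |= 1u << (i % 32);
}

/* First (or second) order combination step: OR the packed operands of one
 * bitmap frame into an accumulator, word by word. The accumulator must be
 * zeroed before the first of the N (or M) frames is added. */
static void or_combine(uint32_t *accum, const uint32_t *frame, size_t n_words)
{
    for (size_t w = 0; w < n_words; w++)
        accum[w] |= frame[w];
}
```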
In some embodiments the image attribute is a brightness value of an image block of a single image frame of the image sequence.
In some embodiments the image attribute comprises an inter-frame difference of a parameter value for an image block.
In some embodiments the inter-frame difference is a flicker amount corresponding to a difference in brightness.
In some embodiments the processor is configured to perform a third logical combination of two or more of the second order combined bitmap frames corresponding to respective different attribute values and spanning substantially the same time period in the image sequence.
In some embodiments the third logical combination comprises combining two different attribute values corresponding to brightness and flicker.
According to a second aspect the present invention provides a method for tracking an object in an image sequence comprising a plurality of image frames, the method comprising: determining a value of an image attribute for a plurality of image blocks of a respective image frame of the image sequence; generating an image frame bitmap for the respective image frame by assigning a characteristic value to each of the plurality of image blocks based upon a comparison of the image attribute with a range; generating a first order combined bitmap frame by performing a first logical combination of the characteristic values of spatially corresponding image blocks of a plurality, N, of image frame bitmaps spanning a first time period in the image sequence; generating a second order combined bitmap frame by performing a second logical combination of values of spatially corresponding image blocks of a plurality, M, of the first order combined bitmap frames; and identifying the object to be tracked using the second order combined bitmap frame.
According to a third aspect the present invention provides a computer program product stored on a transitory or non-transitory medium having: code for determining a value of an image attribute for a plurality of image blocks of a respective image frame of the image sequence; code for generating an image frame bitmap for the respective image frame by assigning a characteristic value to each of the plurality of image blocks based upon a comparison of the image attribute with a range; code for generating a first order combined bitmap frame by performing a first logical combination of the characteristic values of spatially corresponding image blocks of a plurality, N, of image frame bitmaps spanning a first time period in the image sequence; code for generating a second order combined bitmap frame by performing a second logical combination of values of spatially corresponding image blocks of a plurality, M, of the first order combined bitmap frames; and code for identifying the object to be tracked using the second order combined bitmap frame.
Other aspects and features of the present invention are as defined in the appended claims.
Embodiments of the present invention exploit the fact that some objects have a characteristic such that they tend to oscillate around a fixed spot in a field of view of an image sequence.
This characteristic allows object tracking to be performed in a less processing intensive and more memory efficient way via implementation of first order and second order temporal logical combinations of image frame bitmaps. One example of objects that tend to oscillate around a fixed point in the field of view is the flames of a fire.
A bitmap is a set of data in which multi-bit pixels representing attributes of an image frame have been mapped from multi-bit to single-bit values to generate the bitmap. According to the present technique a "packed bitmap" may be used, in which each 1-bit pixel value is stored as a single bit in the bitmap representation.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the invention are further described hereinafter with reference to the accompanying drawings, in which: Figure 1 schematically illustrates an apparatus for performing object tracking in an image sequence; Figure 2 schematically illustrates a process for performing first and second order combination of image frame bitmaps; Figure 3A schematically illustrates performing a pixel-wise logical "OR" operation on a sequence of three image frame bitmaps corresponding to different times in an image sequence; Figure 3B schematically illustrates three consecutive image frames that have been bitmapped to highlight bright parts of the field of view and a corresponding logical combination of those frames; Figure 4 is a flow chart that schematically illustrates an overview of an object tracking algorithm comprising first and second order bitmap frame combinations to perform object detection in an image sequence; Figure 5 is a flow chart that schematically illustrates image frame bitmap creation from a captured image frame having pixel values corresponding to an image attribute; Figure 6 is a flow chart that schematically illustrates generation of the first order combined bitmap frame of the algorithm of Figure 4; Figure 7 is a flow chart that schematically illustrates generation of the second order combined bitmap frame of the algorithm of Figure 4; Figure 8 is a flow chart that schematically illustrates a process for identifying a flame flickering area in an image sequence; Figure 9 is a flow chart that schematically illustrates an algorithm corresponding to a vertical motion test to distinguish flame objects from other image objects; and Figure 10 schematically illustrates a sequence of consecutive frames for a typical flame showing a trend for more upwards movement overall between successive image frames.
DETAILED DESCRIPTION
Figure 1 schematically illustrates an object tracking apparatus according to one embodiment.
The apparatus is built around a 16-bit hybrid digital signal processor (DSP)/microcontroller. It comprises an embedded processor 110 that implements an orthogonal Reduced Instruction Set Computing (RISC)-like microprocessor instruction set and single-instruction, multiple-data (SIMD) capabilities. The embedded processor 110 comprises a computing unit having an Address Arithmetic Unit 122, a control unit 124, a register file 130 and an Arithmetic Logic Unit (ALU) 140.
The embedded processor 110 also has internal memory comprising a code memory 152 of 82 kilobytes (kB), a data cache 154 of 16 kB, a data memory 156 of 48 kB and a main stack 158 of 4 kB capacity. In this particular embodiment, all memory resources are mapped through a flat 32-bit address space. The main stack 158 stores stack and local variable information. The code memory 152 and data memory 156 are Static Random Access Memory (SRAM) and typically operate at or close to the full processor speed with little or no latency. Also provided in the apparatus is an off-chip memory 160 comprising a 128 MB Synchronous Dynamic Random Access Memory (SDRAM), which operates at a slower speed than the on-chip memories 152, 154, 156. A connection 162 between the embedded processor 110 and the SDRAM 160 provides a 133 MHz data connection.
On chip input/output devices of the processor 110 have their control registers mapped into memory-mapped registers at addresses near the top of the 4 GB memory address space. The memory-mapped registers are separated into a first block that contains control memory-mapped registers for all core functions and a second block that contains registers needed for setup and control of the on-chip peripherals outside of the processor core 110.
The ALU 140 is configured to perform a set of arithmetic and logical operations such as Boolean AND, OR, NOT, NAND and NOR operations on 16-bit or 32-bit data. A number of special program instructions are included to accelerate signal processing tasks. A set of video instructions is provided including: byte alignment and packing operations; 16-bit and 8-bit adds with clipping; 8-bit average operations; and 8-bit subtract/absolute value/accumulate operations.
The register file 130 contains eight 32-bit registers R0 to R7. As shown in Figure 1, each register is separated into a 16-bit high portion (e.g. R0.H) and a corresponding 16-bit low portion (e.g. R0.L). When performing computing operations on 16-bit operand data, the register file operates as 16 independent 16-bit registers. All operands for computing operations come from the multi-port register file 130 and from instruction constant fields in the address arithmetic unit 122. The address arithmetic unit 122 provides two addresses for substantially simultaneous dual fetches from memory and comprises its own register file (not shown).
A video graphics array (VGA) resolution monochrome sensor 170 is provided and supplies captured video frames at a rate of 25 Hz as input to a Parallel Peripheral Interface (PPI) 172 on the embedded processor 110. These monochrome video frames comprise image frames of an image sequence upon which various processing operations are performed. In particular, object tracking including, in some embodiments, flame tracking within image frames of the incoming image sequence may be performed on video frames received via the PPI port 172. The presence of the PPI port 172 means that external video capture circuits are not required. The PPI port 172 receives monochrome video data, while the I2C interface may be used to configure the VGA mono sensor 170, for example, to set a desired frame rate.
A VGA resolution colour sensor 180 is also provided, although the colour video is not processed by the embedded processor 110 in this embodiment, but simply passes straight through and out of the detector via the video output unit 182.
Other circuitry components of the apparatus of Figure 1 include a temperature probe 192, a flash non-volatile store 194, a real-time clock 196 and a watchdog/reset timer 198.
The real-time clock 196 is capable of counting time in human units (unlike a hardware clock). It is clocked by a 32.768 kHz crystal external to the embedded processor 110. It has a dedicated power supply pin so that it can remain powered up and clocked even when the rest of the processor is in a reduced power state. It can be used to wake up the processor from a sleep mode and it provides a number of programmable interrupt options.
The watchdog timer 198 is a 32-bit timer that can be used to implement a software watchdog function that can improve system availability by forcing the processor to a known state via generation of a hardware reset, a non-maskable interrupt or a general-purpose interrupt if the timer expires before being reset by software. The watchdog timer 198 can thus protect the system from remaining in an unknown state when software has stopped running as a result of, for example, external noise or a software error. The watchdog timer 198 is clocked by the system clock.
The VGA colour sensor 180 is configured to supply video frames at PAL (Phase Alternating Line) or NTSC (National Television System Committee) frame rate to a video output unit 182 on the embedded processor.
A signal comprising processed and prioritised status is supplied from the processor core 120 to an output unit 193. A micro SD (uSD) card 195 may be provided, with a connection to the processor core 120. A Universal Asynchronous Receiver Transmitter (UART) port 197 is provided on the embedded processor 110 and forms a connection with an RS 485 module 199. RS 485 is a data communications standard produced by the Electronics Industry Association. The UART port 197 converts bytes into a serial bitstream for supply to attached serial devices.
A Serial Peripheral Interface (SPI) connection 194a is provided between the processor core 120 and the flash memory 194, a similar SPI connection 196a is provided between the processor core 120 and the real-time clock 196, and a further SPI connection 195a is provided between the processor core 120 and the micro Secure Digital (uSD) card 195. A connection 180b between the processor core 120 and the VGA colour sensor 180 is an Inter-Integrated Circuit (I2C) bus. The I2C bus 180a is a simple bus providing arbitration and collision detection and is provided between the VGA monochrome sensor 170 and the processor core 120. A similar I2C bus connection is provided between the processor core 120 and the temperature probe 192.
The embedded processor 110 of Figure 1 is one example hardware implementation of an apparatus configured to perform object tracking according to the present technique. In some embodiments, the apparatus is configured as a flame detection apparatus implementing the object tracking. Accordingly, the processor core 120 is configured to generate image frame bitmaps from incoming monochrome video frames received at the PPI 172 and to generate first-order and second order combined bitmap frames by performing logical combinations, for example, logical OR operations, on pixels of image frames corresponding to different video frames in the incoming sequence. The data cache 154 and the data memory 156 have a limited capacity for storing image frame data. According to the present technique, efficient use is made of available RAM memory by performing the two-stage image frame bitmap combination. This two-stage bitmap combination algorithm may be used, for example, to more efficiently and reliably identify image objects corresponding to flames when the apparatus is implemented as a flame detector.
Flame detection has been found to work better with greyscale cameras than with colour video. One reason for this is that in colour video cameras an infra-red cut filter is typically used to correct colour balance, but it has been recognised that this may cut out infra-red energy from flames. In the Figure 1 embodiment, the colour images are supplied to the user for viewing because viewers tend to prefer to view the colour images. The video sensors 170, 180 are configured to share a single lens via a beam splitter prism such that the colour and mono sensors 170, 180 have the same field of view. The object tracking and flame detection algorithms use the output of the VGA mono sensor 170 but the user views the output of the VGA colour sensor 180. Conveniently, the video output unit 182 may be configured to output in NTSC format if required, whereas the VGA mono sensor 170 may always operate at PAL frequencies as expected by the algorithms.
It will be appreciated that Figure 1 is only one example implementation of an object tracking apparatus according to the present technique. Notable features of the hardware implementation of Figure 1 are inclusion of a black and white VGA resolution camera module 170 delivering 25 image frames per second, a processor having an ALU, 128 MB external RAM, a 128 kB flash memory for firmware and configuration storage and an RS 485 module for configuration and reprogramming. The outputs 193 include status LEDs, relays and a 4 to 20 milliamp (mA) signal.
The apparatus of Figure 1 may be configured for use as a flame detector. The present technique may alternatively be implemented on general-purpose processing hardware such as a personal computer by execution of a computer program application. The present technique may alternatively be implemented using a combination of general-purpose and dedicated hardware and is not limited to the embedded processor arrangement of the embodiment of Figure 1.
Figure 2 schematically illustrates how frames of an incoming image sequence are successively combined to generate a series of bitmap frames representing input image data. In this example a first group of image frame bitmaps 210 represents 25 frames of 25 Hz video data comprising pixel values for a given image attribute. The 25 Hz image frames are 640 pixels wide by 480 pixels high and are non-interlaced (progressive) monochrome in this embodiment. This can be compared with digitised PAL, which is typically 720 x 576 pixels, interlaced and colour, but also at 25 Hz. The image frame bitmaps 210 comprise a single bit value of zero or one representing each pixel of the frame. This is a "characteristic value" representing a multi-bit image attribute value corresponding to a pixel of the image frame sequence supplied to the PPI 172 of Figure 1. The attribute value may be, for example, a brightness value (single frame) or a flicker value (inter-frame difference). In some embodiments logical combinations of image frame bitmaps corresponding to the same time in the image sequence but to different image attributes may be generated and used as input to the uppermost stage of the frame logical combining operation illustrated by Figure 2. The Figure 2 pixel-wise frame combination is performed along a time axis of the image sequence.
A second group 212 of twenty-five image frame bitmaps represents frames 26 to 50 in the input image frame sequence; a third group 214 of 25 image frame bitmaps represents frames 51 to 75 in the input image frame sequence; and a fourth group 216 of image frame bitmaps represents frames 76 to 100 in the input image frame sequence. Each of the groups 210, 212, 214 and 216 of image frame bitmaps spans a predetermined time period, in this case one second of data of the incoming image sequence. In a first order combination of image frame bitmaps, corresponding pixels of each of the 25 pixel frames of the group 210 are combined using a logical OR operator to generate a first order combined bitmap frame 220.
Thus, the image frame bitmaps corresponding to twenty-five image frames in the group 210 have been compressed such that the information is contained in the first order combined bitmap frame 220. Similar first order logical combinations are performed on the second group 212 of image frame bitmaps to generate a first order combined bitmap frame 222. A first order combination is performed on the group of image frame bitmaps 214 to generate a first order combined bitmap frame 224. A first order combination is performed on the group of image frame bitmaps 216 to generate a corresponding first order combined bitmap frame 226. It can be seen that in this example implementation, the first order logical combination is performed on temporally contiguous frames received in an input image sequence and the combining windows 210, 212, 214 and 216 are non-overlapping. However, in alternative embodiments, frames may be selectively included in the first order combination, e.g., to eliminate any corrupted image frames. Furthermore, the time windows for combining the image frames could partially overlap or could be non-contiguous. In particular, the time window could span a time period comprising a plurality of image frames, but could disregard one or more of the image frames within the window when performing the logical combination.
Once the first order combined bitmap frames 220, 222, 224, 226 have been generated, these are then combined in a second order logical combination as shown in Figure 2. In this particular example, the second order logical combination involves combining three of the results of the first order logical combination. In particular, first order combined bitmap frames 220, 222 and 224 are combined in a logical OR operation to generate a second level "OR" result corresponding to the second order combined bitmap frame 230. The second order logical combination in this example differs from the first order logical combination in that a sliding time window is used for the purposes of combining the bitmaps. Thus, although the first order combined bitmap frames 222 and 224 have already been combined to generate the second order combined bitmap frame 230, these are also used together with a subsequent first order combined bitmap frame 226 to generate a different second order combined bitmap frame 232.
Thus it can be seen that a first order bitmap frame combination logically combines bitmap results from N frames and a second order bitmap frame combination combines M of the first order combined bitmap frames using a sliding time window such that the second order combined bitmap frames effectively cover a total of N*M image frame bitmaps of the uppermost sequence of Figure 2 comprising the groups 210, 212, 214 and 216. It will be appreciated that once the second order combined bitmap frame 230 has been generated, the first order combined bitmap frame data 220 may be overwritten in memory. Similarly, as soon as the first order combined bitmap frame 220 has been generated, the image frame bitmaps corresponding to the group of 25 frames 210 no longer need be stored in memory. This makes the image tracking method according to the present technique memory efficient by enabling image frame data combination corresponding to a period of N*M frames without requiring data corresponding to all of those frames to be stored in memory simultaneously. The reduction in data volume also produces a corresponding reduction in processing overhead.
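To make the two-stage, sliding-window combination concrete, a possible sketch is shown below. It assumes the example parameters from Figure 2 (N = 25, M = 3, 640 x 480 packed bitmaps with one bit per pixel); the buffer layout and all identifiers are illustrative assumptions, not the disclosed firmware.

```c
#include <stdint.h>
#include <string.h>

#define WORDS ((640 * 480) / 32)  /* one packed bitmap: 1 bit per pixel        */
#define N     25                  /* image frame bitmaps per first order frame */
#define M     3                   /* first order frames per second order frame */

static uint32_t first_order[M][WORDS]; /* ring buffer of M one-second OR frames */
static uint32_t second_order[WORDS];
static int slot = 0, frames_in_slot = 0;

/* Feed one packed image frame bitmap; returns 1 whenever a new second order
 * combined bitmap frame (spanning N*M source frames) becomes available. */
int feed_frame_bitmap(const uint32_t *frame_bitmap)
{
    if (frames_in_slot == 0)                      /* reuse the oldest slot   */
        memset(first_order[slot], 0, sizeof first_order[slot]);

    for (int w = 0; w < WORDS; w++)               /* first order OR          */
        first_order[slot][w] |= frame_bitmap[w];

    if (++frames_in_slot < N)
        return 0;

    frames_in_slot = 0;
    slot = (slot + 1) % M;                        /* advance sliding window  */

    memset(second_order, 0, sizeof second_order);
    for (int m = 0; m < M; m++)                   /* second order OR over M  */
        for (int w = 0; w < WORDS; w++)
            second_order[w] |= first_order[m][w];
    return 1;
}
```

Note that only the M first order buffers and one accumulator are retained, so the N*M frame history never has to be held in memory at once, mirroring the memory saving described above.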
The image frame bitmaps 210, 212, 214, 216 may each correspond to a respective image of the captured image sequence. However, these image frame bitmaps may alternatively have been generated using inter-frame differences for attribute values, such as the difference between a current frame and an immediately previous frame. Furthermore, although in the example of Figure 2 a bitmap value is provided for each and every pixel of the image frame, it will be appreciated that a single bit could be characteristic of groups of pixels of the PAL image frame, the group of pixels corresponding to an image block, similarly to the way that "macroblocks" are used in image compression. Use of bitmaps comprising image blocks rather than individual pixels may provide a reduction in the data volume to be processed, but could potentially result in some loss of information due to the reduced granularity.
Figure 3A schematically illustrates how three different image frame bitmaps are combined using a logical OR operation to generate a combined image frame bitmap. In this example, for simplicity of illustration, each image frame is represented by a grid of 10 x 10 pixels, but it will be appreciated that embodiments use larger image frames (e.g. 640 x 480 pixels). Brightness attribute values for each pixel received from a captured image have been compared with a brightness threshold such that any multi-bit values greater than or equal to the threshold are assigned a value of one, whereas any brightness values less than the threshold are assigned a value of zero. The brightest pixels are shown as white pixels in Figure 3A whereas the dimmer pixels are shown as shaded pixels. A first image frame bitmap 310 shows a flame-like object in the centre of the image frame. A second image frame 320 shows that, relative to the preceding image frame 310, three additional pixels 322, 324 and 326 have been assigned brightness values above the threshold. Then in the third image frame 330, a number of the bright pixels that previously appeared in the second frame 320 have had reductions in brightness values, resulting in those pixels being allocated characteristic values corresponding to dark pixels. In particular, pixels 332, 334 and 336 are now dark in the third image frame 330. Frame 340 represents a first order combined bitmap frame where the three image frame bitmaps 310, 320 and 330 have been logically combined using an OR operation. As a result of this, any pixel at any spatial position in the image frame that was allocated a brightness value above the threshold in any one of the previous image frames 310, 320 and 330 is allocated a bright value in the first order combined bitmap frame 340. It will be appreciated that the present technique is not limited to assigning values of one to pixels above the threshold and values of zero to pixels below a threshold; the values could be allocated in the opposite sense. Furthermore, logical combinations other than an "OR" operation, or more than one logical operator, could be used to combine frames having different image capture times. Figure 3A is illustrative of a simple pixel-wise logical combination of bitmaps.
Figure 3B is an empirically obtained example of a flame detection process performed according to the present technique. Figure 3B shows three consecutive image frames 350, 360 and 370 that have been bitmapped to highlight bright parts of the field of view. A fourth image frame 380 shows a bitwise OR of the first three image frames 350, 360 and 370. It can be seen that the bitwise OR operations have resulted in a filling in of the voids of the flame image and a smoothing out of the crinkly edges that can be seen in the individual image frames 350, 360 and 370.
If the connected region of white pixels from combined image frame 380 is identified as a region of interest for analysis, that is, if it is identified as an object for tracking in the image sequence, then this flame object may be analysed into the indefinite future in the incoming video stream provided that the object does not stray outside of the OR boundary. It will be appreciated that when the OR frame 380 only includes three source frames it is less likely to capture an accurate "orbit" for an object than an OR frame that includes a larger sample of source frames.
Empirical evidence has demonstrated that extending the OR-frame function to cover a time window of approximately 75 image frames, or three seconds' worth of video assuming a standard PAL frame rate of 25 frames per second, provides reliable object tracking where the image object is a flame. Using these parameters it will be appreciated that the method according to the present technique, unlike previously known methods, can readily handle temporary occlusion of the object being tracked. In fact, provided that the object is not hidden for more than three seconds (assuming a second order combined bitmap frame spanning a three second time period) the tracking outcome of the algorithm according to the present technique should be completely unaffected by any occlusion of the object.
The bitwise OR of the image frame bitmaps captures an average position or uncertainty region of an object to be tracked. A long "averaging window" has a potential disadvantage that it may not adapt very quickly to sudden changes, for example when a smouldering fire suddenly bursts into life and becomes a more aggressive fire. However, this can be addressed by using a two-stage combination in which a first order bitmap combination spans a shorter timescale and a second order combination spans a longer timescale. Maintaining a sliding window of the most recent three one-second OR bitmap frames means that in every second of time, a new one-second combined bitmap frame enters the sliding window and the oldest one-second combined bitmap frame leaves the sliding window. In this way a region of interest calculated over a three second time period becomes resilient to occlusion and yet the method is still adaptive within a second or two to any radical change in behaviour such as flames bursting out.
Figure 4 is a flow chart that schematically illustrates an object tracking method according to the present technique. The process starts at block 410 and proceeds to block 420 where a loop is commenced over a number of different image attributes.
A pixel is a picture element of an image frame that typically represents a greyscale intensity to be displayed for that particular spatial portion of the image. Typically pixel values are integers in the range from zero, which represents black, to 255, which represents white. The 256 possible greyscale intensity values are associated with the fact that a pixel of a greyscale image is typically represented as one byte of data composed of eight bits, with each bit taking on two possible values, leading to 2^8 = 256 possible values. Pixels of colour images require more storage space than pixels of greyscale images because pixels of colour images are typically represented by three values corresponding to red, green and blue intensities. The colour pixel values can alternatively be represented in a chrominance and luminance colour space (Cb, Cr, Y) where the luminance channel Y carries information regarding the brightness of the pixel.
According to the present technique, either monochrome or colour video data may be used as an input image sequence, although monochrome video sensors make more effective flame detectors, at least due to the infra-red filters in colour cameras being detrimental to the flame detection aspect. Thus image attributes may correspond to greyscale intensity values, red, green or blue intensity values, or luminance or chrominance values. For example, where the attribute is a pixel greyscale intensity, at block 430 of the flowchart of Figure 4 a range bitmap is created whereby 8-bit greyscale pixel values of each captured monochrome image frame are compared with a predetermined range comprising a brightness threshold value of, for example, 200 (a value closer to the white end of the 0 to 255 range) such that pixels having a greyscale pixel value greater than or equal to 200 are allocated a value of one in the bitmap whereas pixels having values less than 200 are allocated a value of zero in the image frame bitmap. The image frame bitmap creation process is described in more detail below with reference to the flowchart of Figure 5.
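A minimal sketch of this range-bitmap step, assuming an 8-bit greyscale frame and the example threshold of 200, might look as follows (the function name and byte-per-pixel output are illustrative assumptions):

```c
#include <stdint.h>
#include <stddef.h>

/* Box 430 (sketch): assign characteristic value 1 to pixels whose greyscale
 * intensity meets or exceeds the threshold, and 0 otherwise. */
static void make_range_bitmap(const uint8_t *grey, uint8_t *bitmap,
                              size_t n_pixels, uint8_t threshold)
{
    for (size_t i = 0; i < n_pixels; i++)
        bitmap[i] = (grey[i] >= threshold) ? 1 : 0;
}
```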
Once the image frame bitmap has been created from the input image sequence pixel values at block 430, at block 440 a first order logical OR operation is performed to combine bitmap frames corresponding to different image capture times in the received image sequence. The first order combined bitmap generation is described in more detail below with reference to Figure 6. The first order combined bitmap creation of box 440 involves logically combining a plurality N of the bitmaps generated at block 430, where the number of image frame bitmaps to be combined corresponds to a predetermined time period, such as one second of captured video data. Note that N is a non-zero integer value.
At box 450, results of the first order combined bitmap creation for the given image attribute are output and stored to an available data memory such as the data cache 154 or the data memories 156 or 160 of Figure 1. Each first order combined bitmap in this example combines the attribute data for 25 frames of the input image sequence. The first order combined bitmaps generated at box 450 are subsequently used at box 460 as input data to generate a second order combined bitmap frame by performing a second logical combination of pixel values at corresponding spatial positions of the first order combined bitmaps but corresponding to different times in the image sequence. In this example embodiment, a sliding time window is used as shown in Figure 2, with the window advancing by one first order bitmap frame period per iteration, and a non-zero integer number M of first order combined bitmaps are logically combined to generate the second order combined bitmap frame. In this particular example, three temporally contiguous first order combined bitmap frames are used to generate each second order combined bitmap frame. At box 470 the second order bitmap frames are output and stored in memory. Each second order combined bitmap frame spans a total of M*N image frames, which in this case is 75 image frames corresponding to 3 seconds of PAL video data. The circle 475 indicates the end of the attribute loop that started at box 420. Thus boxes 430, 440, 450, 460 and 470 are repeated for each attribute type (brightness, flicker etc.) selected for analysis in the object tracking exercise.
Once all of the second order combined bitmap frames have been produced for each required image attribute, the process goes to box 480 where a Boolean combination of the second order combined bitmap frames for at least two different attributes can be performed. For example, the second order combined bitmap frame for the brightness attribute may be combined via a logical AND operation with the second order combined bitmap frame for the same or a similar instant in time but corresponding to a flicker attribute. The time may not be exactly the same, for example, where the brightness attribute corresponds to an intra-frame attribute whereas the flicker attribute corresponds to an inter-frame difference. At box 490 the multi-attribute bitmap is output and stored. This is supplied to a blob detection algorithm at box 492, where regions of the image frame that are of interest are selected for further processing. The blob detection function is configured to find connected regions of bright pixels in the multi-attribute bitmap. In a simple implementation the blob detection of box 492 may be performed on a second order combined bitmap frame of a single attribute bitmap such as a greyscale intensity bitmap, skipping the Boolean combination stage of box 480 of Figure 4. The process ends at box 494.
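When the second order frames are held as packed bitmaps, the Boolean combination of box 480 reduces to a word-wise AND. A hedged sketch, with hypothetical names:

```c
#include <stdint.h>
#include <stddef.h>

/* Box 480 (sketch): AND the second order combined bitmap frames for the
 * brightness and flicker attributes so that only blocks exhibiting both
 * behaviours survive into the multi-attribute bitmap passed to box 492. */
static void and_attributes(const uint32_t *bright, const uint32_t *flicker,
                           uint32_t *multi, size_t n_words)
{
    for (size_t w = 0; w < n_words; w++)
        multi[w] = bright[w] & flicker[w];
}
```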
Figure 5 is a flow chart that schematically illustrates image frame bitmap creation based on pixel attribute values of a video image sequence. This is a precursor to the first and second order bitmap combining of Figure 2. At box 510 the process starts and then at box 520 a video frame is captured for bitmap creation. In this example, greyscale intensity values are utilised. At box 530 a loop is performed for a given captured image frame over all pixel positions within the frame. Inside the loop, at box 540 the pixel attribute value is measured or read from memory for a given pixel position (x, y). This is typically an eight-bit value corresponding to a greyscale intensity or luminance. At box 550 it is determined whether or not the attribute value falls within a predetermined range. The pixel attribute value may be based on a single captured image frame or may be based on an inter-frame difference. In order to create the image frame bitmap, a range or at least an upper or a lower threshold for the attribute value is used. For the purposes of this description we shall consider that a threshold is a particular example of a range, the threshold being characterised by a single boundary value rather than having configurable upper and lower limits. In this example the attribute will be taken to be a luminance value with a threshold set at a value of 190. Thus any pixels having a luminance of 190 or greater are allocated a "1" at box 552 whereas any pixels having a luminance of less than 190 are allocated a "0" at box 554 in the bitmap. The circle 560 in Figure 5 corresponds to the end of the loop over the pixel positions of the frame, so that boxes 540, 550 and either 552 or 554 are visited for each pixel of the frame.
Once creation of a single image frame bitmap has been completed in this loop, the process goes to box 570 where the image frame bitmap for all of the pixels of the image frame is stored.
The image frame bitmap values will be processed by the embedded processor of Figure 1 to generate first order combined bitmap frames and second order combined bitmap frames. For the purposes of this calculation, bitmap values corresponding to a plurality of pixels may be packed into a single processor register. In particular, for a 32-bit register, bits of the image frame bitmap corresponding to 32 different pixel positions can be simultaneously stored in a single register. It will be appreciated that a binary value can be stored on a computer as a single bit, and 8 bits can be packed into a single word of an 8-bit processor, 16 bits can be packed into a single word of a 16-bit processor, 32 bits can be packed into a single word of a 32-bit processor and so on. The bus width for a bus connecting the processor and memory, such as the bus 162 in the Figure 1 system, is also a consideration. It is possible to fetch 32 bits in a single RAM access on a system having a 32-bit data (RAM) bus. Assuming a 32-bit processor and a 32-bit data bus width, a single RAM access may fetch 32 results from memory substantially simultaneously, a single logical operator applied to one of the registers 130 (see Figure 1) effectively processes 32 pixels substantially simultaneously and a single bus access can write 32 pixel results back to RAM.
This packing technique is a form of vector processing that does not require special hardware.
Speeding up pixel processing by factors of 16 or 32 for 16-bit or 32-bit processors/buses respectively makes it practical to implement the object tracking and/or flame detection according to the present technique on modest and inexpensive processors such as the Analog Devices Blackfin (RTM) BF533 processor. This also allows for the processor and the image sensor to be provided in the same device. Previously known video based flame detectors required comparatively large processing power and so provided a computer separate from the image sensor. Note that packing of bits in this way can be applied to numerical values other than pixel intensities.
Thus SIMD operations according to the present technique can be performed to implement, for example, the OR operations substantially simultaneously on 32 pixel positions to combine bitmap values corresponding to those 32 pixel positions at different times in the image sequence. The results may be unpacked to separate out the individual pixel values upon completion of the calculation. The process ends at block 580. Note that the image bitmap creation process of Figure 5 is not limited to using a single pixel attribute value to generate the image frame bitmap corresponding to a single instant in time of the image sequence. Instead, image frame bitmaps corresponding to the same instant in time but to two different attribute types may be logically combined in a process analogous to that of the flowchart of Figure 5, prior to the first order combined bitmap frame or the second order combined bitmap frame being generated. The first order and second order combinations involve combining bitmap frames corresponding to different times in the image sequence whereas the original range bitmap creation process of Figure 5 corresponds to the same or a similar point in time.
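The unpacking step mentioned above might, for example, look like this (an assumed helper, not part of the disclosure):

```c
#include <stdint.h>

/* Recover the 0/1 characteristic value of pixel (or block) index i from a
 * packed bitmap once the word-wise logical combinations are complete. */
static inline int get_bit(const uint32_t *packed, unsigned i)
{
    return (int)((packed[i / 32] >> (i % 32)) & 1u);
}
```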
Figure 6 is a flow chart that schematically illustrates the first order combination of image frame bitmaps. The process begins at box 610 and at block 620 a first order logical combination duration is set to a time t1. This covers a non-zero integer number of frames N equal to t1*fps, where fps represents the number of frames per second in the received video sequence (e.g. 25 Hz for PAL). At box 630 a loop is performed over a sequence of N image frame bitmaps such as those produced by the process of Figure 5. At box 640, a loop is performed over all pixel positions in the bitmap frame and in box 650 logical "OR" operations are performed on bitmap values corresponding to the given pixel position by combining all N different image frame bitmaps within the time period t1. Circle 642 corresponds to the end of the inner loop over the pixel positions, whereas circle 632 corresponds to the end of the outer loop over the sequence of N attribute range bitmaps. At box 660, the first order combined bitmap frame results are output for a current group of image frame bitmaps spanning the time t1. As shown in Figure 2, in the example embodiment this corresponds to 25 images in the received video sequence. At box 670 it is determined whether or not there are more attribute bitmaps to process. If all of the attribute bitmaps have already been processed then the process ends at 672, but otherwise the process proceeds to box 680 where the next sequence of N image frame bitmaps is processed to generate a corresponding first order combined bitmap frame.
Figure 7 is a flow chart that schematically illustrates a second order combining of bitmaps. The process starts at element 710 and at box 720 a plurality of first order combined bitmap frames is accessed from memory. At block 730, a duration of the second order combination is set to a time t2 so that overall the temporal combining of bitmap data will span N*M image frames. At box 740, a loop is performed over M first order combined bitmap frames corresponding to a current position of the sliding window. Within this loop, at block 750 an inner loop is performed over all pixel positions of the frame within a given first order bitmap image frame. At block 760, pixel-wise logical "OR" operations are performed combining a non-zero integer number M of the first order combined bitmap frames accessed at box 720. The circle 752 corresponds to the end of the loop over the pixel positions of the frame. At block 770, outside the two loops, the second order combined bitmap frame is output for the current group of first order combined bitmap frames spanning the time t2. At box 780 it is determined whether or not there are any more first order combined bitmap frames to process. If all of the first order combined bitmap frames have already been processed then the process ends at 782. Otherwise, the process goes to box 790 where the sliding window is shifted by a single first order combined bitmap frame. The time in seconds corresponding to the shift will depend upon how many image frame bitmaps are combined in the first order combined bitmap frame and also upon the frame rate of the incoming video sequence.
The object tracking according to the present technique represents image attribute measurements using a single bit per pixel via a bitmap. These individual bitmaps may be combined to represent more complex measurements so that, for example, by ORing together a long series of compound bitmaps, connected regions of interest in a sequence of image frames may be easily yet reliably identified. For example, combinations of characteristics such as brightness AND flickering can be identified, and regions exhibiting the desired combinations of behaviours may be tracked over time using less processing power and less RAM than previously known techniques. Tracking over time allows practical applications of the object tracking such as fire detection (see Figures 8 and 9) by analysing temporal behaviour of the image object, although the tests are primarily spatial in nature.
Figure 8 is a flowchart that schematically illustrates a process for identifying a flame flicker area in an image sequence for use when the apparatus of Figure 1 is implemented as a flame detector. The process starts at element 810 and progresses to box 820 where a sliding window comprising two adjacent greyscale image frames (i.e. monochrome video data) is used. The two greyscale image frames correspond to adjacent times in the image sequence corresponding to a time k and a time (k-1). At box 830 a brightness change image frame is created from a pixel-wise signed difference in brightness for the two adjacent greyscale image frames. As with previous example embodiments, the present technique is not limited to using pixel-wise differences but could instead use values corresponding to blocks of pixels of the frame, such as average attribute values.
At box 840 a brightness increase bitmap is created from the brightness change image created at box 830. The brightness increase bitmap identifies the subset of pixels of the image frame for which the change in brightness is greater than a threshold value. Each pixel of the image frame is tested for this threshold condition. At box 850 a brightness decrease bitmap is created based on the brightness change image calculated at box 830. The brightness decrease bitmap identifies pixels of the image frame for which the change in brightness is such that the brightness has reduced by greater than a certain predetermined threshold. In this particular example, the magnitude of the brightness increase used to identify the relevant pixels (i.e. those satisfying the thresholding criteria) in the brightness increase bitmap at stage 840 and the magnitude of the brightness decrease used to identify the relevant pixels in the brightness decrease bitmap at box 850 are the same, although these two values may differ in alternative embodiments. At box 860 a greyscale change bitmap is generated pixel by pixel by performing a logical combination comprising a logical "OR" of the brightness increase bitmap of box 840 and the brightness decrease bitmap of box 850 corresponding to the same pair of two image frames. At box 870 candidate flame features may be identified from the greyscale change bitmap. The greyscale change bitmap provides an indication of flame flicker in the case of detecting objects corresponding to potential flames in an image. A difference between two greyscale image frames, such as 8-bit frames, may result in a nine-bit per pixel difference value, which may be stored in 16 or 32 bits. According to the present technique, the greyscale frame difference may instead be stored as the brightness increase bitmap and the brightness decrease bitmap, requiring storage of only two bits per pixel. The candidate flame features may be identified according to the object tracking method as described from Figure 3A to Figure 7.
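The Figure 8 flicker test can be sketched as follows, assuming 8-bit greyscale frames and a single symmetric threshold for increase and decrease (as in the example above); all identifiers are illustrative:

```c
#include <stdint.h>
#include <stddef.h>

/* Boxes 830-860 (sketch): signed inter-frame brightness difference, split
 * into increase and decrease bitmaps, then ORed into the greyscale change
 * bitmap used to identify candidate flame flicker areas. */
static void greyscale_change_bitmap(const uint8_t *frame_k,
                                    const uint8_t *frame_k_minus_1,
                                    uint8_t *change, size_t n_pixels,
                                    int threshold)
{
    for (size_t i = 0; i < n_pixels; i++) {
        int diff = (int)frame_k[i] - (int)frame_k_minus_1[i]; /* 9-bit signed */
        uint8_t inc = (diff >  threshold) ? 1 : 0;  /* brightness increase */
        uint8_t dec = (diff < -threshold) ? 1 : 0;  /* brightness decrease */
        change[i] = inc | dec;                      /* box 860: logical OR */
    }
}
```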
Figure 9 is a flow chart that schematically illustrates a vertical motion test according to the present technique. The vertical motion test can be used to distinguish flame-type behaviour from other image objects in fire detectors. Candidate flame objects may be identified as having behaviour consistent with a fire hypothesis when a count indicates a predominant upwards movement of a candidate flame object. The process starts at element 910 and proceeds to box 912 where candidate flame features in an image frame are identified. The candidate flame features may be identified as the top of a detected object in an image or a flame flicker area identified from the OR of the brightness increase and brightness decrease bitmaps calculated at box 860 of the process of Figure 8. A plurality of different flame candidates may be identified at box 912 and at box 920 a loop is performed over the identified flame candidates with the number of candidates being denoted N1. At box 925 a count accumulated for a given flame candidate is initialised to zero.
At box 930 a loop is performed over a number of frames starting from the current frame, which is denoted frame "z" (e.g. frame 100 in a sequence of 100 frames). In this particular example, the current frame and 99 previous frames are included in the count. Empirically, it has been found that, for example, between 64 and 100 previous frames form a good sample upon which to obtain reliable results for flame detection via the accumulated count. Within this loop over the group of previous frames, at box 940 the candidate flame profile is compared in frame z and frame z-1. At box 950 the processor determines whether or not the candidate flame feature has moved closer to the top of the frame since the previous frame and, if this is true, a count is incremented at box 952 and the process returns to process the next pair of previous frames in the frame sequence at box 930 (the frame index z is decremented). It will be appreciated that in alternative embodiments the count could be decremented if upward motion is detected and incremented if downwards motion is detected. The algorithm works in the same way in either case but the sign of the result is different if the count is decremented rather than incremented for upwards motion of a candidate flame object.
If instead at box 950 it is determined that the candidate flame feature has not moved closer to the top of the frame since the previous frame, the process proceeds to box 960, where it is determined whether the candidate flame feature has moved closer to the bottom of the frame since the previous frame. If this is the case then at box 962 the count is decremented by one and the process returns to the loop over frames at box 930 to consider the next pair of frames in the group of 100 contiguous frames. A circle 965 corresponds to the end of the loop over the number of frames. Outside the loop over the number of frames, at box 970 a cumulative count for an individual flame candidate across the frame sample of 100 frames is output, and at box 980 it is determined whether the cumulative count is a positive value. If at box 980 it is found that the cumulative count is negative then at box 982 the flame candidate is disregarded because its behaviour is inconsistent with that of a flame. Alternatively, if at box 980 it is determined that the cumulative count is positive then at box 984 the image object is tagged as a flame and a positive detection has been made. Once an individual candidate flame feature has been categorised at box 982 or 984, the process proceeds to the circle 990 corresponding to the end of the loop over flame candidates, and the next flame candidate is processed until all the flame candidates in the image sequence have been exhausted. The process ends at box 992.
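A minimal sketch of the counting loop of boxes 930 to 965 follows, under the assumption that each candidate is summarised per frame by the row index of its topmost pixel (row 0 being the top of the frame); that per-frame summary and the helper names are illustrative choices, not details prescribed by this disclosure.

def vertical_motion_count(tops):
    # tops: topmost row index of one candidate in each frame of the
    # sample, oldest first (e.g. 100 entries for a 100-frame sample).
    count = 0
    for prev_top, cur_top in zip(tops, tops[1:]):
        if cur_top < prev_top:      # moved closer to the top (box 952)
            count += 1
        elif cur_top > prev_top:    # moved closer to the bottom (box 962)
            count -= 1
    return count

def is_flame_candidate(tops):
    # Boxes 980 and 984: a positive cumulative count tags the object as a flame.
    return vertical_motion_count(tops) > 0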
The flame detection algorithm illustrated by Figure 9, which involves accumulating a positive count value for an image object that moves closer to the top of the frame from one frame to the next, exploits the observation that flames have a tendency to rise up over several image frames and then to collapse back more suddenly as the flame front runs out of fuel. The algorithm of Figure 9 provides a computationally simple yet robust method of detecting this behaviour in the image objects.
Figure 10 schematically illustrates a sequence of consecutive image frames representing a typical fire. A sequence comprising eight different frames is illustrated: a first frame 1010, a second frame 1020, a third frame 1030, a fourth frame 1040, a fifth frame 1050, a sixth frame 1060, a seventh frame 1070 and an eighth frame 1080. As shown in Figure 10, the first frame 1010, the second frame 1020, the fourth frame 1040, the fifth frame 1050 and the eighth frame 1080 are all marked with a "+" symbol. These are the subset of frames of the sequence in which the flame rose higher relative to a previous frame. By way of contrast, in the third frame 1030 and the sixth frame 1060 the top of the flame fell back relative to the previous frame, and so these two frames are marked with a "-" symbol. It has been demonstrated empirically that in the case of real flames the "+" frames have a tendency to dominate. This is because a visible flame is seen to rise and expand over two or three frames due to hot air buoyancy effects and then to collapse back suddenly, typically within a single frame, when the flame front runs out of fuel. The fuel runs out because the evaporation rate of combustible gases is limited by the fixed pool surface area of the fuel. This characteristic flame pattern of rise-rise-collapse tends to occur cyclically throughout the duration of the fire.
Thus, by implementing an algorithm to keep a running tally of "+" and "-" frames for candidate flame objects, it is possible to reliably identify those objects most likely to be real flames. In particular, the number of "+" frames is expected to exceed the number of "-" frames for an image sequence corresponding to a fire. Candidate flame objects that do not exhibit the predominant upwards motion consistent with a fire are rejected as candidates, i.e. considered to be inconsistent with objects corresponding to flames.
According to the present technique, a single binary value (a single bit) may be used per image measurement. The single bit may provide an indication of whether or not the measurement is consistent with a fire hypothesis. The binary values may be retained whilst the measured results themselves are discarded. A plurality of binary results may be combined using logical operators to produce another single binary result.
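As a purely illustrative example of this one-bit-per-measurement representation, eight boolean test results can be packed into a single byte and whole groups of results combined with one bitwise operation; the use of numpy's packbits here is an assumption made for the sketch, since the packing mechanism is not prescribed.

import numpy as np

# Per-pixel test outcomes reduced to single bits (1 = consistent with fire).
brightness_ok = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)
flicker_ok = np.array([1, 1, 0, 1, 0, 1, 1, 0], dtype=np.uint8)

packed_brightness = np.packbits(brightness_ok)  # eight results -> one byte
packed_flicker = np.packbits(flicker_ok)

# A single bitwise AND combines eight binary results at once; the raw
# measurements themselves need never be retained.
combined = packed_brightness & packed_flicker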
The flame detection system according to the present technique, implemented in a detector such as that of Figure 1, works by first identifying a list of candidate regions of interest (or "blobs") in image frames within a field of view using the technique described from Figure 3A through to Figure 7. The object tracking technique described herein also provides the ability to track the detected candidate flame objects over time and to subject the candidate flame objects to one or more temporal tests, such as the vertical motion test of Figure 9. If a candidate flame object reaches a certain age (e.g. in number of image frames) having survived one or more elimination rounds corresponding to the one or more temporal tests, then a fire alarm may be triggered. For example, the "certain age" in some embodiments is 100 frames whilst blob detection typically takes up to 25 frames, giving a total detection time of 4.5 seconds on average.
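The overall flow might be glued together as in the hypothetical sketch below, which reuses the is_flame_candidate helper sketched after the Figure 9 description; the qualifying age, the class layout and the per-frame interface are assumptions made for illustration rather than details of the disclosed embodiments.

QUALIFYING_AGE = 100  # frames; the "certain age" referred to above

class TrackedCandidate:
    def __init__(self):
        self.age = 0    # number of frames this blob has survived so far
        self.tops = []  # per-frame topmost row, input to the vertical test

def update_candidate(candidate, topmost_row):
    # Called once per frame for each tracked blob; returns True when a
    # fire alarm should be raised for this candidate.
    candidate.age += 1
    candidate.tops.append(topmost_row)
    return (candidate.age >= QUALIFYING_AGE
            and is_flame_candidate(candidate.tops[-QUALIFYING_AGE:]))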
One or more software programs that implement or utilize the various techniques described in the embodiments may use an application programming interface (API), reusable controls, and the like. Such programs may be implemented in a high-level procedural or object-oriented programming language to communicate with a computer system. However, the program(s) may be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language, and combined with hardware implementations. The program instructions may be provided on a transitory (e.g. transmission) or a non-transitory (e.g. storage) medium. Where functionality has been described as being implemented by means of software, that functionality could equally be implemented solely in hardware (for example by means of one or more ASICs (application-specific integrated circuits)) or indeed by a mix of hardware and software.
Where functional units have been described as circuitry, the circuitry may be general purpose processor circuitry configured by program code to perform specified processing functions. The circuitry may also be configured by modification to the processing hardware. Configuration of the circuitry to perform a specified function may be entirely in hardware, entirely in software or using a combination of hardware modification and software execution. Program instructions may be used to configure logic gates of general purpose or special-purpose processor circuitry to perform a processing function.
It should be understood that where the functional units described in this specification have been labelled as modules or units, this is to highlight their implementation independence. Note that a module may be implemented, for example, as a hardware circuit comprising custom VLSI circuits or gate arrays, off-the-shelf semiconductors such as logic chips, transistors, or other discrete components. A module or unit may also be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices or the like.
Modules or units may also be implemented in software for execution by various types of processors. An identified module of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions, which may, for instance, be organized as an object, procedure, or function. Nevertheless, the executable program code of an identified module need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the module and achieve the stated purpose for the module.
A module of executable code such as that implementing a function corresponding to a box of a flow chart may be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Similarly, operational data may be identified and illustrated herein within modules, and may be embodied in any suitable form and organized within any suitable type of data structure.
The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices, and may exist, at least partially, merely as electronic signals on a system or network. The modules may be passive or active, including agents operable to perform desired functions. The modules may be implemented at least in part in a cloud computing environment, where processing functions are distributed across different geographical locations.

Claims (31)

1. Method for tracking an object in an image sequence comprising a plurality of image frames, the method comprising: determining a value of an image attribute for a plurality of image blocks of a respective image frame of the image sequence; generating an image frame bitmap for the respective image frame by assigning a characteristic value to each of the plurality of image blocks based upon a comparison of the image attribute with a range; generating a first order combined bitmap frame by performing a first logical combination of the characteristic values of spatially corresponding image blocks of a plurality, N, of image frame bitmaps spanning a first time period in the image sequence; generating a second order combined bitmap frame by performing a second logical combination of values of spatially corresponding image blocks of a plurality, M, of the first order combined bitmap frames; and identifying the object to be tracked using the second order combined bitmap frame.
2. The method of claim 1, wherein the image block comprises a single pixel.
3. The method of claim 1 or claim 2, wherein the range used for comparison of the image attribute value comprises a single predetermined threshold.
4. The method of any one of the preceding claims, wherein at least one of the first logical combination and the second logical combination comprises a logical OR operation.
5. The method of claim 4, wherein successive second order combined bitmap frames are generated using a sliding time window comprising M first order combined bitmap frames such that the second order combined bitmap frame spans M*N image frames, where M and N are non-zero integers.
6. The method of claim 5, wherein generation of the first order combined bitmap frame comprises packing the characteristic image values corresponding to a plurality of different image blocks of a given image frame bitmap into a packed input operand and performing the first logical combination using packed input operands for corresponding image blocks of different ones of the N image frame bitmaps.
7. The method of claim 6, wherein generation of the second order combined bitmap frame comprises packing the values corresponding to a plurality of different image blocks of a given first order combined bitmap frame into a packed input operand and performing the second logical combination using packed input operands for corresponding image blocks of different ones of the M first order combined bitmap frames.
8. The method of claim 6 or claim 7, wherein the packed input operands comprise a number of bits corresponding to a processor register width.
9. The method of any one of the preceding claims, wherein the image attribute is a brightness value of an image block of a single image frame of the image sequence.
10. The method of any one of the preceding claims, wherein the image attribute comprises an inter-frame difference of a parameter value for an image block.
11. The method of claim 10, wherein the inter-frame difference is a flicker amount corresponding to a difference in brightness.
12. The method of any one of the preceding claims, comprising performing a third logical combination of two or more of the second order combined bitmap frames corresponding to respective different attribute values and spanning substantially the same time period in the image sequence.
13. The method of claim 12, wherein the third logical combination comprises combining two different attribute values corresponding to a brightness attribute and a flicker attribute.
14. Apparatus for tracking an object in an image sequence comprising a plurality of image frames, the apparatus comprising: an input interface for receiving the image sequence from a video capture sensor; memory for storing at least a subset of the plurality of image frames; a hardware processor having circuitry configured to: determine a value of an image attribute for a plurality of image blocks of a respective image frame of the image sequence; generate an image frame bitmap for the respective image frame by assigning a characteristic value to each of the plurality of image blocks based upon a comparison of the image attribute with a range; generate a first order combined bitmap frame by performing a first order logical combination of the characteristic values of spatially corresponding image blocks of a plurality, N, of image frame bitmaps spanning a first time period in the image sequence; generate a second order combined bitmap frame by performing a second order logical combination of values of spatially corresponding image blocks of a plurality, M, of the first order combined bitmap frames; and identify the object to be tracked using the second order combined bitmap frame.
15. The apparatus of claim 14, wherein the image block comprises a single pixel.
16. The apparatus of claim 14, wherein the range used for comparison of the image attribute value comprises a single predetermined threshold.
17. The apparatus of any one of the preceding claims, wherein at least one of the first logical combination and the second logical combination comprises a logical OR operation.
18. The apparatus of claim 14, wherein successive second order combined bitmap frames are generated using a sliding time window comprising M first order combined bitmap frames such that the second order combined bitmap frame spans M*N image frames.
19. The apparatus of any one of claims 14 to 18, wherein the hardware processor comprises an X-bit register file for loading operands for performing the first order logical combination and the second order logical combination, and wherein generation of the first order combined bitmap frame comprises packing the characteristic image values corresponding to a plurality of different image blocks of a given image frame bitmap into an X-bit packed input operand for loading into a register of the register file and performing the first logical combination using packed input operands for corresponding image blocks of different ones of the N image frame bitmaps.
20. The apparatus of claim 19, wherein the generation of the second order combined bitmap frame comprises packing X values corresponding to a plurality of different image blocks of a given first order combined bitmap frame into an X-bit packed input operand and performing the second logical combination by loading the X-bit packed input operands for corresponding image blocks of different ones of the M temporally contiguous first order combined bitmap frames into different registers of the register file.
21. The apparatus of any one of the preceding claims, wherein the image attribute is a brightness value of an image block of a single image frame of the image sequence.
22. The apparatus of any one of the preceding claims, wherein the image attribute comprises an inter-frame difference of a parameter value for an image block.
23. The apparatus of claim 22, wherein the inter-frame difference is a flicker amount corresponding to a difference in brightness.
24. The apparatus of any one of the preceding claims, wherein the processor is configured to perform a third logical combination of two or more of the second order combined bitmap frames corresponding to respective different attribute values and spanning substantially the same time period in the image sequence.
25. The apparatus of claim 24, wherein the third logical combination comprises combining two different attribute values corresponding to brightness and flicker.
26. A flame detector apparatus comprising a processor configured to implement the method of any one of claims 1 to 13.
27. An apparatus having processing circuitry configured to implement the method of any one of claims 1 to 13.
28. A computer program product comprising program instructions configured such that, when executed on a processor, they cause the processor to execute the method of any one of claims 1 to 13.
29. A method as substantially hereinbefore described with reference to the accompanying drawings.
30. An apparatus as substantially hereinbefore described with reference to the accompanying drawings.
31. A computer program product as substantially hereinbefore described with reference to the accompanying drawings.
GB1516527.7A 2015-09-17 2015-09-17 Object tracking in an image sequence Active GB2528194B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1516527.7A GB2528194B (en) 2015-09-17 2015-09-17 Object tracking in an image sequence


Publications (3)

Publication Number Publication Date
GB201516527D0 (en) 2015-11-04
GB2528194A (en) 2016-01-13
GB2528194B (en) 2016-08-31

Family

ID=54544429



Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022096949A1 (en) * 2021-06-21 2022-05-12 Sensetime International Pte. Ltd. Method and apparatus for detecting game currency state, electronic device and storage medium
CN120467456B (en) * 2025-07-15 2025-09-19 江西善流慧联科技有限公司 Traffic monitoring method and system based on video image analysis and processing


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110580444A (en) * 2019-06-28 2019-12-17 广东奥园奥买家电子商务有限公司 human body detection method and device
CN110580444B (en) * 2019-06-28 2023-09-08 时进制(上海)技术有限公司 Human body detection method and device
