
CN117975383A - Vehicle positioning and identifying method based on multi-mode image fusion technology - Google Patents

Vehicle positioning and identifying method based on multi-mode image fusion technology

Info

Publication number
CN117975383A
Authority
CN
China
Prior art keywords
visible light
image
light image
model
vehicle
Prior art date
Legal status
Granted
Application number
CN202410387616.2A
Other languages
Chinese (zh)
Other versions
CN117975383B (en)
Inventor
邓乾
刘文平
李思涵
杨凌晨
刘行军
Current Assignee
HUBEI UNIVERSITY OF ECONOMICS
Original Assignee
HUBEI UNIVERSITY OF ECONOMICS
Priority date
Filing date
Publication date
Application filed by HUBEI UNIVERSITY OF ECONOMICS filed Critical HUBEI UNIVERSITY OF ECONOMICS
Priority to CN202410387616.2A priority Critical patent/CN117975383B/en
Publication of CN117975383A publication Critical patent/CN117975383A/en
Application granted granted Critical
Publication of CN117975383B publication Critical patent/CN117975383B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G06N 3/0475 Generative networks
    • G06N 3/08 Learning methods
    • G06N 3/094 Adversarial learning
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V 10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/809 Fusion of classification results, e.g. where the classifiers operate on the same input data
    • G06V 10/811 Fusion of classification results, the classifiers operating on different input data, e.g. multi-modal recognition
    • G06V 10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 20/54 Surveillance or monitoring of activities of traffic, e.g. cars on the road, trains or boats
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a vehicle positioning and identifying method based on a multi-mode image fusion technology, which comprises the following steps: acquiring an infrared image and a corresponding visible light image of a target vehicle in the current environment; judging whether the current environment is a dark light environment, and if so, inputting the infrared image and the visible light image into a fusion generation model to obtain an enhanced visible light image output by the fusion generation model, otherwise taking the visible light image as the enhanced visible light image; and inputting the infrared image and the enhanced visible light image into a vehicle detection model to obtain the position and model of the target vehicle output by the vehicle detection model. The vehicle detection model is trained on second sample infrared images and second sample visible light images together with the position labels and model labels of the corresponding vehicles. The application realizes passive positioning and vehicle-type recognition of the target vehicle and ensures that accurate vehicle positioning and recognition results can be obtained under different illumination conditions.

Description

Vehicle positioning and identifying method based on multi-mode image fusion technology
Technical Field
The application belongs to the technical field of computer vision, and particularly relates to a vehicle positioning and identifying method based on a multi-mode image fusion technology.
Background
The vehicle positioning and identifying technology mainly adopts a target detection technology to accurately position and identify a plurality of target vehicles of different categories in images or videos, and can be applied to the fields of intelligent transportation, automatic driving, security monitoring and the like.
Under night driving conditions, existing vehicle positioning devices, such as license plate positioners and vehicle GPS devices, have the problem of insufficient visibility, which directly affects the accuracy of positioning and identifying vehicles, thereby possibly threatening traffic safety, weakening traffic monitoring efficiency, and delaying vehicle tracking in emergency situations.
Disclosure of Invention
Aiming at the defects of the prior art, the application aims to provide a vehicle positioning and identifying method based on a multi-mode image fusion technology, so as to solve the problem that existing vehicle positioning and identifying techniques have poor accuracy in night environments.
In order to achieve the above object, in a first aspect, the present application provides a vehicle positioning and identifying method based on a multi-mode image fusion technology, including the steps of:
step S101, acquiring an infrared image and a corresponding visible light image of a target vehicle in a current environment;
Step S102, judging whether the current environment is a dark light environment, if so, inputting the infrared image and the visible light image into a fusion generation model to obtain an enhanced visible light image output by the fusion generation model, otherwise, taking the visible light image as the enhanced visible light image;
The fusion generation model is obtained by combining a discrimination model to generate countermeasure training based on the first sample infrared image and the first sample visible light image, and the discrimination model is used for discriminating the authenticity of the sample enhanced visible light image generated by the fusion generation model;
step S103, inputting the infrared image and the enhanced visible light image into a vehicle detection model to obtain the position and model of the target vehicle output by the vehicle detection model;
The vehicle detection model is obtained through training based on the second sample infrared image and the second sample visible light image and the position label and the model label of the corresponding vehicle.
In an optional example, inputting the infrared image and the visible light image into a fusion generation model to obtain an enhanced visible light image output by the fusion generation model, specifically includes:
Inputting the infrared image and the visible light image into a fusion generation model, respectively carrying out convolution processing on the infrared image and the visible light image by the fusion generation model, carrying out splicing processing on characteristics obtained by the convolution processing on characteristic channels, and inputting the characteristics obtained by the splicing processing into a pix2pix generator in the fusion generation model to obtain the enhanced visible light image;
Or the fusion generation model carries out convolution processing on the infrared image and the visible light image respectively, the features obtained by the convolution processing are spliced on the feature channel, the features obtained by the splicing processing are input to the SE attention module in the fusion generation model, and the output result of the SE attention module is input to the pix2pix generator in the fusion generation model to obtain the enhanced visible light image.
In an alternative example, the fusion generation model is specifically trained with the constraint of consistency between the sample enhanced visible light image and the first sample visible light image; the sample enhanced visible light image is generated by fusing a fusion generation model in the training process based on the simulated visible light image and the first sample infrared image; the simulated visible light image is obtained by randomly shielding and darkening the first sample visible light image.
In an alternative example, inputting the infrared image and the enhanced visible light image into a vehicle detection model to obtain a position and a model of the target vehicle output by the vehicle detection model, specifically including:
Inputting the infrared image and the enhanced visible light image into a vehicle detection model, firstly adopting double branches to respectively extract the infrared image characteristic and the visible light image characteristic by the vehicle detection model, respectively extracting the multiscale characteristics of the infrared image characteristic and the visible light image characteristic, calculating the attention weight between the multiscale characteristics of the infrared image characteristic and the visible light image characteristic by using an SE attention mechanism so as to respectively generate an infrared enhanced characteristic and a visible light enhanced characteristic, then carrying out a shuffle operation on the infrared enhanced characteristic and the visible light enhanced characteristic to obtain a mixed characteristic, and finally carrying out vehicle positioning and model classification based on the mixed characteristic to obtain the position and the model of the target vehicle.
In an alternative example, the loss function of the vehicle detection model includes a cross entropy loss between infrared enhancement features and visible light enhancement features, a CIoU loss for a vehicle localization task, and a Focal loss for a vehicle model classification task.
In an alternative example, step S103 further includes:
converting the position of the target vehicle into the position of the target vehicle under a camera coordinate system based on an internal reference matrix of the camera corresponding to the infrared image;
based on the external reference matrix of the camera, the position of the target vehicle in the camera coordinate system is converted into the position of the target vehicle in the world coordinate system.
In a second aspect, the present application provides a vehicle positioning and identification system based on a multi-modal image fusion technique, comprising:
the image acquisition module is used for acquiring an infrared image and a corresponding visible light image of the target vehicle in the current environment;
The fusion generation module is used for judging whether the current environment is a dark light environment, if so, inputting the infrared image and the visible light image into a fusion generation model to obtain an enhanced visible light image output by the fusion generation model, otherwise, taking the visible light image as the enhanced visible light image; the fusion generation model is obtained by combining a discrimination model to generate countermeasure training based on the first sample infrared image and the first sample visible light image, and the discrimination model is used for discriminating the authenticity of the sample enhanced visible light image generated by the fusion generation model;
the vehicle detection module is used for inputting the infrared image and the enhanced visible light image into a vehicle detection model to obtain the position and the model of the target vehicle output by the vehicle detection model; the vehicle detection model is obtained through training based on the second sample infrared image and the second sample visible light image and the position label and the model label of the corresponding vehicle.
In a third aspect, the present application provides an electronic device comprising: at least one memory for storing a program; and at least one processor for executing the program stored in the memory, the program, when executed, being adapted to carry out the method described in the first aspect or any one of the possible implementations of the first aspect.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when run on a processor, causes the processor to perform the method described in the first aspect or any one of the possible implementations of the first aspect.
In a fifth aspect, the application provides a computer program product which, when run on a processor, causes the processor to perform the method described in the first aspect or any one of the possible implementations of the first aspect.
It will be appreciated that the advantages of the second to fifth aspects may be found in the relevant description of the first aspect, and are not described here again.
In general, the above technical solutions conceived by the present application have the following beneficial effects compared with the prior art:
The application provides a vehicle positioning and identifying method based on a multi-mode image fusion technology. An infrared image and a corresponding visible light image of a target vehicle in the current environment are acquired, and whether the current environment is a dark light environment is judged. If so, the fusion generation model fuses the input infrared image and the poorly-lit visible light image to generate a high-quality visible light image, and the vehicle detection model then combines the useful information of the infrared image and of the high-quality visible light image to perform joint target detection. The image information of the two modes is thus fully utilized, passive positioning and vehicle type recognition of the target vehicle are realized, and accurate vehicle positioning and recognition results can be obtained under different illumination conditions.
Drawings
FIG. 1 is a schematic flow chart of a vehicle locating and identifying method according to an embodiment of the present application;
FIG. 2 is a second flow chart of a method for locating and identifying a vehicle according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a network structure of a fusion generation model according to an embodiment of the present application;
FIG. 4 is a second diagram of a network structure of a fusion generation model according to an embodiment of the present application;
FIG. 5 is a detection flow chart of a vehicle detection model provided by an embodiment of the present application;
FIG. 6 is a block diagram of a vehicle locating and identification system provided by an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The terms "first" and "second" and the like in the description and in the claims are used for distinguishing between different objects and not for describing a particular sequential order of objects. For example, the first sample infrared image and the second sample infrared image, etc., are sample infrared images for discriminative training of different models, and are not used to describe a particular order of sample infrared images.
In embodiments of the application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g." in an embodiment should not be taken as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
First, technical terms involved in the embodiments of the present application will be described.
Infrared imaging technology: infrared imaging techniques refer to capturing infrared radiation emitted by an object with an infrared sensor, thereby generating an infrared image. The infrared image can be used for analyzing the heat distribution of an object, and is suitable for monitoring and identification under night or low-visibility conditions.
Visible light imaging technology: the visible light imaging technology is to capture visual information such as appearance and color of an object by using a visible light sensor, and generate a visible light image. The visible light image is suitable for viewing details and features of the object.
The infrared dual-mode camera is a camera device capable of simultaneously acquiring and processing visible light images and infrared images. Such a camera typically comprises two independent sensors: one for capturing visible light images and the other for capturing infrared radiation images. In this way it can provide two different views of a scene, which is very valuable in certain applications. In the visible light mode, the infrared dual-mode camera records color images similar to those of conventional cameras, which are suitable for observing visual features such as the size, shape and color of objects. In the infrared mode, the camera captures the infrared radiation emitted by objects, which is related to their temperature. The camera thus combines visible light photography and infrared thermal imaging and can acquire visible light images and infrared images simultaneously.
Deep learning (Deep Learning) is a branch of machine learning that imitates the working mode of the human brain's neural networks and performs data processing and learning through multi-level neural networks. Its core idea is to learn and extract high-level abstract features from a large amount of data by constructing and training a multi-level neural network model, so as to realize effective classification and prediction of the data. Deep learning can realize accurate positioning of vehicles by constructing a multi-level neural network model and automatically learning complex features in images. By combining traditional geometric methods, such as plane geometry and space geometry, problems such as attitude estimation and scale transformation in the positioning process can be effectively solved.
Object detection (Object Detection) is an important task in the field of computer vision, aimed at accurately locating and identifying a number of objects of different classes in an image or video. Unlike image classification, which only requires determining whether an object is present in an image, object detection requires locating the position of each object in the image and classifying it.
The cross-modal re-identification of an infrared image to a visible image refers to the task of converting the infrared image into a corresponding visible image. Infrared images and visible light images are different physical modalities that differ greatly in image characteristics and content. Through cross-modal re-identification, the infrared image can be converted into a visible light image, so that the understanding and the visualization of the infrared image content are realized.
Next, the technical scheme provided in the embodiment of the present application is described.
The application provides a vehicle positioning and identifying method based on a multi-mode image fusion technology, and fig. 1 is one of flow diagrams of the vehicle positioning and identifying method provided by the embodiment of the application, as shown in fig. 1, and the method comprises the following steps:
step S101, acquiring an infrared image and a corresponding visible light image of a target vehicle in a current environment;
Step S102, judging whether the current environment is a dark light environment, if so, inputting the infrared image and the visible light image into a fusion generation model to obtain an enhanced visible light image output by the fusion generation model, otherwise, taking the visible light image as the enhanced visible light image;
The fusion generation model is obtained by combining a discrimination model to generate countermeasure training based on the first sample infrared image and the first sample visible light image, and the discrimination model is used for discriminating the authenticity of the sample enhanced visible light image generated by the fusion generation model;
step S103, inputting the infrared image and the enhanced visible light image into a vehicle detection model to obtain the position and model of the target vehicle output by the vehicle detection model;
The vehicle detection model is obtained through training based on the second sample infrared image and the second sample visible light image and the position label and the model label of the corresponding vehicle.
Here, the target vehicle, that is, the vehicle that needs to be positioned and identified, may be one or more vehicles, which is not particularly limited in the embodiment of the present application.
Specifically, an infrared dual-mode camera may be used to capture a visible light image and a corresponding infrared image of a target vehicle in a current environment. In consideration of poor visible light image quality in a dark light environment such as a night environment, the embodiment of the application judges whether the current environment is the dark light environment, if so, the fusion generation model is applied to combine the dominant parts of the infrared image and the visible light image, thereby generating a high-definition enhanced visible light image, and for a bright light environment, the step can be omitted, and the visible light image is directly used as the enhanced visible light image. On the basis, the vehicle detection model can carry out combined target detection on the infrared image and the enhanced visible light image to obtain the position and model of the target vehicle. Optionally, the determining whether the current environment is a dark light environment may specifically be performed according to whether the brightness of the collected visible light image is lower than a preset brightness threshold.
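As an illustration of this optional brightness-threshold check, the following is a minimal sketch in Python (using OpenCV and NumPy). The threshold value of 60 on a 0-255 scale and the use of the mean grayscale intensity are assumptions made only for illustration, since the embodiment merely states that the brightness of the collected visible light image is compared against a preset brightness threshold.

```python
import cv2
import numpy as np

def is_dark_environment(visible_bgr, brightness_threshold=60.0):
    """Judge whether the current environment is a dark light environment.

    The mean grayscale intensity of the visible light image is compared against a
    preset brightness threshold (the value 60 is an assumed example).
    """
    gray = cv2.cvtColor(visible_bgr, cv2.COLOR_BGR2GRAY)   # convert to grayscale
    return float(np.mean(gray)) < brightness_threshold     # dark if below the threshold
```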
It can be understood that the first sample infrared image and the first sample visible light image, and the second sample infrared image and the second sample visible light image are respectively high-quality sample image pairs for training different models, and can be acquired by using an infrared dual-mode camera under the condition of more sufficient illumination. Further, after the model of the target vehicle is acquired, the parameter information of the target vehicle may be acquired according to a correspondence between the pre-stored vehicle model and the parameter information. The parameter information of brands, years, colors and the like corresponding to the vehicle model can be obtained through a web crawler.
According to the method provided by the embodiment of the application, the infrared image and the corresponding visible light image of the target vehicle in the current environment are obtained, and whether the current environment is a dark light environment is judged. If so, the fusion generation model fuses the input infrared image and the poorly-lit visible light image to generate a high-quality visible light image, and the vehicle detection model then combines the useful information of the infrared image and of the high-quality visible light image to perform joint target detection. The image information of the two modes is fully utilized, passive positioning and vehicle type recognition of the target vehicle are realized, and accurate vehicle positioning and recognition results can be obtained under different illumination conditions.
Based on the above embodiment, inputting the infrared image and the visible light image into a fusion generation model, and obtaining the enhanced visible light image output by the fusion generation model specifically includes:
Inputting the infrared image and the visible light image into a fusion generation model, respectively carrying out convolution processing on the infrared image and the visible light image by the fusion generation model, carrying out splicing processing on characteristics obtained by the convolution processing on characteristic channels, and inputting the characteristics obtained by the splicing processing into a pix2pix generator in the fusion generation model to obtain the enhanced visible light image;
Or the fusion generation model carries out convolution processing on the infrared image and the visible light image respectively, the features obtained by the convolution processing are spliced on the feature channel, the features obtained by the splicing processing are input to the SE attention module in the fusion generation model, and the output result of the SE attention module is input to the pix2pix generator in the fusion generation model to obtain the enhanced visible light image.
It should be noted that the fusion generation model adopts an improved pix2pix network structure, which further improves the image quality of the generated enhanced visible light image and hence the accuracy of vehicle positioning and recognition. The improved pix2pix network structure provides two structural alternatives. The SE attention module used in the second alternative is a channel attention module: it performs channel feature enhancement on the input feature map without changing its size, and can further improve the image generation effect.
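A minimal PyTorch sketch of the modified generator input described above (per-modality convolution, channel concatenation, optional SE attention, then the pix2pix generator) is given below. The channel counts, the kernel size and the assumption that the pix2pix generator body accepts the concatenated feature map as its input are illustrative choices, not details taken from the patent.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """SE channel attention: re-weights channels without changing the feature map size."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w

class FusionGenerator(nn.Module):
    """Input stage of the improved pix2pix generator (illustrative sketch)."""
    def __init__(self, pix2pix_generator, use_se=True, feat_ch=32):
        super().__init__()
        self.conv_ir = nn.Conv2d(1, feat_ch, kernel_size=3, padding=1)   # infrared branch
        self.conv_vis = nn.Conv2d(3, feat_ch, kernel_size=3, padding=1)  # visible light branch
        self.se = SEBlock(2 * feat_ch) if use_se else nn.Identity()
        # The pix2pix generator body is assumed to be built to accept 2 * feat_ch input channels.
        self.generator = pix2pix_generator

    def forward(self, infrared, visible):
        fused = torch.cat([self.conv_ir(infrared), self.conv_vis(visible)], dim=1)
        return self.generator(self.se(fused))                # enhanced visible light image
```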
In addition, in the cross-modal task of converting the infrared image into a corresponding visible light image, the original visible light image is also incorporated, which further improves the image quality of the generated enhanced visible light image.
Based on any one of the above embodiments, the fusion generation model is specifically trained with a constraint of consistency between the sample enhanced visible light image and the first sample visible light image; the sample enhanced visible light image is generated by fusing a fusion generation model in the training process based on the simulated visible light image and the first sample infrared image; the simulated visible light image is obtained by randomly shielding and darkening the first sample visible light image.
It can be understood that if the first sample infrared image and the first sample visible light image were used directly as input samples of the fusion generation model, the training label, that is, the fused image label, would be difficult to obtain. For this reason, in the embodiment of the application the first sample visible light image is first randomly occluded and darkened to obtain a simulated visible light image, so that some regions of the simulated visible light image may be black, as in a dark environment. The simulated visible light image and the first sample infrared image are then used as input samples of the fusion generation model, and the first sample visible light image is used as the training label; in other words, the consistency between the sample enhanced visible light image generated by the fusion generation model and the first sample visible light image is used as a constraint to train the initial fusion generation model, and the trained fusion generation model is finally obtained.
The consistency between the sample enhanced visible light image and the first sample visible light image can be judged by the discrimination model, and the discrimination result is then used to constrain the training of the fusion generation model.
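One possible way to produce the simulated visible light image used as the training input is sketched below; the occlusion ratio and darkening factor are assumed values, since the patent only states that the first sample visible light image is randomly occluded and darkened.

```python
import numpy as np

def simulate_low_light(visible, occlusion_ratio=0.3, darken_factor=0.3, rng=None):
    """Randomly darken and occlude a well-lit visible light sample (illustrative parameters)."""
    rng = rng if rng is not None else np.random.default_rng()
    img = visible.astype(np.float32) * darken_factor                 # global darkening
    h, w = img.shape[:2]
    oh, ow = int(h * occlusion_ratio), int(w * occlusion_ratio)
    y0, x0 = rng.integers(0, h - oh), rng.integers(0, w - ow)        # random occlusion position
    img[y0:y0 + oh, x0:x0 + ow] = 0                                  # black out the region
    return img.astype(visible.dtype)
```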
Based on any one of the above embodiments, inputting the infrared image and the enhanced visible light image to a vehicle detection model, obtaining a position and a model of the target vehicle output by the vehicle detection model specifically includes:
Inputting the infrared image and the enhanced visible light image into a vehicle detection model, firstly adopting double branches to respectively extract the infrared image characteristic and the visible light image characteristic by the vehicle detection model, respectively extracting the multiscale characteristics of the infrared image characteristic and the visible light image characteristic, calculating the attention weight between the multiscale characteristics of the infrared image characteristic and the visible light image characteristic by using an SE attention mechanism so as to respectively generate an infrared enhanced characteristic and a visible light enhanced characteristic, then carrying out a shuffle operation on the infrared enhanced characteristic and the visible light enhanced characteristic to obtain a mixed characteristic, and finally carrying out vehicle positioning and model classification based on the mixed characteristic to obtain the position and the model of the target vehicle.
It should be noted that the vehicle detection model adopts an improved Dual-YOLO network structure and designs a Dual-Fusion (D-Fusion) module, which comprises an attention fusion stage composed of an Inception module and an Attention-Fusion module, followed by a Fusion-Shuffle module connected in series, so as to effectively fuse the features of the two different modes (a code sketch of these two fusion steps is given after the output description below). The Inception module extracts multi-scale features to reduce the computational cost. The Attention-Fusion module calculates the attention weights between the infrared and visible light image features using the SE attention mechanism and generates two enhanced features: the attention feature vector computed from the infrared image features is combined with the visible light image features to generate the enhanced visible light image features, and vice versa. The Fusion-Shuffle module further enhances and shuffles the enhanced features. The detection module performs vehicle positioning and model classification based on the mixed features and adopts four detection heads, each responsible for detecting target objects of a different scale (small, medium, large and ultra-large), so that targets of different sizes are covered and the comprehensiveness of detection is ensured.
By the design, the Dual-YOLO architecture not only reduces redundant information, but also effectively accelerates the convergence speed of the network. Experimental results show that the infrared target detection performance is remarkably improved by the framework, and an effective solution is provided for target detection at night or under low illumination conditions.
The input of the vehicle detection model is an infrared image and an enhanced visible light image, and the output comprises:
position coordinates of the detection frame: including x, y coordinates representing the center position of the frame, as well as the width and height;
Target class probability: each detection frame outputs the probability of each vehicle class, and the most probable class indicates the model of the vehicle in the detection frame;
Confidence level: indicating the likelihood that the detection frame contains a target vehicle.
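The following PyTorch sketch illustrates the Attention-Fusion and Fusion-Shuffle steps referred to above: the attention vector computed from one modality re-weights the channels of the other, and the two enhanced features are then concatenated and channel-shuffled into the mixed feature. The exact layer layout, the residual addition and the shuffle group number are assumptions; the patent only specifies the use of the SE attention mechanism and a shuffle operation.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Cross-modal SE attention between infrared and visible light features (illustrative)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc_ir = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
                                   nn.Linear(channels // reduction, channels), nn.Sigmoid())
        self.fc_vis = nn.Sequential(nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
                                    nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, f_ir, f_vis):
        b, c, _, _ = f_ir.shape
        w_from_ir = self.fc_ir(self.pool(f_ir).view(b, c)).view(b, c, 1, 1)     # attention from infrared
        w_from_vis = self.fc_vis(self.pool(f_vis).view(b, c)).view(b, c, 1, 1)  # attention from visible
        vis_enhanced = f_vis + f_vis * w_from_ir   # visible features enhanced by infrared attention
        ir_enhanced = f_ir + f_ir * w_from_vis     # infrared features enhanced by visible attention
        return ir_enhanced, vis_enhanced

def fusion_shuffle(ir_enhanced, vis_enhanced, groups=2):
    """Concatenate the two enhanced features and interleave their channels (channel shuffle)."""
    b, c, h, w = ir_enhanced.shape
    mixed = torch.cat([ir_enhanced, vis_enhanced], dim=1)                     # (b, 2c, h, w)
    mixed = mixed.view(b, groups, (2 * c) // groups, h, w).transpose(1, 2)    # split into groups and swap
    return mixed.reshape(b, 2 * c, h, w)                                      # mixed feature for detection
```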
Based on any of the above embodiments, the loss function of the vehicle detection model includes a cross entropy loss between the infrared enhancement features and the visible light enhancement features, a CIoU loss for the vehicle localization task, and a Focal loss for the vehicle model classification task.
It should be noted that, in the design of the loss function, the cross entropy loss between the infrared enhancement features and the visible light enhancement features, i.e. the feature entropy loss, is used to penalize redundant features in the attention fusion module and thereby improve the generalization capability of the vehicle detection model. The positioning loss and the classification loss adopt the CIoU loss and the Focal loss to improve the accuracy and stability of detection: the CIoU loss measures the position error between the prediction frame and the ground-truth frame so as to stabilize box regression, while the Focal loss measures the error between the predicted category and the true category.
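A hedged sketch of how such a combined loss could be assembled is shown below. The feature entropy term treats the channel-wise softmax of the globally pooled enhanced features as distributions, which is one possible reading of the cross entropy between the infrared and visible light enhancement features; the loss weights and the use of torchvision's CIoU and Focal loss helpers (with boxes in (x1, y1, x2, y2) format and one-hot classification targets) are assumptions, not details fixed by the patent.

```python
import torch
from torchvision.ops import complete_box_iou_loss, sigmoid_focal_loss

def feature_entropy_loss(ir_feat, vis_feat):
    """Cross entropy between channel distributions of the two enhanced features (one interpretation)."""
    p_ir = torch.softmax(ir_feat.mean(dim=(2, 3)), dim=1)
    p_vis = torch.softmax(vis_feat.mean(dim=(2, 3)), dim=1)
    return -(p_ir * torch.log(p_vis + 1e-8)).sum(dim=1).mean()

def detection_loss(pred_boxes, gt_boxes, cls_logits, cls_targets, ir_feat, vis_feat,
                   w_feat=0.1, w_box=1.0, w_cls=1.0):
    """Total loss = feature entropy loss + CIoU localization loss + Focal classification loss.

    The weights w_feat, w_box and w_cls are assumed values; boxes are expected in
    (x1, y1, x2, y2) format and cls_targets as one-hot float tensors.
    """
    loss_feat = feature_entropy_loss(ir_feat, vis_feat)
    loss_box = complete_box_iou_loss(pred_boxes, gt_boxes, reduction="mean")   # CIoU loss
    loss_cls = sigmoid_focal_loss(cls_logits, cls_targets, reduction="mean")   # Focal loss
    return w_feat * loss_feat + w_box * loss_box + w_cls * loss_cls
```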
Based on any of the above embodiments, step S103 further includes:
converting the position of the target vehicle into the position of the target vehicle under a camera coordinate system based on an internal reference matrix of the camera corresponding to the infrared image;
based on the external reference matrix of the camera, the position of the target vehicle in the camera coordinate system is converted into the position of the target vehicle in the world coordinate system.
It will be appreciated that the position output by the vehicle detection model is essentially the voxel information of the target vehicle's detection frame, and the camera's internal and external parameters are also required to calculate the actual position of the target vehicle. The camera corresponding to the infrared image is the camera used in step S101 to collect the infrared image and the visible light image, namely the infrared dual-mode camera.
Specifically, the embodiment of the application firstly acquires detection frame information of a vehicle, including a center position (X, Y) and a size (W, H), by using a vehicle detection model based on a double-YOLO architecture.
Since the detection frame directly gives the center of the vehicle in the image, the position (u, v) of the center point in the image coordinate system is taken as u = X, v = Y (scaled by the image width and height if the detection model outputs normalized coordinates).
Using the internal reference matrix K, and the external reference matrix consisting of the rotation matrix R and the translation matrix T, (u, v) is converted into a position in the world coordinate system. First, the image coordinates (u, v) are converted into a point Pc = (Xc, Yc, Zc) in the camera coordinate system: Pc = Zc · K^(-1) · [u, v, 1]^T, where Zc is the depth of the target along the camera's optical axis.
Then, the external parameter matrix is used to convert Pc into a point Pw in the world coordinate system: Pw = R^(-1) · (Pc - T).
Thus, the accurate position of each vehicle in the world coordinate system can be obtained, and high-precision vehicle positioning is realized.
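The back-projection described above can be written compactly with NumPy as follows. Note that a single image only fixes a viewing ray, so the depth Zc along the camera's optical axis must be supplied or estimated separately (for example from a known camera height and a ground-plane assumption), and the extrinsics are assumed to satisfy Pc = R · Pw + T.

```python
import numpy as np

def pixel_to_world(u, v, depth, K, R, T):
    """Convert the detection frame center (u, v) to world coordinates.

    K: 3x3 internal reference matrix; R: 3x3 rotation matrix; T: translation 3-vector,
    with the convention Pc = R @ Pw + T. depth is the assumed distance Zc of the target
    along the optical axis.
    """
    pixel = np.array([u, v, 1.0])
    p_cam = depth * (np.linalg.inv(K) @ pixel)   # point in the camera coordinate system
    p_world = np.linalg.inv(R) @ (p_cam - T)     # point in the world coordinate system
    return p_world
```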
Based on any of the above embodiments, it is noted that current vehicle positioning technology does not perform well at night: the reflective effect of traditional reflective materials is limited under low-illumination conditions, lighting devices are insufficiently configured, and the design of the devices themselves fails to fully consider the night-time use environment. These factors act together and make it difficult for night-time vehicle positioning devices to reach daytime visibility levels.
Aiming at the defects of the prior art, the application aims to provide a vehicle positioning and identifying method based on a multi-mode image fusion technology, which realizes the passive positioning of a vehicle by using an infrared dual-mode camera to reconstruct the three-dimensional position and calculate the coordinates of the vehicle in all-weather (daytime and night) modes.
The vehicle positioning and identifying method based on the multi-mode image fusion technology can be applied to the fields of intelligent transportation, automatic driving, security monitoring and the like. Fig. 2 is a second flow chart of a vehicle positioning and identifying method according to an embodiment of the present application, as shown in fig. 2, the vehicle positioning method includes steps S10 to S40, and is described in detail as follows:
S10, acquiring a picture of a vehicle and corresponding public vehicle parameter information from the Internet; shooting an infrared image and a corresponding visible light image by using an infrared dual-mode camera to manufacture a corresponding data set;
S20, training the data set of the infrared image and the visible light image by using an improved pix2pix model, and predicting the input infrared image and the visible light image with insufficient light in a night environment to obtain a high-quality visible light image;
S30, carrying out joint target detection on the infrared image and the visible light image by adopting an improved vehicle detection model based on the double-YOLO architecture;
S40, according to the target detection result and in combination with the internal and external reference information of the camera, converting the vehicle position from the pixel coordinate system to the world coordinate system, thereby realizing positioning.
According to the application, a data set is acquired from the Internet and the dual-mode camera, a cross-mode conversion model and a vehicle detection model are trained, voxel information in a labeling frame, namely a detection frame, is acquired, and the actual position of the vehicle is calculated by utilizing the internal parameters and the external parameters of the camera, so that the three-dimensional coordinates of one or more vehicles under the infrared dual-mode camera can be calibrated, and the passive positioning of the vehicle is realized.
In this step S10, the specific steps of obtaining the vehicle picture and the corresponding model and vehicle parameter information may be:
(1) A crawler program is written in Python to acquire vehicle pictures and extract parameter information such as model, brand, year and color. Using an XML library of Python, an XML file can be created, and the picture path and the corresponding parameter information are stored in XML nodes according to a fixed structure and specification. The obtained vehicle pictures and models are cleaned and processed, stored as a data set in VOC format, and the parameter information corresponding to each model is established with XML and saved in an XML file.
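A minimal sketch of storing one crawled record with Python's standard xml library is given below; the tag names and field set are illustrative, since the patent only requires that the picture path and parameter information be written into XML nodes with a fixed structure.

```python
import xml.etree.ElementTree as ET

def save_vehicle_record(xml_path, picture_path, model, brand, year, color):
    """Write one vehicle picture path and its parameter information to an XML file.

    The node names ("vehicle", "picture_path", "parameters", ...) are assumed for illustration.
    """
    vehicle = ET.Element("vehicle")
    ET.SubElement(vehicle, "picture_path").text = picture_path
    params = ET.SubElement(vehicle, "parameters")
    ET.SubElement(params, "model").text = model
    ET.SubElement(params, "brand").text = brand
    ET.SubElement(params, "year").text = year
    ET.SubElement(params, "color").text = color
    ET.ElementTree(vehicle).write(xml_path, encoding="utf-8", xml_declaration=True)
```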
(2) An infrared dual-mode camera is used to shoot infrared images and corresponding visible light images in the daytime to produce the corresponding data set.
Wherein, in step S20:
In the night environment, the improved pix2pix network is used as the fusion generation model, and the model is trained with the data from sub-step (2) of step S10. The input of the model is a visible light image in a low-illumination environment and the corresponding infrared image, and the model outputs a high-definition visible light image, namely the enhanced visible light image. The daytime environment does not require this step and proceeds directly to step S30.
Fig. 3 is one of the schematic diagrams of the network structure of the fusion generation model provided in the embodiment of the present application. As shown in Fig. 3, compared with the standard pix2pix network structure, the improved pix2pix modifies the input part of the generator: the two inputs are convolved separately and the convolved features are concatenated along the feature channel, thereby obtaining the fusion generation model. Fig. 4 is the second schematic diagram of the network structure of the fusion generation model provided in the embodiment of the present application. As shown in Fig. 4, compared with the network structure of Fig. 3, an SE attention module is added; the SE attention module performs channel feature enhancement on the input feature map without changing its size.
After the trained model is obtained, a visible light image and an infrared image with insufficient light are input in the night environment, and a visible light image with higher definition is generated.
Wherein, in step S30:
First, the dataset prepared in sub-step (2) of step S10 is used to train a vehicle detection model under visible light with the improved YOLO model; the improved YOLO model of the present application will now be described in detail. The application carries out fusion target detection on the infrared image and the visible light image based on the double-YOLO architecture, acquires the detection frames of target detection, and divides the detection frames.
Fig. 5 is a detection flow chart of the vehicle detection model provided by an embodiment of the present application. As shown in Fig. 5, the main design based on the Double-YOLO architecture is as follows: based on the YOLO design, a hierarchical recognition structure comprising P1 to P6 is adopted, and feature extraction uses a double-branch backbone to extract the infrared and visible light image features respectively. A Dual-Fusion (D-Fusion) module is designed, including an attention fusion stage composed of an Inception module and an Attention-Fusion module, and a Fusion-Shuffle module connected in series, aiming to effectively fuse the features of the two different modalities. The Inception module extracts multi-scale features and reduces the computational cost. The Attention-Fusion module enhances the infrared and visible light features with each other through the SE attention mechanism. The Fusion-Shuffle module integrates the infrared and visible light features through a shuffle operation so that the network adapts to both modes.
In addition, the double YOLO architecture adopts four detection heads, so that targets with different sizes are covered, and the comprehensiveness of detection is ensured. In the design of the Loss function, the feature entropy Loss is used for punishing redundant features in the fusion module, and the positioning Loss and the classification Loss adopt CIoU and Focal Loss to improve the accuracy and the stability of detection.
By the design, the Dual-YOLO architecture not only reduces redundant information, but also effectively accelerates the convergence speed of the network. Experimental results show that the infrared target detection performance is remarkably improved by the framework, and an effective solution is provided for target detection at night or under low illumination conditions.
Then, the relevant parameters are configured according to the above method, and the data set is divided into a training set and a validation set at a ratio of 8:2. Training is performed using the training module of the improved YOLO model according to the embodiment of the application, and the vehicle detection model is finally obtained.
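The 8:2 split can be done, for example, as in the following sketch; the fixed random seed and the representation of each sample as a tuple are assumptions made only for illustration.

```python
import random

def split_dataset(sample_pairs, train_ratio=0.8, seed=42):
    """Shuffle paired samples and split them into a training set and a validation set (8:2)."""
    pairs = list(sample_pairs)
    random.Random(seed).shuffle(pairs)              # reproducible shuffle with an assumed seed
    cut = int(len(pairs) * train_ratio)
    return pairs[:cut], pairs[cut:]                 # training set, validation set
```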
Next, the infrared image and the corresponding visible light image are collected as training data. The images should be paired, i.e. the infrared image and the visible image correspond to each other. The collected images are preprocessed, including adjusting the image size, normalizing, removing noise, etc. The infrared image and the visible image are paired and a data set is created such that each sample contains a pair of infrared image and visible image.
The model is trained using the preprocessed training data set. In each training iteration, the model accepts a pair of infrared and visible light images as input, calculates the loss function, and updates the model parameters through the back propagation algorithm. The training process continues until the model performance reaches a predetermined threshold. The input of the model is an infrared image and an enhanced visible light image, and the output is the target detection frame and category information.
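A schematic training iteration consistent with the description above might look as follows; the batch structure returned by the data loader, the model's output tuple and the loss_fn signature (matching the combined loss sketched earlier) are assumptions, not details fixed by the patent.

```python
import torch

def train_one_epoch(model, loader, optimizer, loss_fn, device="cuda"):
    """One epoch over paired infrared / enhanced visible light samples with box and class labels."""
    model.train()
    for infrared, visible, gt_boxes, gt_classes in loader:
        infrared, visible = infrared.to(device), visible.to(device)
        gt_boxes, gt_classes = gt_boxes.to(device), gt_classes.to(device)
        pred_boxes, cls_logits, ir_feat, vis_feat = model(infrared, visible)   # assumed output tuple
        loss = loss_fn(pred_boxes, gt_boxes, cls_logits, gt_classes, ir_feat, vis_feat)
        optimizer.zero_grad()
        loss.backward()          # back propagation
        optimizer.step()         # update the model parameters
```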
Wherein, in step S40:
the embodiment of the application firstly obtains the detection frame information of the vehicle by utilizing a vehicle detection model based on an improved double YOLO architecture, wherein the detection frame information comprises the center position (X, Y) and the size (namely the width and the height) (W, H) of the detection frame.
Since the detection frame directly gives the center of the vehicle in the image, the position (u, v) of the center point in the image coordinate system is taken as u = X, v = Y (scaled by the image width and height if the detection model outputs normalized coordinates).
Using the internal reference matrix K, and the external reference matrix consisting of the rotation matrix R and the translation matrix T, (u, v) is converted into a position in the world coordinate system. First, the image coordinates (u, v) are converted into a point Pc = (Xc, Yc, Zc) in the camera coordinate system: Pc = Zc · K^(-1) · [u, v, 1]^T, where Zc is the depth of the target along the camera's optical axis.
Then, the external parameter matrix is used to convert Pc into a point Pw in the world coordinate system: Pw = R^(-1) · (Pc - T).
Thus, the accurate position of each vehicle in the world coordinate system can be obtained, and high-precision vehicle positioning is realized. Through this process, not only the position of the vehicle but also its orientation and posture can be determined.
Based on any one of the embodiments, the embodiment of the application provides a vehicle positioning and identifying system based on a multi-mode image fusion technology. FIG. 6 is a block diagram of a vehicle locating and recognition system according to an embodiment of the present application, as shown in FIG. 6, the system includes:
The image acquisition module 610 is configured to acquire an infrared image and a corresponding visible light image of a target vehicle in a current environment;
The fusion generation module 620 is configured to determine whether the current environment is a dark light environment, if so, input the infrared image and the visible light image to a fusion generation model to obtain an enhanced visible light image output by the fusion generation model, otherwise, take the visible light image as the enhanced visible light image; the fusion generation model is obtained by combining a discrimination model to generate countermeasure training based on the first sample infrared image and the first sample visible light image, and the discrimination model is used for discriminating the authenticity of the sample enhanced visible light image generated by the fusion generation model;
A vehicle detection module 630, configured to input the infrared image and the enhanced visible light image into a vehicle detection model, and obtain a position and a model of the target vehicle output by the vehicle detection model; the vehicle detection model is obtained through training based on the second sample infrared image and the second sample visible light image and the position label and the model label of the corresponding vehicle.
It can be understood that the detailed functional implementation of each module may be referred to the description in the foregoing method embodiment, and will not be repeated herein.
Based on the method in the above embodiment, the embodiment of the application provides an electronic device. Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application, as shown in fig. 7, the electronic device may include: processor 710, communication interface (Communications Interface) 720, memory 730, and communication bus 740, wherein processor 710, communication interface 720, memory 730 communicate with each other via communication bus 740. Processor 710 may invoke logic instructions in memory 730 to perform the methods of the embodiments described above.
Further, the logic instructions in the memory 730 described above may be implemented in the form of software functional units and may be stored in a computer readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application.
Based on the method in the above embodiment, the embodiment of the present application provides a computer-readable storage medium storing a computer program, which when executed on a processor, causes the processor to perform the method in the above embodiment.
Based on the method in the above embodiments, an embodiment of the present application provides a computer program product, which when run on a processor causes the processor to perform the method in the above embodiments.
It is to be appreciated that the processor in embodiments of the present application may be a central processing unit (CPU), another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general purpose processor may be a microprocessor, but in the alternative, it may be any conventional processor.
The steps of the method in the embodiment of the present application may be implemented by hardware, or may be implemented by a processor executing software instructions. The software instructions may be composed of corresponding software modules that may be stored in random access memory (RAM), flash memory, read-only memory (ROM), programmable ROM (PROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC.
In the above embodiments, it may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, produces a flow or function in accordance with embodiments of the present application, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in or transmitted across a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by a wired (e.g., coaxial cable, fiber optic, digital Subscriber Line (DSL)), or wireless (e.g., infrared, wireless, microwave, etc.). The computer readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid State Drive (SSD)), etc.
It will be appreciated that the various numerical numbers referred to in the embodiments of the present application are merely for ease of description and are not intended to limit the scope of the embodiments of the present application.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the application and is not intended to limit the application, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the application are intended to be included within the scope of the application.

Claims (10)

1. The vehicle positioning and identifying method based on the multi-mode image fusion technology is characterized by comprising the following steps of:
Step S101, acquiring an infrared image and a corresponding visible light image of a target vehicle in a current environment;
Step S102, judging whether the current environment is a dark light environment, if so, inputting the infrared image and the visible light image into a fusion generation model to obtain an enhanced visible light image output by the fusion generation model, otherwise, taking the visible light image as the enhanced visible light image;
wherein the fusion generation model is obtained through generative adversarial training together with a discrimination model, based on a first sample infrared image and a first sample visible light image, and the discrimination model is used for discriminating the authenticity of a sample enhanced visible light image generated by the fusion generation model;
Step S103, inputting the infrared image and the enhanced visible light image into a vehicle detection model to obtain the position and model of the target vehicle output by the vehicle detection model;
wherein the vehicle detection model is obtained through training based on a second sample infrared image and a second sample visible light image, together with the position label and the model label of the corresponding vehicle.
2. The method according to claim 1, wherein inputting the infrared image and the visible light image into the fusion generation model to obtain the enhanced visible light image output by the fusion generation model specifically comprises:
inputting the infrared image and the visible light image into the fusion generation model, wherein the fusion generation model performs convolution processing on the infrared image and the visible light image respectively, concatenates the features obtained by the convolution processing along the feature channels, and inputs the concatenated features into a pix2pix generator in the fusion generation model to obtain the enhanced visible light image;
or the fusion generation model performs convolution processing on the infrared image and the visible light image respectively, concatenates the features obtained by the convolution processing along the feature channels, inputs the concatenated features into an SE attention module in the fusion generation model, and inputs the output of the SE attention module into the pix2pix generator in the fusion generation model to obtain the enhanced visible light image (an illustrative sketch of this fusion generation is provided after the claims).
3. The method according to claim 1, wherein the fusion generation model is trained under a constraint of consistency between the sample enhanced visible light image and the first sample visible light image; the sample enhanced visible light image is generated by the fusion generation model during training by fusing a simulated visible light image and the first sample infrared image; and the simulated visible light image is obtained by randomly occluding and darkening the first sample visible light image (see the augmentation sketch after the claims).
4. The method according to claim 1, wherein inputting the infrared image and the enhanced visible light image into the vehicle detection model to obtain the position and model of the target vehicle output by the vehicle detection model specifically comprises:
inputting the infrared image and the enhanced visible light image into the vehicle detection model, wherein the vehicle detection model first extracts infrared image features and visible light image features with two separate branches and extracts multi-scale features from each; an SE attention mechanism then computes attention weights between the multi-scale infrared and visible light features so as to generate an infrared enhanced feature and a visible light enhanced feature respectively; a shuffle operation is then performed on the infrared enhanced feature and the visible light enhanced feature to obtain a mixed feature; and finally vehicle positioning and model classification are performed on the mixed feature to obtain the position and model of the target vehicle (a simplified sketch of this dual-branch fusion follows the claims).
5. The method of claim 4, wherein the loss function of the vehicle detection model comprises a cross-entropy loss between the infrared enhanced features and the visible light enhanced features, a CIoU loss for the vehicle positioning task, and a Focal loss for the vehicle model classification task (a sketch of one possible combination is given after the claims).
6. The method according to claim 1, further comprising, after step S103:
converting the position of the target vehicle into a position in the camera coordinate system based on the intrinsic parameter matrix of the camera corresponding to the infrared image;
converting the position of the target vehicle in the camera coordinate system into a position in the world coordinate system based on the extrinsic parameter matrix of the camera (see the coordinate-conversion sketch after the claims).
7. A vehicle positioning and identifying system based on a multi-mode image fusion technology, characterized by comprising:
the image acquisition module is used for acquiring an infrared image and a corresponding visible light image of the target vehicle in the current environment;
the fusion generation module is used for judging whether the current environment is a dark light environment; if so, inputting the infrared image and the visible light image into a fusion generation model to obtain an enhanced visible light image output by the fusion generation model; otherwise, taking the visible light image as the enhanced visible light image; wherein the fusion generation model is obtained through generative adversarial training together with a discrimination model, based on a first sample infrared image and a first sample visible light image, and the discrimination model is used for discriminating the authenticity of a sample enhanced visible light image generated by the fusion generation model;
the vehicle detection module is used for inputting the infrared image and the enhanced visible light image into a vehicle detection model to obtain the position and model of the target vehicle output by the vehicle detection model; wherein the vehicle detection model is obtained through training based on a second sample infrared image and a second sample visible light image, together with the position label and the model label of the corresponding vehicle.
8. An electronic device, comprising:
At least one memory for storing a computer program;
At least one processor for executing the program stored in the memory, wherein the processor is configured to perform the method according to any one of claims 1-6 when the program stored in the memory is executed.
9. A computer readable storage medium storing a computer program, characterized in that the computer program, when run on a processor, causes the processor to perform the method according to any one of claims 1-6.
10. A computer program product, characterized in that the computer program product, when run on a processor, causes the processor to perform the method according to any of claims 1-6.
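By way of illustration only and not as part of the claims, the fusion generation described in claim 2 can be sketched in PyTorch roughly as follows; the module names, channel widths, and the simplified generator head are assumptions, and a full pix2pix encoder-decoder would replace the stand-in generator:

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    # Squeeze-and-Excitation channel attention.
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w

class FusionGenerator(nn.Module):
    # Convolve the infrared and visible images separately, concatenate the
    # resulting features on the channel dimension, reweight them with SE
    # attention, and decode to an enhanced visible light image.
    def __init__(self, base=32):
        super().__init__()
        self.ir_conv = nn.Sequential(nn.Conv2d(1, base, 3, padding=1), nn.ReLU(inplace=True))
        self.vis_conv = nn.Sequential(nn.Conv2d(3, base, 3, padding=1), nn.ReLU(inplace=True))
        self.se = SEBlock(2 * base)
        self.generator = nn.Sequential(          # stand-in for a pix2pix generator
            nn.Conv2d(2 * base, base, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(base, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, ir, vis):
        feats = torch.cat([self.ir_conv(ir), self.vis_conv(vis)], dim=1)
        return self.generator(self.se(feats))

if __name__ == "__main__":
    ir = torch.randn(1, 1, 256, 256)    # single-channel infrared image
    vis = torch.randn(1, 3, 256, 256)   # RGB visible light image
    print(FusionGenerator()(ir, vis).shape)   # torch.Size([1, 3, 256, 256])

During adversarial training, a discriminator judges the enhanced output against real visible light images, as in a standard pix2pix setup.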
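The simulated visible light image of claim 3 can be produced by randomly occluding and darkening a normally lit visible light image; a minimal NumPy sketch, with all parameter ranges being assumptions, is:

import random
import numpy as np

def simulate_dark_visible(image, max_boxes=3, darken_range=(0.2, 0.6)):
    # image: H x W x 3 float array in [0, 1]; returns a globally darkened copy
    # with a few random rectangular regions blacked out.
    sim = image.copy() * random.uniform(*darken_range)   # global darkening
    h, w = sim.shape[:2]
    for _ in range(random.randint(1, max_boxes)):
        bh = random.randint(h // 10, h // 4)
        bw = random.randint(w // 10, w // 4)
        y = random.randint(0, h - bh)
        x = random.randint(0, w - bw)
        sim[y:y + bh, x:x + bw] = 0.0                     # random occlusion
    return sim

if __name__ == "__main__":
    vis = np.random.rand(256, 256, 3).astype(np.float32)
    dark = simulate_dark_visible(vis)
    print(dark.mean(), vis.mean())    # simulated image is darker on average

The consistency constraint of claim 3 then compares the enhanced image generated from the simulated visible light image and the infrared image against the original, unoccluded visible light image.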
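The dual-branch fusion of claim 4 is sketched below at a single scale; the multi-scale feature pyramid and the positioning/classification heads are omitted, and all layer sizes are assumptions:

import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        return x * self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)

def channel_shuffle(x, groups=2):
    # Interleave channels from the two modalities (the shuffle operation).
    b, c, h, w = x.shape
    return x.view(b, groups, c // groups, h, w).transpose(1, 2).reshape(b, c, h, w)

class DualBranchFusion(nn.Module):
    # Separate branches extract infrared and visible features, SE attention
    # reweights each branch, and channel shuffle mixes the two enhanced features.
    def __init__(self, channels=64):
        super().__init__()
        self.ir_branch = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.vis_branch = nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.ir_se = SEBlock(channels)
        self.vis_se = SEBlock(channels)

    def forward(self, ir, vis):
        ir_enh = self.ir_se(self.ir_branch(ir))       # infrared enhanced feature
        vis_enh = self.vis_se(self.vis_branch(vis))   # visible light enhanced feature
        mixed = channel_shuffle(torch.cat([ir_enh, vis_enh], dim=1), groups=2)
        return mixed    # the mixed feature feeds the positioning and classification heads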
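One possible combination of the three loss terms named in claim 5 is sketched below; the feature-level cross entropy is one plausible reading of the consistency term, and the term weights and the (x1, y1, x2, y2) box format are assumptions:

import math
import torch
import torch.nn.functional as F

def feature_consistency_loss(ir_feat, vis_feat):
    # Cross entropy between the flattened feature distributions of the two modalities.
    p = F.log_softmax(ir_feat.flatten(1), dim=1)
    q = F.softmax(vis_feat.flatten(1), dim=1)
    return -(q * p).sum(dim=1).mean()

def ciou_loss(pred, target, eps=1e-7):
    # CIoU loss for boxes given as (x1, y1, x2, y2).
    ix1, iy1 = torch.max(pred[:, 0], target[:, 0]), torch.max(pred[:, 1], target[:, 1])
    ix2, iy2 = torch.min(pred[:, 2], target[:, 2]), torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(min=0) * (iy2 - iy1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    cpx, cpy = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    ctx, cty = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    ex1, ey1 = torch.min(pred[:, 0], target[:, 0]), torch.min(pred[:, 1], target[:, 1])
    ex2, ey2 = torch.max(pred[:, 2], target[:, 2]), torch.max(pred[:, 3], target[:, 3])
    rho2 = (cpx - ctx) ** 2 + (cpy - cty) ** 2            # squared center distance
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + eps        # squared enclosing-box diagonal
    v = (4 / math.pi ** 2) * (torch.atan((target[:, 2] - target[:, 0]) / (target[:, 3] - target[:, 1] + eps))
                              - torch.atan((pred[:, 2] - pred[:, 0]) / (pred[:, 3] - pred[:, 1] + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return (1 - iou + rho2 / c2 + alpha * v).mean()

def focal_loss(logits, labels, gamma=2.0):
    # Focal loss for the vehicle model classification task.
    ce = F.cross_entropy(logits, labels, reduction="none")
    pt = torch.exp(-ce)
    return ((1 - pt) ** gamma * ce).mean()

def detection_loss(ir_feat, vis_feat, boxes_pred, boxes_gt, logits, labels,
                   w_feat=0.1, w_box=1.0, w_cls=1.0):
    return (w_feat * feature_consistency_loss(ir_feat, vis_feat)
            + w_box * ciou_loss(boxes_pred, boxes_gt)
            + w_cls * focal_loss(logits, labels))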
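The coordinate conversions of claim 6 can be illustrated as follows; the sketch assumes a pinhole model with a known depth for the back-projection from pixel to camera coordinates (which the claim does not specify) and an extrinsic convention in which a world point Xw maps to the camera frame as Xc = R @ Xw + t:

import numpy as np

def pixel_to_world(u, v, depth, K, R, t):
    # Back-project pixel (u, v) at the given depth to camera coordinates using
    # the intrinsic matrix K, then map to world coordinates using the extrinsics.
    pixel = np.array([u, v, 1.0])
    cam = depth * (np.linalg.inv(K) @ pixel)   # position in the camera coordinate system
    world = np.linalg.inv(R) @ (cam - t)       # position in the world coordinate system
    return cam, world

if __name__ == "__main__":
    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])            # intrinsic parameter matrix
    R, t = np.eye(3), np.zeros(3)              # extrinsic parameters (demo values)
    cam, world = pixel_to_world(400, 260, depth=12.0, K=K, R=R, t=t)
    print(cam, world)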
CN202410387616.2A 2024-04-01 2024-04-01 Vehicle positioning and identifying method based on multi-mode image fusion technology Active CN117975383B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410387616.2A CN117975383B (en) 2024-04-01 2024-04-01 Vehicle positioning and identifying method based on multi-mode image fusion technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410387616.2A CN117975383B (en) 2024-04-01 2024-04-01 Vehicle positioning and identifying method based on multi-mode image fusion technology

Publications (2)

Publication Number Publication Date
CN117975383A true CN117975383A (en) 2024-05-03
CN117975383B CN117975383B (en) 2024-06-21

Family

ID=90865069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410387616.2A Active CN117975383B (en) 2024-04-01 2024-04-01 Vehicle positioning and identifying method based on multi-mode image fusion technology

Country Status (1)

Country Link
CN (1) CN117975383B (en)

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020116106A1 (en) * 1995-06-07 2002-08-22 Breed David S. Vehicular monitoring systems using image processing
CN108169765A (en) * 2016-12-07 2018-06-15 法乐第(北京)网络科技有限公司 Improve the method and electronic equipment of automatic Pilot reliability
US20220335715A1 (en) * 2019-08-13 2022-10-20 University Of Hertfordshire Higher Education Corporation Predicting visible/infrared band images using radar reflectance/backscatter images of a terrestrial region
GB201911577D0 (en) * 2019-08-13 2019-09-25 Univ Of Hertfordshire Higher Education Corporation Method and apparatus
CN111327800A (en) * 2020-01-08 2020-06-23 深圳深知未来智能有限公司 All-weather vehicle-mounted vision system and method suitable for complex illumination environment
US20210224993A1 (en) * 2020-01-20 2021-07-22 Beijing Baidu Netcom Science And Technology Co., Ltd. Method for training generative network, method for generating near-infrared image and device
CN111198371A (en) * 2020-03-03 2020-05-26 杭州中车数字科技有限公司 Forward-looking obstacle detection system
US20220094847A1 (en) * 2020-09-21 2022-03-24 Ambarella International Lp Smart ip camera with color night mode
CN113935935A (en) * 2021-10-19 2022-01-14 天翼数字生活科技有限公司 Dark light image enhancement method based on fusion of visible light and near infrared light
CN114332655A (en) * 2021-12-30 2022-04-12 西安建筑科技大学 A vehicle adaptive fusion detection method and system
CN115170430A (en) * 2022-07-21 2022-10-11 西北工业大学 Two-stage conditional generative adversarial network method for near-infrared image colorization
CN115457456A (en) * 2022-08-22 2022-12-09 武汉理工大学 Multispectral pedestrian detection method and system based on intelligent vehicle
CN115641514A (en) * 2022-09-30 2023-01-24 宁波大学 A Pseudo-Visible Light Cloud Image Generation Method for Nighttime Sea Fog Monitoring
CN116309228A (en) * 2023-03-27 2023-06-23 西安交通大学 Visible light image conversion infrared image method based on generative adversarial network
CN116704450A (en) * 2023-05-29 2023-09-05 招商局公路网络科技控股股份有限公司 Vehicle identity recognition method and device based on deep learning
CN117115630A (en) * 2023-08-30 2023-11-24 安徽大学 A multispectral vehicle re-identification method under strong light based on cycle consistency
CN117152093A (en) * 2023-09-04 2023-12-01 山东奇妙智能科技有限公司 Tire defect detection system and method based on data fusion and deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XUDONG KANG et al.: "Global-Local Feature Fusion Network for Visible-Infrared Vehicle Detection", IEEE Geoscience and Remote Sensing Letters, 19 March 2024 (2024-03-19), pages 1-5 *
CAI Binbin: "Deep Learning-Based Ground Target Detection Technology for Unmanned Aerial Vehicles", China Master's Theses Full-text Database, Engineering Science and Technology II, no. 06, 15 June 2022 (2022-06-15), pages 034-366 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119600393A (en) * 2024-09-04 2025-03-11 中国电子科技集团公司第二十八研究所 Infrared and visible light image gating fusion method based on illumination sensing network
CN119600393B (en) * 2024-09-04 2025-12-05 中国电子科技集团公司第二十八研究所 A method for gating and fusing infrared and visible light images based on illumination sensing networks

Also Published As

Publication number Publication date
CN117975383B (en) 2024-06-21

Similar Documents

Publication Publication Date Title
Cui et al. Deep learning for image and point cloud fusion in autonomous driving: A review
Li et al. Traffic light recognition for complex scene with fusion detections
CN113281780B (en) Method and device for marking image data and electronic equipment
Wang et al. V2I-CARLA: A novel dataset and a method for vehicle reidentification-based V2I environment
Mijić et al. Traffic sign detection using YOLOv3
CN115620090A (en) Model training method, low-illumination target re-identification method and device, and terminal equipment
CN114596548B (en) Target detection method, device, computer equipment and computer readable storage medium
CN119625279A (en) Multimodal target detection method, device and multimodal recognition system
CN118840646A (en) Image processing analysis system based on deep learning
Wen et al. YOFIR: High precise infrared object detection algorithm based on YOLO and FasterNet
CN117789144A (en) A cross network lane line detection method and device based on weight fusion
CN117975383B (en) Vehicle positioning and identifying method based on multi-mode image fusion technology
Wang et al. KCDNet: Multimodal object detection in modal information imbalance scenes
Zhu et al. Enhanced detection of small and occluded road vehicle targets using improved YOLOv5
CN110909656B (en) Pedestrian detection method and system integrating radar and camera
Liu et al. Mastering adverse weather: a two-stage approach for robust semantic segmentation in autonomous driving
CN117994625B (en) Feature fusion visibility evaluation method and system based on millimeter wave radar
CN118038409B (en) Vehicle drivable region detection method, device, electronic equipment and storage medium
CN119763056A (en) Target identification method, device, nonvolatile storage medium and computer equipment
Liu et al. A review of image and point cloud fusion-based 3D object detection for autonomous driving
CN119295877A (en) Adaptive perception and positioning method based on unsupervised fusion BEV
CN118629007A (en) Traffic sign recognition method and system
CN117830769A (en) Automatic driving target detection system safety test method based on semantic perception
CN119723579B (en) A monocular vision 3D object labeling method based on multimodal data
CN120656043B (en) Pedestrian clothing recognition method, device, equipment and storage medium based on improved YOLOv8 model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant