
CN104008095A - Object recognition method based on semantic feature extraction and matching - Google Patents


Info

Publication number
CN104008095A
CN104008095A
Authority
CN
China
Prior art keywords
class
training
semantic
points
objects
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201210556032.0A
Other languages
Chinese (zh)
Inventor
艾浩军
艾雄军
艾晓敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan San Ji Internet Of Things Science And Technology Ltd
Original Assignee
Wuhan San Ji Internet Of Things Science And Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan San Ji Internet Of Things Science And Technology Ltd filed Critical Wuhan San Ji Internet Of Things Science And Technology Ltd
Priority to CN201210556032.0A priority Critical patent/CN104008095A/en
Publication of CN104008095A publication Critical patent/CN104008095A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour


Abstract

The invention provides an object recognition method based on semantic feature extraction and matching, and belongs to the field of information retrieval. The method comprises semantic feature extraction and semantic feature matching. The semantic feature extraction includes firstly extracting SIFT (Scale Invariant Feature Transform) feature points of training images of a class of objects, then performing spatial clustering on the SIFT feature points through k-means clustering, deciding a plurality of effective points in every spatial class through a kernel-function-based decision mechanism, and finally training the effective points in every spatial class through a support vector machine classifier; a visual word with semantic features is trained from every spatial class, and finally a visual vocabulary describing the semantic features of the class of objects is extracted. The semantic feature matching includes firstly extracting SIFT feature points of an image of an object to be detected as the semantic description of the object to be detected, then using the support vector machine classifier to match and classify the semantic description of the object to be detected against the visual vocabularies of the classes of objects, and finally counting a histogram of the visual words of the object to be detected to determine its class.

Description

Object identification method based on semantic feature extraction and matching
Technical Field
The invention belongs to the field of information retrieval, and particularly relates to an object identification method based on semantic feature extraction and matching.
Background
The essence of object identification is to build a computing system capable of identifying the object classes of interest in an image; such systems have wide application requirements in real life and considerable application value and research significance. In recent years, with the continuous development of pattern classification techniques and artificial intelligence, object recognition based on semantic feature extraction has attracted wide attention from researchers. The semantic features of a class of objects are obtained by extracting the local features of that class and converting them, according to certain processing criteria, into semantic information that describes the class, forming a semantic feature model of the class and thereby achieving feasible and effective object classification and identification.
In the current field of object recognition, the Bag of Words algorithm is one of the most representative object recognition algorithms. The algorithm considers that an image is composed of several visual words with semantic information. A plurality of local features in the picture are extracted and converted into visual words, a visual word histogram of the picture is generated according to the relation between the visual words and the visual vocabulary, and the visual word histogram expresses the features of the picture, so that the recognition and classification of objects can be effectively realized.
The visual vocabulary in the Bag of Words algorithm is represented by the cluster centers obtained from k-means clustering of local descriptors. The feature points of all pictures of one class of objects are clustered; the number of clusters is the size of the visual vocabulary, and the visual words are the cluster centers. However, using only the single feature point at a cluster center as the description of a class of feature points does not make full use of the local features, nor of the semantic information produced by clustering. A single cluster center loses a large amount of semantic information and is not suitable as an effective visual word.
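The classic Bag-of-Words quantization the passage describes — mapping each local descriptor to its nearest k-means cluster center and counting occurrences — can be sketched as follows. Toy 2-D points stand in for 128-dimensional SIFT descriptors, and all names are illustrative, not from the patent:

```python
import math

def nearest_center(descriptor, centers):
    # Index of the cluster center with the smallest Euclidean distance.
    return min(range(len(centers)),
               key=lambda k: math.dist(descriptor, centers[k]))

def bow_histogram(descriptors, centers):
    # Count how many descriptors quantize to each visual word (cluster center);
    # the resulting histogram is the Bag-of-Words signature of the image.
    hist = [0] * len(centers)
    for d in descriptors:
        hist[nearest_center(d, centers)] += 1
    return hist
```

With centers `[(0, 0), (10, 10)]` and descriptors `[(1, 1), (9, 9), (0, 2)]`, the histogram is `[2, 1]`: two descriptors fall near the first word, one near the second.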
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a semantic feature extraction method based on kernel function decision.
The invention is realized by the following technical scheme:
a method for extracting and matching semantic features of an object in an object recognition process, the method comprising:
(1) a semantic feature extraction part: firstly, selecting a plurality of pictures of a class of objects as a training library, and extracting SIFT feature points of all the pictures; performing spatial clustering on all SIFT feature points through a k-means clustering algorithm, and then deciding a plurality of effective points in each spatial class by a kernel-function-based decision mechanism; training the effective points in each spatial class with a support vector machine classifier, so that a visual word with semantic features is trained from each spatial class, and finally extracting a visual vocabulary capable of describing the semantic features of the class of objects; selecting training pictures of multiple classes of objects and extracting the visual vocabulary of each class to form the visual vocabulary of the multiple classes of objects.
(2) a semantic feature matching part: firstly, extracting SIFT feature points of a picture of the object to be detected as its semantic description; matching and classifying the semantic description of the object to be detected against the visual vocabularies of the multiple classes of objects with a support vector machine classifier; and finally counting the visual vocabulary histogram of the object to be detected to determine its class.
Wherein the semantic feature extraction part comprises the steps of:
(1) selecting a number of training pictures of a class of objects, and extracting the SIFT feature points of each picture;
(2) clustering all feature points of the class of objects into a number of spatial classes, and extracting the cluster center of each spatial class;
(3) applying the kernel-function-based decision mechanism to every feature point in each spatial class to obtain a decision value for each feature point;
(4) setting class labels for the support vector machine training data: selecting a number of effective feature points within a single spatial class as the training points of that class, and selecting effective points from all other spatial classes as the training points of the opposing class;
(5) learning and training an SVM classifier on the training data labeled in step (4), obtaining one visual word per spatial class; together these visual words constitute the visual vocabulary of the class of objects;
(6) selecting training pictures of multiple classes of objects, and extracting the visual vocabulary of each class to form the visual vocabulary of the multiple classes of objects.
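Steps (2)–(4) above can be sketched as follows. The patent's kernel formula appears only as an image in the original, so a Gaussian kernel of the Euclidean distance is assumed here; the names `decision_value`, `effective_points`, `sigma`, and `m` are illustrative, not from the patent:

```python
import math

def decision_value(point, center, sigma=1.0):
    # Gaussian kernel of the Euclidean distance to the cluster center.
    # (Assumed form -- the patent's exact kernel formula is not reproduced.)
    return math.exp(-math.dist(point, center) ** 2 / (2 * sigma ** 2))

def effective_points(points, center, m, sigma=1.0):
    # Keep the m points with the highest decision values: these are the
    # "effective points" that will train one visual word for this spatial class.
    ranked = sorted(points,
                    key=lambda p: decision_value(p, center, sigma),
                    reverse=True)
    return ranked[:m]
```

Points near the cluster center receive decision values close to 1 and are retained; outliers in the same cluster receive values near 0 and are discarded, which is the pruning effect the decision mechanism is meant to achieve.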
The decision value in step (3) is computed with a kernel function of the Euclidean distance between the point to be measured and its cluster center.
The class labels in step (4) are set as follows: when training the visual word of a given spatial class, a training sample is labeled as belonging to that class if it lies in the spatial class and its decision value falls within the class decision-value range; otherwise it is labeled as the opposing class.
The semantic feature matching section includes the steps of:
(1) applying the SIFT transform to the picture of the object to be identified, detecting local feature points, and extracting their SIFT descriptors;
(2) matching the local descriptors against the visual vocabularies of the multiple object classes, and recording, for each descriptor, the object class of its corresponding visual word;
(3) counting the number of test descriptors matched to the visual vocabulary of each object class, forming the visual word histogram of the unknown object, and determining the class of the object to be tested.
The matching in step (2) is performed with the support vector machine classifier trained during semantic feature extraction.
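The matching stage above can be sketched as a voting scheme. For brevity, a nearest-visual-word rule stands in for the patent's support vector machine classifier; all names and the toy 2-D data are illustrative:

```python
import math

def classify_object(test_descriptors, vocab_by_class):
    # Vote each test descriptor for the object class that owns its nearest
    # visual word, then return the class with the most votes.
    # (A nearest-word rule stands in for the SVM classifier in the patent.)
    votes = {cls: 0 for cls in vocab_by_class}
    for d in test_descriptors:
        cls, _ = min(((c, w) for c, words in vocab_by_class.items()
                      for w in words),
                     key=lambda cw: math.dist(d, cw[1]))
        votes[cls] += 1
    return max(votes, key=votes.get), votes
```

With vocabularies `{"cat": [(0, 0)], "dog": [(10, 10)]}` and test descriptors `[(1, 0), (0, 1), (9, 10)]`, two descriptors vote "cat" and one votes "dog", so the object is labeled "cat" — exactly the histogram-based decision of step (3).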
Compared with the prior art, the invention has the following beneficial effect: in the semantic feature extraction stage, instead of taking a single cluster center as a visual word, which loses semantic information, several effective points in each spatial class are selected as the visual word through the kernel-function decision mechanism, so that semantic feature information that is stable within the class and rich in content is extracted.
Drawings
The invention is described in further detail below with reference to the accompanying drawings:
FIG. 1 is a block diagram of the semantic feature extraction step of the present invention.
FIG. 2 is a block diagram of the semantic feature matching step of the present invention.
Detailed Description
1. A semantic feature extraction part: as shown in fig. 1, firstly, a plurality of pictures of a class of objects are selected as a training library, and SIFT feature points of all the pictures are extracted; spatial clustering is performed on all SIFT feature points through a k-means clustering algorithm, and then a plurality of effective points in each spatial class are decided by a kernel-function-based decision mechanism; the effective points in each spatial class are trained with a support vector machine classifier, so that a visual word with semantic features is trained from each spatial class, and finally a visual vocabulary capable of describing the semantic features of the class of objects is extracted; training pictures of multiple classes of objects are selected and the visual vocabulary of each class is extracted to form the visual vocabulary of the multiple classes of objects.
2. A semantic feature matching part: as shown in fig. 2, firstly, SIFT feature points of a picture of the object to be detected are extracted as its semantic description; the semantic description of the object to be detected is matched and classified against the visual vocabularies of the multiple classes of objects with the support vector machine classifier; and finally the visual vocabulary histogram of the object to be detected is counted to determine its class.
The semantic feature extraction part comprises the following steps:
(1) selecting a number of training pictures of a class of objects, and extracting the SIFT feature points of each picture;
(2) clustering all feature points of the class of objects into a number of spatial classes, and extracting the cluster center of each spatial class;
(3) applying the kernel-function-based decision mechanism to every feature point in each spatial class to obtain a decision value for each feature point;
(4) setting class labels for the support vector machine training data: selecting a number of effective feature points within a single spatial class as the training points of that class, and selecting effective points from all other spatial classes as the training points of the opposing class;
(5) learning and training an SVM classifier on the training data labeled in step (4), obtaining one visual word per spatial class; together these visual words constitute the visual vocabulary of the class of objects;
(6) selecting training pictures of multiple classes of objects, and extracting the visual vocabulary of each class to form the visual vocabulary of the multiple classes of objects.
The decision value in step (3) is computed with a kernel function of the Euclidean distance between the point to be measured and its cluster center.
The class labels in step (4) are set as follows: when training the visual word of a given spatial class, a training sample is labeled as belonging to that class if it lies in the spatial class and its decision value falls within the class decision-value range; otherwise it is labeled as the opposing class.
The semantic feature matching section includes the steps of:
(1) applying the SIFT transform to the picture of the object to be identified, detecting local feature points, and extracting their SIFT descriptors;
(2) matching the local descriptors against the visual vocabularies of the multiple object classes, and recording, for each descriptor, the object class of its corresponding visual word;
(3) counting the number of test descriptors matched to the visual vocabulary of each object class, forming the visual word histogram of the unknown object, and determining the class of the object to be tested.
The matching in step (2) is performed with the support vector machine classifier trained during semantic feature extraction.

Claims (6)

1. An object recognition method based on semantic feature extraction and matching, the method comprising:
(1) a semantic feature extraction part: firstly, selecting a plurality of pictures of a class of objects as a training library, and extracting SIFT feature points of all the pictures; performing spatial clustering on all SIFT feature points through a k-means clustering algorithm, and then deciding a plurality of effective points in each spatial class by a kernel-function-based decision mechanism; training the effective points in each spatial class with a support vector machine classifier, so that a visual word with semantic features is trained from each spatial class, and finally extracting a visual vocabulary capable of describing the semantic features of the class of objects; selecting training pictures of multiple classes of objects and extracting the visual vocabulary of each class to form the visual vocabulary of the multiple classes of objects;
(2) a semantic feature matching part: firstly, extracting SIFT feature points of a picture of the object to be detected as its semantic description; matching and classifying the semantic description of the object to be detected against the visual vocabularies of the multiple classes of objects with a support vector machine classifier; and finally counting the visual vocabulary histogram of the object to be detected to determine its class.
2. The method according to claim 1, wherein the semantic feature extraction section comprises the steps of:
(1) selecting a number of training pictures of a class of objects, and extracting the SIFT feature points of each picture;
(2) clustering all feature points of the class of objects into a number of spatial classes, and extracting the cluster center of each spatial class;
(3) applying the kernel-function-based decision mechanism to every feature point in each spatial class to obtain a decision value for each feature point;
(4) setting class labels for the support vector machine training data: selecting a number of effective feature points within a single spatial class as the training points of that class, and selecting effective points from all other spatial classes as the training points of the opposing class;
(5) learning and training an SVM classifier on the training data labeled in step (4), obtaining one visual word per spatial class; together these visual words constitute the visual vocabulary of the class of objects;
(6) selecting training pictures of multiple classes of objects, and extracting the visual vocabulary of each class to form the visual vocabulary of the multiple classes of objects.
3. The method according to claim 1, wherein the semantic feature matching component comprises the steps of:
(1) applying the SIFT transform to the picture of the object to be identified, detecting local feature points, and extracting their SIFT descriptors;
(2) matching the local descriptors against the visual vocabularies of the multiple object classes, and recording, for each descriptor, the object class of its corresponding visual word;
(3) counting the number of test descriptors matched to the visual vocabulary of each object class, forming the visual word histogram of the unknown object, and determining the class of the object to be tested.
4. The method of claim 2, wherein the decision value in step (3) is computed with a kernel function of the Euclidean distance between the point to be measured and its cluster center.
5. The method of claim 2, wherein the class labels in step (4) are set as follows: when training the visual word of a given spatial class, a training sample is labeled as belonging to that class if it lies in the spatial class and its decision value falls within the class decision-value range; otherwise it is labeled as the opposing class.
6. The method of claim 3, wherein the matching in step (2) is performed with the support vector machine classifier trained during semantic feature extraction.
CN201210556032.0A 2013-02-25 2013-02-25 Object recognition method based on semantic feature extraction and matching Pending CN104008095A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210556032.0A CN104008095A (en) 2013-02-25 2013-02-25 Object recognition method based on semantic feature extraction and matching


Publications (1)

Publication Number Publication Date
CN104008095A true CN104008095A (en) 2014-08-27

Family

ID=51368754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210556032.0A Pending CN104008095A (en) 2013-02-25 2013-02-25 Object recognition method based on semantic feature extraction and matching

Country Status (1)

Country Link
CN (1) CN104008095A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101944183A (en) * 2010-09-02 2011-01-12 北京航空航天大学 Method for identifying object by utilizing SIFT tree
US20110123120A1 (en) * 2008-06-03 2011-05-26 Eth Zurich Method and system for generating a pictorial reference database using geographical information
CN102629328A (en) * 2012-03-12 2012-08-08 北京工业大学 Probabilistic latent semantic model object image recognition method with fusion of significant characteristic of color


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG Hongxia et al., "Research on visual vocabulary generation methods for object classification and recognition", Journal of Wuhan University of Technology (Information &amp; Management Engineering Edition) *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107624192A (en) * 2015-05-11 2018-01-23 西门子公司 The system and method for pathology in the surgical guidance and art broken up by endoscopic tissue
CN107624192B (en) * 2015-05-11 2020-10-27 西门子公司 System and method for surgical guidance and intraoperative pathology by endoscopic tissue differentiation
US11380084B2 (en) 2015-05-11 2022-07-05 Siemens Aktiengesellschaft System and method for surgical guidance and intra-operative pathology through endo-microscopic tissue differentiation
CN105139390A (en) * 2015-08-14 2015-12-09 四川大学 Image processing method for detecting pulmonary tuberculosis focus in chest X-ray DR film
CN105389593A (en) * 2015-11-16 2016-03-09 上海交通大学 Image object recognition method based on SURF
CN105389593B (en) * 2015-11-16 2019-01-11 上海交通大学 Image object recognition methods based on SURF feature
CN112612914A (en) * 2020-12-29 2021-04-06 浙江金实乐环境工程有限公司 Image garbage recognition method based on deep learning

Similar Documents

Publication Publication Date Title
Zhao et al. Learning mid-level filters for person re-identification
Wu et al. Harvesting discriminative meta objects with deep CNN features for scene classification
Zheng et al. Person re-identification meets image search
CN106156777B (en) Text picture detection method and device
CN103425996B (en) A kind of large-scale image recognition methods of parallel distributed
CN102622607A (en) Remote sensing image classification method based on multi-feature fusion
CN103440508B (en) The Remote Sensing Target recognition methods of view-based access control model word bag model
Zhang et al. Automatic discrimination of text and non-text natural images
CN104915673A (en) Object classification method and system based on bag of visual word model
JP4553300B2 (en) Content identification device
CN106845375A (en) A kind of action identification method based on hierarchical feature learning
CN103279738A (en) Automatic identification method and system for vehicle logo
CN108073940B (en) Method for detecting 3D target example object in unstructured environment
CN102609718B (en) Method for generating vision dictionary set by combining different clustering algorithms
CN104008095A (en) Object recognition method based on semantic feature extraction and matching
CN104732209B (en) A kind of recognition methods of indoor scene and device
Najibi et al. Towards the success rate of one: Real-time unconstrained salient object detection
JP5959446B2 (en) Retrieval device, program, and method for high-speed retrieval by expressing contents as a set of binary feature vectors
Liu et al. Video retrieval based on object discovery
Gupta et al. The semantic multinomial representation of images obtained using dynamic kernel based pseudo-concept SVMs
Golge et al. FAME: face association through model evolution
Zhi-Jie Image classification method based on visual saliency and bag of words model
Li et al. Image categorization based on visual saliency and Bag-of-Words model
Dong et al. Superpixel appearance and motion descriptors for action recognition
Mansur et al. Improving recognition through object sub-categorization

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140827

WD01 Invention patent application deemed withdrawn after publication