
CN104021224A - Image labeling method based on layer-by-layer label fusing deep network


Info

Publication number
CN104021224A
CN104021224A (application CN201410290316.9A)
Authority
CN
China
Prior art keywords
layer
representation
image
deep network
label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410290316.9A
Other languages
Chinese (zh)
Inventor
徐常胜 (Changsheng Xu)
袁召全 (Zhaoquan Yuan)
桑基韬 (Jitao Sang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN201410290316.9A
Publication of CN104021224A
Legal status: Pending (current)


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/19 Recognition using electronic means
    • G06V 30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V 30/194 References adjustable by an adaptive method, e.g. learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image annotation method based on a deep network with layer-by-layer label fusion. The method comprises the following steps: extracting low-level visual features from the training images in a training set; organizing the labels of the training images into a hierarchy to construct a hierarchical label structure; fusing the low-level visual feature information and the label information of the training images layer by layer, and obtaining hierarchical feature representations of the training images through deep-network parameter learning; extracting low-level visual features from the test images in a test set and obtaining their hierarchical feature representations through the learned deep network; and finally, predicting the annotation information of each test image from its hierarchical feature representation. The disclosed method performs hierarchical annotation and is more accurate than conventional annotation methods.

Description

Image annotation method based on a layer-by-layer label fusion deep network

Technical Field

The present invention relates to the technical field of social-network image annotation, and in particular to an image annotation method based on a deep network with layer-by-layer label fusion.

Background

In recent years, with the continuous development of social media, the number of images on social platforms has grown explosively, and how to annotate massive collections of social images has become an important research topic in network multimedia.

Current mainstream image annotation methods concentrate on approaches based on visual information: they first extract low-level features and then use machine learning models to classify images represented by those features. Such methods have achieved good results to a certain extent, but because they use only visual information and ignore the contextual text information, their performance is still not ideal.

The core of image annotation is to use image-related information (including visual content, contextual text labels, and so on) to understand image content. Fusing the label information and the visual information of an image yields more expressive image features, which significantly benefits image annotation, especially for social images. However, the heterogeneity of visual features and textual label information makes fusing the two kinds of information challenging. The image annotation method proposed in the present invention, based on a deep network with layer-by-layer label fusion, fuses the two kinds of information layer by layer, solves the problem of heterogeneous information fusion, and plays an important role in social image annotation.

Summary of the Invention

In order to solve the above problems in the prior art, the present invention proposes an image annotation method based on a deep network with layer-by-layer label fusion.

The image annotation method based on a layer-by-layer label fusion deep network proposed by the present invention comprises the following steps:

Step 1: for the training images in the training set, extract their low-level visual features X;

Step 2: organize the labels of the training images hierarchically, constructing a hierarchical label structure;

Step 3: for the training images, fuse their low-level visual feature information and label information layer by layer, and obtain hierarchical feature representations of the training images through deep-network parameter learning;

Step 4: for the test images in the test set, extract their low-level visual features, obtain their hierarchical feature representations through the learned deep network, and finally predict their annotation information from those hierarchical feature representations.

Internet image annotation has found wide application in many important related fields. Owing to the semantic gap between low-level visual information and high-level semantics, vision-based image annotation is a challenging problem. The method described above can annotate social images automatically, and its hierarchical annotation is more accurate than traditional annotation methods.

Brief Description of the Drawings

Fig. 1 is a flowchart of the image annotation method based on a layer-by-layer label fusion deep network according to an embodiment of the present invention;

Fig. 2 is an example diagram of a label hierarchy;

Fig. 3 is a model structure diagram of the layer-by-layer feature fusion deep network according to an embodiment of the present invention.

Detailed Description

In order to make the object, technical solution and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings.

The data sets involved in the proposed method include: 1) a training set, which contains images and the social labels associated with those images; and 2) a test set, which contains only the test images to be annotated, without any label information.

Considering the heterogeneity of an image's low-level visual information and its social label information, the present invention proposes an image annotation method based on a deep network with layer-by-layer label fusion. The core idea of the method is to fuse label information and visual information layer by layer within the framework of a deep network, thereby learning hierarchical image features that provide the feature representation for image annotation.

Fig. 1 shows the flowchart of the proposed method. As shown in Fig. 1, the method comprises:

Step 1: for the training images in the training set, extract their low-level visual features;

Step 2: organize the labels of the training images hierarchically, constructing a hierarchical label structure;

Step 3: for the training images, fuse their low-level visual feature information and label information layer by layer, and obtain hierarchical feature representations of the training images through deep-network parameter learning;

Step 4: for the test images in the test set, extract their low-level visual features, obtain their hierarchical feature representations through the learned deep network, and finally predict their annotation information from those hierarchical feature representations.

The execution of each of the four steps is described in detail below.

In step 1, low-level visual feature extraction yields the initial representation of each image. For the image information, the present invention preferably uses scale-invariant feature transform (SIFT) features (for example, 1000-dimensional) as the low-level visual features of an image; these features are denoted by X.
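For illustration only (the patent does not specify an extraction pipeline), a 1000-dimensional SIFT representation of this kind is commonly built with a bag-of-visual-words approach: detect SIFT descriptors, cluster them into a 1000-word codebook, and describe each image by its word histogram. The sketch below is one plausible realization using OpenCV and scikit-learn; the function names and codebook size are assumptions.

```python
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

def sift_descriptors(image_paths):
    """Collect raw 128-dim SIFT descriptors from each image."""
    sift = cv2.SIFT_create()
    per_image = []
    for path in image_paths:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, desc = sift.detectAndCompute(gray, None)
        per_image.append(desc if desc is not None else np.zeros((0, 128)))
    return per_image

def build_bow_features(image_paths, n_words=1000):
    """Quantize descriptors into a 1000-word codebook and represent
    each image as a normalized visual-word histogram (the feature X)."""
    per_image = sift_descriptors(image_paths)
    codebook = MiniBatchKMeans(n_clusters=n_words, random_state=0)
    codebook.fit(np.vstack(per_image).astype(np.float32))
    X = np.zeros((len(per_image), n_words))
    for i, desc in enumerate(per_image):
        if len(desc):
            words = codebook.predict(desc.astype(np.float32))
            hist = np.bincount(words, minlength=n_words).astype(float)
            X[i] = hist / hist.sum()
    return X
```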

In step 2, using available tools (the present invention preferably uses WordNet), a label hierarchy with K layers is constructed for the social labels of the images. For example, if an image carries the labels animal, plant, cat, dog and flower, the corresponding label hierarchy is as shown in Fig. 2 (here the number of layers is 2).
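For illustration, a two-layer hierarchy like the one in Fig. 2 can be derived from WordNet hypernym relations. The following is a minimal sketch assuming NLTK's WordNet interface (after `nltk.download('wordnet')`); the choice of candidate parents and the helper name are illustrative assumptions, not prescribed by the patent.

```python
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

def parent_label(tag, parents=("animal", "plant")):
    """Map a leaf tag (e.g. 'cat') to a top-level parent by walking
    up the WordNet hypernym path of its most common noun sense."""
    synsets = wn.synsets(tag, pos=wn.NOUN)
    if not synsets:
        return None
    path = synsets[0].hypernym_paths()[0]
    ancestors = {s.name().split(".")[0] for s in path}
    for p in parents:
        if p in ancestors:
            return p
    return None

tags = ["cat", "dog", "flower"]
hierarchy = {t: parent_label(t) for t in tags}
# expected: {'cat': 'animal', 'dog': 'animal', 'flower': 'plant'}
```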

Step 3 fuses, for the training images, their low-level visual feature information and label information layer by layer, and obtains the hierarchical features of the training images through deep-network parameter learning.

In step 3, a deep network with L layers (L > K) is constructed, and the K layers of the label hierarchy are aligned with the top layers of the deep network. Let the variables of the layers of the deep network be denoted h = {h^{(0)}, ..., h^{(L)}}, where h^{(0)} is the low-level visual feature X of the image; the variables of the layers corresponding to the K-layer label hierarchy are denoted y = {y^{(L-K+1)}, ..., y^{(L)}}.
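To fix the notation, the sketch below lays out one illustrative configuration of layer sizes, with the top K layers of the network paired with the K layers of the label hierarchy; all dimensions are assumptions made for the example, not values from the patent.

```python
import numpy as np

# Illustrative configuration: L = 4 network layers above h^(0), K = 2 label layers.
L, K = 4, 2
layer_dims = [1000, 800, 500, 300, 200]   # dimensions of h^(0) ... h^(L)
label_dims = {3: 10, 4: 25}               # labels per hierarchy layer, y^(L-K+1) ... y^(L)

rng = np.random.default_rng(0)
params = {  # weight and bias of Eq. (1) for each layer transition
    l: {"W": rng.normal(scale=0.01, size=(layer_dims[l], layer_dims[l - 1])),
        "b": np.zeros(layer_dims[l])}
    for l in range(1, L + 1)
}
```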

This step is the key part of the present invention. Fig. 3 is a model structure diagram of the layer-by-layer feature fusion deep network according to an embodiment of the present invention. Referring to Fig. 3, step 3 can be divided into the following sub-steps:

Step 3.1: by constructing auto-encoders, preliminarily adjust the parameters of the deep network from layer h^{(0)} to layer h^{(L-K+1)} based on the reconstruction error.

Step 3.1 further comprises the following steps:

Step 3.1.1: from layer h^{(0)} up to layer h^{(L-K+1)}, build an auto-encoder between every pair of adjacent layers; through this auto-encoder, the representation of the upper layer is obtained as a mapping of the representation of the lower layer.

For example, based on the auto-encoder between layers h^{(l-1)} and h^{(l)}, the representation of layer h^{(l)} is obtained by mapping the representation of layer h^{(l-1)}:

h^{(l)} = s(W_h^{(l-1)} h^{(l-1)} + b^{(l)})    (1)

where W_h^{(l-1)} denotes the weight parameters between layers h^{(l-1)} and h^{(l)}, b^{(l)} denotes the bias parameters of layer h^{(l)}, and s(·) denotes the logistic function s(x) = 1/(1 + e^{-x}).

In this way, the representation of layer h^{(l)} is obtained from the representation of layer h^{(l-1)} through the mapping.

Step 3.1.2: map the representation of the upper layer back to obtain a reconstructed representation of the lower layer.

For example, mapping the representation of h^{(l)} back yields the reconstructed representation z of h^{(l-1)}:

z = s(W_h'^{(l-1)} h^{(l)} + b')    (2)

where W_h'^{(l-1)} is the transpose of W_h^{(l-1)}, and b' denotes the bias parameters of layer h^{(l-1)}.

Step 3.1.3: adjust the parameters of the deep network according to the error between the correct representation and the reconstructed representation.

For example, the preliminary adjustment of the deep-network parameters is achieved by minimizing the reconstruction error between z and the representation of layer h^{(l-1)}. In an embodiment of the present invention, the reconstruction cross-entropy is preferably minimized:

L_rec = -Σ_{k=1}^{D^{(l-1)}} [ h_k^{(l-1)} ln z_k + (1 - h_k^{(l-1)}) ln(1 - z_k) ]    (3)

where k indexes the components of z and D^{(l-1)} denotes the dimensionality of z.

Proceeding in this way, the parameters are adjusted layer by layer up to layer h^{(L-K+1)}.
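To make equations (1)-(3) concrete, here is a minimal numpy sketch of one gradient step for a single tied-weight auto-encoder (sigmoid units, cross-entropy reconstruction loss). The learning rate, shapes and function name are assumptions; a real implementation would iterate over mini-batches and repeat the procedure for each pair of adjacent layers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def autoencoder_step(h, W, b, b_prime, lr=0.1):
    """One update of the auto-encoder of Eqs. (1)-(3) for an input h in [0, 1]."""
    h_up = sigmoid(W @ h + b)              # Eq. (1): encode lower layer into upper layer
    z = sigmoid(W.T @ h_up + b_prime)      # Eq. (2): decode back (tied weights W')
    loss = -np.sum(h * np.log(z) + (1 - h) * np.log(1 - z))  # Eq. (3)
    d_out = z - h                            # delta at decoder pre-activation (sigmoid + cross-entropy)
    d_hid = (W @ d_out) * h_up * (1 - h_up)  # delta at encoder pre-activation
    W -= lr * (np.outer(d_hid, h) + np.outer(h_up, d_out))  # both paths share W
    b -= lr * d_hid
    b_prime -= lr * d_out
    return loss

# Toy usage: pretrain one 1000 -> 500 transition on a random feature vector.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(500, 1000))
b, b_prime = np.zeros(500), np.zeros(1000)
x = rng.random(1000)
for _ in range(10):
    loss = autoencoder_step(x, W, b, b_prime)
```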

Step 3.2: for layers h^{(L-K+1)} up to the top layer h^{(L)} of the deep network, combine a layer of the deep network, say h^{(l)}, with the corresponding layer of the label hierarchy, say y^{(l)}, to perform feature fusion and adjust the corresponding parameters of the deep network.

This step can in turn be divided into two sub-steps (taking h^{(l)} as an example):

Step 3.2.1: use the labels of layer y^{(l)} of the label hierarchy to adjust the parameters of the deep network from layer h^{(0)} to layer h^{(l)}.

In this step, the cross-entropy loss is computed first:

Loss({W, b}) = -Σ_{n=1}^{N} Σ_{k=1}^{K} t_{nk} ln y_{nk}    (4)

where N denotes the number of samples, K here denotes the number of labels in this layer, y_{nk} denotes the model's prediction for the k-th dimension of the n-th sample, and t_{nk} denotes the true value of the k-th dimension of the n-th training sample.

This loss is then propagated back to adjust the parameters of the deep network from layer h^{(0)} to layer h^{(l)}; in an embodiment of the present invention, the well-known back-propagation algorithm is used for this global parameter adjustment.
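As an illustration of equation (4), the sketch below computes the label-layer cross-entropy loss and the gradient that back-propagation would start from at layer h^{(l)}, assuming softmax outputs of the form of equation (6). The array shapes and names are assumptions made for the sketch.

```python
import numpy as np

def label_layer_loss(H, T, W_lab):
    """Eq. (4) for one label layer.
    H: (N, D) hidden representations h^(l);  T: (N, K) binary targets t_nk;
    W_lab: (D, K) label weights.  Returns the loss and dLoss/dH."""
    scores = H @ W_lab
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    Y = np.exp(scores)
    Y /= Y.sum(axis=1, keepdims=True)             # softmax predictions y_nk, as in Eq. (6)
    loss = -np.sum(T * np.log(Y + 1e-12))         # Eq. (4)
    dS = Y * T.sum(axis=1, keepdims=True) - T     # dLoss/dscores (valid for multi-hot T)
    dH = dS @ W_lab.T                             # gradient entering h^(l) for back-propagation
    return loss, dH
```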

Step 3.2.2: merge the representations of layers h^{(l)} and y^{(l)} and learn from them the feature representation of layer h^{(l+1)}.

In this step, the representations of layers h^{(l)} and y^{(l)} are merged and, together with the representation of layer h^{(l+1)}, constitute an auto-encoder:

h^{(l+1)} = s(W_h^{(l)} h^{(l)} + W_y^{(l)} y^{(l)} + b^{(l+1)})    (5)

Likewise, the parameters among h^{(l)}, y^{(l)} and h^{(l+1)} are optimized by minimizing the reconstruction cross-entropy.

This proceeds in the same manner up to layer h^{(L)}.
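A minimal sketch of the fusion step of equation (5) follows; decoding and the cross-entropy update mirror the auto-encoder step shown earlier, so only the fused encoding is written out. The dimensions are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_layer(h_l, y_l, W_h, W_y, b_next):
    """Eq. (5): fuse the visual representation h^(l) with the
    label-layer representation y^(l) into h^(l+1)."""
    return sigmoid(W_h @ h_l + W_y @ y_l + b_next)

# Toy usage: a 500-dim hidden layer fused with a 20-label layer into 300 units.
rng = np.random.default_rng(0)
h_next = fuse_layer(rng.random(500),
                    rng.integers(0, 2, 20).astype(float),
                    rng.normal(scale=0.01, size=(300, 500)),
                    rng.normal(scale=0.01, size=(300, 20)),
                    np.zeros(300))
```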

Through this layer-by-layer feature fusion, the label information of the image is fused into the visual information, and the parameters of the deep network are optimized at the same time.

In step 4, the deep network whose parameters have been optimized is used to annotate the test images in the test set.

Step 4 is further divided into the following sub-steps:

Step 4.1: extract the low-level visual features X_test of each test image; this step is analogous to the extraction of low-level visual features from the training images in step 1.

Step 4.2: using the deep network with optimized parameters, obtain the hierarchical feature representation {h^{(L-K+1)}, ..., h^{(L)}} of the test image's low-level visual features X_test.

Step 4.3: use the hierarchical feature representation {h^{(L-K+1)}, ..., h^{(L)}} to predict the label information of the test image:

y_i^{(l)} = exp(W_i^T h^{(l)}) / Σ_j exp(W_j^T h^{(l)})    (6)

where W_i denotes the weights between label i and the feature h^{(l)}.
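For illustration, equation (6) is a softmax over the labels of one hierarchy layer; a minimal sketch of the resulting prediction step is given below, where the number of returned labels and the helper name are assumptions.

```python
import numpy as np

def predict_labels(h_l, W_lab, label_names, top=2):
    """Eq. (6): softmax over one label layer, returning the top-scoring labels.
    h_l: (D,) feature h^(l);  W_lab: (D, K) weights W_i;  label_names: K names."""
    scores = W_lab.T @ h_l              # W_i^T h^(l) for every label i
    scores -= scores.max()              # numerical stability
    y = np.exp(scores) / np.exp(scores).sum()
    best = np.argsort(y)[::-1][:top]
    return [(label_names[i], float(y[i])) for i in best]
```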

The specific embodiments described above further illustrate the object, technical solutions and beneficial effects of the present invention. It should be understood that the above are only specific embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.

Claims (10)

1. An image annotation method based on a layer-by-layer label fusion deep network, characterized by comprising the following steps:
step 1, extracting low-level visual features X of the training images in a training set;
step 2, organizing the labels of the training images hierarchically to construct a hierarchical label structure;
step 3, fusing the low-level visual feature information and label information of the training images layer by layer, and obtaining hierarchical feature representations of the training images through deep-network parameter learning;
step 4, extracting low-level visual features of the test images in a test set, then obtaining hierarchical feature representations of the test images through the deep network, and finally predicting the annotation information of the test images according to their hierarchical feature representations.
2. The method of claim 1, wherein the low-level visual features of a training image are its scale-invariant feature transform features.
3. The method of claim 1, wherein the deep network has L layers and the label hierarchy has K layers, with L > K; the variables of the layers of the deep network are denoted h = {h^{(0)}, ..., h^{(L)}}, where h^{(0)} denotes the low-level visual feature X of an image, and the variables of the layers corresponding to the label hierarchy are denoted y = {y^{(L-K+1)}, ..., y^{(L)}}.
4. The method according to claim 3, wherein step 3 comprises the steps of:
step 3.1: by constructing auto-encoders, preliminarily adjusting the parameters of the deep network from layer h^{(0)} to layer h^{(L-K+1)} based on the reconstruction error;
step 3.2: for layers h^{(L-K+1)} up to the top layer h^{(L)} of the deep network, combining a layer of the deep network, such as h^{(l)}, with the corresponding layer of the label hierarchy, such as y^{(l)}, to perform feature fusion and adjust the corresponding parameters of the deep network.
5. The method according to claim 4, wherein step 3.1 further comprises the steps of:
step 3.1.1: from layer h^{(0)} up to layer h^{(L-K+1)}, constructing an auto-encoder between every two adjacent layers, by which the representation of the upper layer is derivable as a mapping of the representation of the lower layer;
step 3.1.2: mapping the upper-layer representation back to obtain a reconstructed representation of the lower layer;
step 3.1.3: adjusting the parameters of the deep network according to the error between the correct representation and the reconstructed representation, up to layer h^{(L-K+1)}.
6. The method according to claim 5, wherein in step 3.1.3 the parameters of the deep network are adjusted by minimizing the reconstruction cross-entropy.
7. The method according to claim 4, wherein step 3.2 further comprises the steps of:
step 3.2.1: using the labels of a layer y^{(l)} of the label hierarchy to adjust the parameters of the deep network from layer h^{(0)} to layer h^{(l)};
step 3.2.2: merging the representations of layers h^{(l)} and y^{(l)} to learn the feature representation of layer h^{(l+1)}, and adjusting the corresponding parameters of the deep network, up to layer h^{(L)}.
8. The method according to claim 7, wherein in steps 3.2.1 and 3.2.2 the parameters of the deep network are adjusted using a back-propagation algorithm based on the cross-entropy loss.
9. The method of claim 7, wherein in step 3.2.2 the merged representations of layers h^{(l)} and y^{(l)}, together with the representation of layer h^{(l+1)}, constitute an auto-encoder.
10. The method of claim 1, wherein step 4 further comprises the steps of:
step 4.1: extracting the low-level visual features of a test image;
step 4.2: obtaining the hierarchical feature representation of the low-level visual features of the test image using the deep network;
step 4.3: predicting the label information of the test image using its hierarchical feature representation.
CN201410290316.9A 2014-06-25 2014-06-25 Image labeling method based on layer-by-layer label fusing deep network Pending CN104021224A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410290316.9A CN104021224A (en) 2014-06-25 2014-06-25 Image labeling method based on layer-by-layer label fusing deep network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410290316.9A CN104021224A (en) 2014-06-25 2014-06-25 Image labeling method based on layer-by-layer label fusing deep network

Publications (1)

Publication Number Publication Date
CN104021224A true CN104021224A (en) 2014-09-03

Family

ID=51437978

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410290316.9A Pending CN104021224A (en) 2014-06-25 2014-06-25 Image labeling method based on layer-by-layer label fusing deep network

Country Status (1)

Country Link
CN (1) CN104021224A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572940A (en) * 2014-12-30 2015-04-29 中国人民解放军海军航空工程学院 Automatic image annotation method based on deep learning and canonical correlation analysis
CN105631479A (en) * 2015-12-30 2016-06-01 中国科学院自动化研究所 Imbalance-learning-based depth convolution network image marking method and apparatus
CN106570910A (en) * 2016-11-02 2017-04-19 南阳理工学院 Auto-encoding characteristic and neighbor model based automatic image marking method
CN108595558A (en) * 2018-04-12 2018-09-28 福建工程学院 A kind of image labeling method of data balancing strategy and multiple features fusion
CN108875934A (en) * 2018-05-28 2018-11-23 北京旷视科技有限公司 A kind of training method of neural network, device, system and storage medium
CN109271539A (en) * 2018-08-31 2019-01-25 华中科技大学 A kind of image automatic annotation method and device based on deep learning
WO2020073952A1 (en) * 2018-10-10 2020-04-16 腾讯科技(深圳)有限公司 Method and apparatus for establishing image set for image recognition, network device, and storage medium
CN111583321A (en) * 2019-02-19 2020-08-25 富士通株式会社 Image processing device, method and medium
CN112331314A (en) * 2020-11-25 2021-02-05 中山大学附属第六医院 Image annotation method and device, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120233159A1 (en) * 2011-03-10 2012-09-13 International Business Machines Corporation Hierarchical ranking of facial attributes
CN103544392A (en) * 2013-10-23 2014-01-29 电子科技大学 Deep learning based medical gas identifying method
CN103593474A (en) * 2013-11-28 2014-02-19 中国科学院自动化研究所 Image retrieval ranking method based on deep learning
CN103823845A (en) * 2014-01-28 2014-05-28 浙江大学 Method for automatically annotating remote sensing images on basis of deep learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120233159A1 (en) * 2011-03-10 2012-09-13 International Business Machines Corporation Hierarchical ranking of facial attributes
CN103544392A (en) * 2013-10-23 2014-01-29 电子科技大学 Deep learning based medical gas identifying method
CN103593474A (en) * 2013-11-28 2014-02-19 中国科学院自动化研究所 Image retrieval ranking method based on deep learning
CN103823845A (en) * 2014-01-28 2014-05-28 浙江大学 Method for automatically annotating remote sensing images on basis of deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhaoquan Yuan et al., "Tag-aware image classification via nested deep belief nets," IEEE International Conference on Multimedia and Expo. *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572940B (en) * 2014-12-30 2017-11-21 中国人民解放军海军航空工程学院 A kind of image automatic annotation method based on deep learning and canonical correlation analysis
CN104572940A (en) * 2014-12-30 2015-04-29 中国人民解放军海军航空工程学院 Automatic image annotation method based on deep learning and canonical correlation analysis
CN105631479A (en) * 2015-12-30 2016-06-01 中国科学院自动化研究所 Imbalance-learning-based depth convolution network image marking method and apparatus
CN105631479B (en) * 2015-12-30 2019-05-17 中国科学院自动化研究所 Depth convolutional network image labeling method and device based on non-equilibrium study
CN106570910A (en) * 2016-11-02 2017-04-19 南阳理工学院 Auto-encoding characteristic and neighbor model based automatic image marking method
CN106570910B (en) * 2016-11-02 2019-08-20 南阳理工学院 Image automatic labeling method based on self-encoding features and neighbor model
CN108595558B (en) * 2018-04-12 2022-03-15 福建工程学院 Image annotation method based on data equalization strategy and multi-feature fusion
CN108595558A (en) * 2018-04-12 2018-09-28 福建工程学院 A kind of image labeling method of data balancing strategy and multiple features fusion
CN108875934A (en) * 2018-05-28 2018-11-23 北京旷视科技有限公司 A kind of training method of neural network, device, system and storage medium
CN109271539A (en) * 2018-08-31 2019-01-25 华中科技大学 A kind of image automatic annotation method and device based on deep learning
WO2020073952A1 (en) * 2018-10-10 2020-04-16 腾讯科技(深圳)有限公司 Method and apparatus for establishing image set for image recognition, network device, and storage medium
US11853352B2 (en) 2018-10-10 2023-12-26 Tencent Technology (Shenzhen) Company Limited Method and apparatus for establishing image set for image recognition, network device, and storage medium
CN111583321A (en) * 2019-02-19 2020-08-25 富士通株式会社 Image processing device, method and medium
CN112331314A (en) * 2020-11-25 2021-02-05 中山大学附属第六医院 Image annotation method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN104021224A (en) Image labeling method based on layer-by-layer label fusing deep network
CN116541911B (en) Packaging design system based on artificial intelligence
CN103823845B (en) Method for automatically annotating remote sensing images on basis of deep learning
Zhang et al. Fast and accurate land-cover classification on medium-resolution remote-sensing images using segmentation models
CN105631479B (en) Depth convolutional network image labeling method and device based on non-equilibrium study
CN105045907B (en) A kind of construction method of vision attention tagging user interest tree for Personalized society image recommendation
CN107220506A (en) Breast cancer risk assessment analysis system based on deep convolutional neural network
CN112990222B (en) A Guided Semantic Segmentation Method Based on Image Boundary Knowledge Transfer
CN112837338B (en) Semi-supervised medical image segmentation method based on generation countermeasure network
CN109359297A (en) A method and system for relation extraction
CN110399518A (en) A Visual Question Answering Enhancement Method Based on Graph Convolution
CN116975615A (en) Task prediction method and device based on video multi-mode information
Harrie et al. Machine learning in cartography
CN109597998A (en) A kind of characteristics of image construction method of visual signature and characterizing semantics joint insertion
CN103440651B (en) A kind of multi-tag image labeling result fusion method minimized based on order
CN115292568B (en) A method of extracting people's livelihood news events based on joint model
CN110533074B (en) Automatic image category labeling method and system based on double-depth neural network
CN112036659A (en) Prediction method of social network media information popularity based on combination strategy
CN106056609B (en) Method based on DBNMI model realization remote sensing image automatic markings
CN103530405A (en) Image retrieval method based on layered structure
Lu et al. Exploration and application of graphic design language based on artificial intelligence visual communication
Yi et al. Steel strip defect sample generation method based on fusible feature GAN model under few samples
CN113360659A (en) Cross-domain emotion classification method and system based on semi-supervised learning
CN118152525A (en) Water conservancy knowledge service system based on knowledge graph technology
Xu et al. Remote sensing image segmentation of mariculture cage using ensemble learning strategy

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140903

WD01 Invention patent application deemed withdrawn after publication