Embodiment
Fig. 1 is the flow diagram of an existing automatic trademark image retrieval system. As the figure shows, the existing system consists of two basic processes. The training process has four steps: first, a trademark image is acquired; second, noise in the trademark image is filtered out; third, a set of discriminative features is extracted from the filtered image, typically invariant moments, edge direction, Fourier descriptors, contour description strings and the like; fourth, the extracted features are stored in a feature dictionary. The query process has five steps: the first three are the same as in training and extract features from the query image; the fourth matches the extracted features against the feature dictionary, computes similarities, and returns the group of most similar images as the query result; the last step uses relevance feedback to refine the retrieval result. This process reveals the defects of the existing system. First, features are extracted directly from the denoised image, so certain details cannot be emphasized; this is contrary to the weighting practice of manual coding and lowers the precision of the system. Second, the similarity between images is computed directly on the original features, whose dimensionality is usually high, so the amount of computation is large, and when several kinds of features are used at once they cannot be fused well. In addition, the existing matching process is slow on large sample sets.
Fig. 2 is a schematic flow chart of the present invention. As the figure shows, the invention adds two stages, preprocessing and dimensionality reduction, to the existing trademark retrieval system and uses a fast search strategy during matching, in order to solve the problems above. The whole process has seven steps: image acquisition, denoising, preprocessing, feature extraction, dimensionality reduction, fast matching, and relevance feedback. Each step is described in detail below:
Once the denoised trademark image has been obtained, the third step, preprocessing, begins. In the present invention the preprocessing module mainly eliminates interfering information that would affect feature extraction. Fig. 3 is the flow chart of preprocessing. As the figure shows, preprocessing proceeds as follows: first, the key graphic is extracted from the original image to eliminate interfering information; next, the key part is size-normalized to remove the influence of translation and scaling on feature extraction; then an edge detection algorithm highlights the edge information of the image to facilitate feature extraction; finally, the computer simulates manual coding by locating certain specific basic shapes in the image, which reduces the complexity of subsequent work.
Key graphic extraction is both the focus and the difficulty of the preprocessing stage. A trademark image is a synthetic image and differs greatly from a natural image. It is composed of sketches of various objects, abstract figures and text; although the ways of combining them vary widely, there are clear boundaries between the abstract sub-images, which makes key graphic extraction possible. From observing a large number of trademark images and from an understanding of trademark coding, the following rules can be summarized. Most trademark images consist of text together with some meaningful figures. When the trademark office codes trademark images, Chinese, foreign-language and letter trademarks in ordinary print or artistic calligraphy are searched as text only and are not coded by graphical element, except where they clearly constitute a non-text figure in themselves. In a trademark composed of both a figure and text, the figure is the main body: when two trademarks are similar in their figures, they are considered similar no matter how much their text differs in layout or in content. Trademark images consisting purely of text are not coded but are retrieved as text. These rules suggest another approach to the problem: if the secondary text information in a trademark image is eliminated and the remaining graphical part is kept, the goal of graphic extraction is likewise achieved.
To extract the key graphic from a trademark image, the text must first be separated from the important figures. Because the boundaries between the figures in a trademark image are fairly clear, the image can be segmented with a connected-component analysis algorithm. The present invention adopts the line adjacency graph algorithm. Its basic idea is as follows: the image is scanned line by line to obtain horizontal line segments; each segment is compared with the segments of the previous row; if it is connected to one of them it is merged into that connected component, otherwise it starts a new connected component; this continues to the end of the image. The connected-component analysis yields the rectangular regions that enclose the individual sub-images.
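The run-merging idea above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the image is assumed to be a binary list of rows of 0/1 values, runs in adjacent rows are treated as connected when their column ranges overlap, and a union-find table merges components. The names `find_runs` and `connected_boxes` are hypothetical.

```python
def find_runs(row):
    """Return (start, end) column ranges, end exclusive, of foreground runs in a row."""
    runs, start = [], None
    for x, value in enumerate(row):
        if value and start is None:
            start = x
        elif not value and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs


def connected_boxes(image):
    """Return sorted bounding boxes (x0, y0, x1, y1), inclusive, one per component."""
    parent = []          # union-find table over provisional labels
    labelled_runs = []   # (y, start, end, label) for every run seen

    def find(label):
        while parent[label] != label:
            parent[label] = parent[parent[label]]
            label = parent[label]
        return label

    previous = []        # runs of the row above, as (start, end, label)
    for y, row in enumerate(image):
        current = []
        for start, end in find_runs(row):
            label = None
            for p_start, p_end, p_label in previous:
                if p_start < end and start < p_end:   # column ranges overlap
                    root = find(p_label)
                    if label is None:
                        label = root                  # join the component above
                    elif root != label:
                        parent[root] = label          # this run bridges two components
            if label is None:                         # nothing above: new component
                label = len(parent)
                parent.append(label)
            current.append((start, end, label))
            labelled_runs.append((y, start, end, label))
        previous = current

    boxes = {}
    for y, start, end, label in labelled_runs:
        root = find(label)
        x0, y0, x1, y1 = boxes.get(root, (start, y, end - 1, y))
        boxes[root] = (min(x0, start), min(y0, y), max(x1, end - 1), max(y1, y))
    return sorted(boxes.values())
```

Each returned box is the enclosing rectangle of one sub-image, which is the input the text removal methods described below would work on.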
The present invention defines three different methods for eliminating irrelevant text information from a trademark image so as to highlight the key graphic: a method based on connected-component projection, a method based on connected-component area, and a structure-based subgraph extraction method. The three methods are complementary and handle different kinds of images. Experimental results show that key graphic extraction in this system is effective: more than 95% of trademark images yield a correct result. The methods are described in turn below:
First method: connected-component projection. The text in a trademark may consist of Chinese characters, pinyin and English, with varying fonts, sizes and shapes, but text usually appears in rows: a row contains more than one character, and the characters in a row have similar heights. The graphical part of a trademark, by contrast, usually occupies a row of its own and is separated from the text rows by a gap. This is the most common situation in figurative trademarks, and for it a text removal method based on connected-component projection is proposed. Roughly, the sub-images obtained by connected-component decomposition are projected along the horizontal direction, and the projection result is used to divide the image into layers; text layers and graphic layers differ markedly, so the layer containing the figure is identified and kept. In a typical trademark image the figure lies in a single layer, so the result of this method keeps only one layer. The method does not work well for text that lies inside the figure or overlaps it.
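A minimal Python sketch of the layering step follows. It assumes each sub-image is given as a bounding box `(x0, y0, x1, y1)` with inclusive coordinates, and it groups boxes whose vertical extents overlap into one layer. The text does not state how the graphic layer is told apart from the text layers, so choosing the tallest layer is purely an illustrative assumption; `split_into_layers` and `pick_graphic_layer` are hypothetical names.

```python
def split_into_layers(boxes):
    """Group boxes into horizontal layers by projecting them onto the vertical axis."""
    layers = []
    for box in sorted(boxes, key=lambda b: b[1]):          # process top to bottom
        if layers and box[1] <= layers[-1]["bottom"]:      # overlaps the current layer
            layers[-1]["boxes"].append(box)
            layers[-1]["bottom"] = max(layers[-1]["bottom"], box[3])
        else:                                              # a gap: start a new layer
            layers.append({"top": box[1], "bottom": box[3], "boxes": [box]})
    return layers


def pick_graphic_layer(layers):
    """Assumed criterion: the graphic layer is the tallest one."""
    return max(layers, key=lambda layer: layer["bottom"] - layer["top"])


boxes = [(10, 0, 60, 50), (0, 60, 8, 70), (10, 60, 18, 70), (20, 61, 28, 70)]
layers = split_into_layers(boxes)
graphic = pick_graphic_layer(layers)
```

In this example the large box forms one layer and the three small boxes below it form a second, text-like layer, so only the first layer would be kept.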
Second method: connected-component area. This method can eliminate text inside the figure as well as text whose projection overlaps that of the figure. The figure is the key part of a trademark, and in the overwhelming majority of original trademark images the figure dominates in area. Whether the text appears inside or outside the figure, it occupies a small area compared with the figure, so defining an area threshold (1/4 of the largest sub-image area) makes it possible to remove noise and small-area text. For sub-images with a larger area (greater than 1/8 of the largest sub-image area), some checks are needed before elimination to avoid removing useful sub-images by mistake. These checks mainly use the peripheral features and stroke-crossing features of the sub-image to distinguish text from figures, since text is usually a non-closed shape made up of many strokes, whereas the outline of a figure is smoother.
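The two thresholds can be read as a three-way decision, sketched below in Python. This reading is an assumption: sub-images below 1/8 of the largest area are removed outright, those between 1/8 and 1/4 are candidates for removal that first need the text-versus-figure check, and the rest are kept. The check itself (peripheral and stroke-crossing features) is not implemented here, and `classify_by_area` is a hypothetical name.

```python
def classify_by_area(areas):
    """Label each sub-image area as 'remove', 'check' or 'keep' relative to the largest."""
    largest = max(areas)
    decisions = []
    for area in areas:
        if area * 8 < largest:        # below 1/8 of the largest: noise or small text
            decisions.append("remove")
        elif area * 4 < largest:      # between 1/8 and 1/4: needs the text/figure check
            decisions.append("check")
        else:                         # at least 1/4 of the largest: kept
            decisions.append("keep")
    return decisions
```

Integer comparisons such as `area * 8 < largest` avoid floating-point division when the areas are pixel counts.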
Third method: structure-based subgraph extraction. The two methods above handle most cases, but a small share of trademark images fit neither: in these images the graphical part has no clear advantage over the text and usually overlaps the text completely, so the methods above cannot give a correct result. Analysis of the errors made by the first two methods leads to the following conclusions: the situation in which the figure does not dominate in area usually arises when the width and height of the original trademark image differ greatly; moreover, compared with other images, these images have a fairly regular, that is, a definite, structure. By observation, the 12 structures shown in Fig. 5 were finally identified, in which rectangles represent text or text rows and circles represent the graphical part. Based on these conclusions, the invention adds structure-based subgraph extraction to handle such images. The procedure is as follows: first compute the width-to-height ratio of the image and check whether it exceeds a given threshold (1.8); for an image whose ratio exceeds the threshold, determine its structure; if it matches any of the structures in Fig. 5, keep the corresponding image part and finish.
It should be emphasized that most trademark images are horizontal, i.e. the width is greater than or close to the height. Vertical trademarks, whose height is much greater than their width, need no special handling: the original trademark image can be rotated 90 degrees counterclockwise and then processed by the methods described above.
After the key graphic has been obtained, the MEC (Maximum Extent Circle) algorithm is used to perform adaptive size normalization of the trademark image. Because the geometry of a trademark is very important classification information, adaptive size normalization normalizes original images of different sizes while preserving the original geometry as far as possible. To highlight the shape of the trademark image, edge extraction is applied to the normalized image. Several edge extraction algorithms were tried during research, including Shen-Castan, Canny, Sobel, SUSAN and Marr; after comparison, Canny was chosen as the edge extraction algorithm of the system.
Manual coding usually uses the global shape, local shape and graphical information of a pattern as classification criteria. As trademark figures become more and more complex, these criteria fit the development of trademarks less and less well, and manual coding is prone to error. Adding new criteria by hand would undoubtedly involve an enormous workload, whereas automatic coding by computer can address this problem well. Of course, finding all the shapes in a trademark by computer is quite difficult, but the problem can be simplified by searching for certain specific basic shapes, which reduces the complexity of subsequent work. Five basic shapes are defined: circle, ellipse, rectangle, rhombus and triangle. The system can automatically determine whether a trademark contains one of these five basic shapes and locate that basic subgraph.
The system combines two methods for this purpose: the Hausdorff distance and pattern understanding.
Matching with the Hausdorff distance is the shape matching algorithm proposed by D. P. Huttenlocher in 1993; the distance is defined as follows:
H(A,B)=max(h(A,B),h(B,A))
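The definition can be illustrated with a short Python sketch. It assumes the standard directed distance, in which h(A,B) is the largest distance from any point of A to its nearest point in B, and it represents A and B as lists of coordinate tuples. The brute-force double loop is for illustration only; the function names are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """h(A,B): the largest distance from a point of A to its nearest point in B."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """H(A,B) = max(h(A,B), h(B,A)); symmetric, unlike the directed distance."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


template = [(0, 0), (1, 0)]
candidate = [(0, 0), (4, 0)]
distance = hausdorff(template, candidate)
```

Here h(template, candidate) is 1 while h(candidate, template) is 3, so the symmetric distance is 3: a single outlying point in either set is enough to make the two shapes count as far apart.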
A number of basic shape templates are defined in advance. Impossible regions are first excluded using some geometric rules, and subgraph localization is then carried out in the remaining regions. To save localization time, closed shapes are located directly by pattern understanding, which mainly uses features of the basic subgraph such as symmetry and aspect ratio. Two operations are performed on a located subgraph: coding and removal. For a trademark image that contains a basic shape, the corresponding code is first assigned, and the basic subgraph is then removed from the trademark image. In the actual system, subgraph localization was implemented for the circle, ellipse, rhombus, triangle and rectangle. Because of computation and storage costs, only circle localization is used in the end. The accuracy of circle localization reaches 95% (of 200 images counted, 191 were located correctly).
Multi-component retrieval has recently been proposed in some of the literature on trademark retrieval. This method first segments the image into several mutually disconnected components, extracts features from each component separately, and obtains the similarity of the whole image by combining these features. For multi-component matching the key problem is how to obtain the sub-images. The simplest approach is to keep every connected component as a subgraph, but this has serious defects. First, an image may contain very many pieces, so the feature dictionary becomes very large; when the image database is large, keeping all components is hardly practical. Second, many trademark images are abstract figures and sketches composed of basic components, and they lose their original meaning once fully separated. To reduce these defects, the original connected components are merged, and the number of subgraphs per image is limited to at most three. The key issue is therefore how to merge connected components. Two approaches are given here: the first merges using rules and features; the second relies on image understanding, in which the basic shapes in the image are first located with the Hausdorff distance, the basic shapes are then removed from the image and only their codes are retained, and the remaining figure is kept.
Fig. 4 shows an example of preprocessing trademark images. The four different trademark images ((a), (b), (c), (d)) are all logos of China Telecom, and the graphical part (an artistic Chinese character embedded in a circular background) is common to all four. In manual coding, a trademark examiner can easily judge the four trademarks to be the same, but in existing trademark retrieval systems the features extracted from the four images differ greatly, and none of the images can retrieve the other three. The preprocessing results, however, show that the processed images are essentially consistent. As Fig. 4 shows, once the four different images have been input, the preprocessing and key graphic extraction stage extracts the same graphical part from each of the four original trademark images ((a1), (b1), (c1), (d1)); normalization then makes the key graphics, which differ considerably in original position and size, essentially consistent ((a2), (b2), (c2), (d2)); finally the edge extraction algorithm extracts edge information from the normalized images ((a3), (b3), (c3), (d3)). The results show that preprocessing highlights the key part of the four images and removes interfering information, greatly improving the reliability of the system.
After preprocessing, the system enters its fourth step: feature extraction. Feature extraction and pattern matching are the two pillars of pattern recognition. To classify patterns, various measurements must first be made of the properties of the objects to be recognized, that is, features reflecting the pattern must be extracted; features are the key to determining similarity and classification. Once the goal of classification is fixed, finding suitable features becomes the core problem of recognition, and many researchers have therefore worked on finding efficient features to improve system precision. Features reflect the shape information or energy information of a figure. Features commonly used in existing trademark systems include geometric moments, Zernike moments, edge orientation histograms, wavelet features and Fourier descriptors. These features are extracted from the original image, the edge map or a transform domain, and include both global and local features. Although they reflect some important characteristics of an image, each has its own shortcomings. To improve the precision of the system and describe the edge directions of trademark figures better, the system introduces, in addition to the commonly used features above, two new features: the directional line element feature and the Zoning feature. Both were first used in character recognition, where their superior performance has been verified. Experiments show that, as local features, the directional line element feature and the Zoning feature are more effective overall in this system than global features. Local features nevertheless have a defect of their own, namely the lack of rotation invariance. The system therefore uses several kinds of features and combines them to compute the similarity between different images.
The features used in the present invention and the principles behind them are described below.
Invariant moment features are often used to solve invariant pattern recognition problems; their advantage is invariance to translation, rotation and scaling. Hu was the first to derive a set of invariant moments from geometric moments, achieving rotation, translation and scale invariance; Zernike moments, Fourier orthogonal moments and others appeared later on this basis.
Fourier descriptors are the coefficients of the discrete Fourier transform (DFT) produced by frequency analysis of a shape. The Fourier descriptor feature is computed on the contour of the image: the contour is first extracted and discretized into n parts of equal length, a Fourier transform is then applied, and the resulting coefficients serve as the feature. Fourier descriptors play an important role in describing and discriminating shapes and likewise have rotation, translation and scale invariance. However, the Fourier descriptor feature requires the contour of the figure to be closed, and any contour that is not closed may cause problems; in addition, it cannot handle embedded curves. Most trademark images are composed of many figures, so this system uses the two-dimensional Fourier transform and computes energy statistics in the transform domain in place of the Fourier descriptor feature.
The edge orientation histogram is an edge direction feature that measures the similarity of different figures globally. Extracting it is fairly simple: the edges of the image are first extracted with the Canny operator; the edge directions are computed and quantized at equal intervals into 72 bins; the number of pixels falling into each bin is counted; and the resulting vector is normalized. The result is the edge orientation histogram feature. This feature has no rotation invariance.
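The quantization and normalization steps can be sketched in Python as follows. The sketch assumes the edge directions have already been computed (for example from the gradient at Canny edge pixels) and are given in degrees, that the 72 bins cover 0 to 360 degrees in 5-degree steps, and that the normalization divides by the total count; the text fixes none of these details, and the function name is hypothetical.

```python
def edge_orientation_histogram(angles_deg, bins=72):
    """Quantize edge directions into equal-width bins and normalize to unit sum."""
    width = 360.0 / bins
    counts = [0] * bins
    for angle in angles_deg:
        # Wrap into [0, 360) and guard against rounding up to the last edge.
        index = min(int((angle % 360.0) / width), bins - 1)
        counts[index] += 1
    total = sum(counts)
    return [count / total for count in counts] if total else [0.0] * bins


histogram = edge_orientation_histogram([0, 2.5, 5, 359.9, 360])
```

Normalizing by the total count makes histograms of images with different numbers of edge pixels comparable, which matters when they are later compared by distance.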
Mathematical morphology is a discipline built on integral geometry and probability theory, and a relatively new approach in image processing and pattern recognition. Compared with other spatial-domain or frequency-domain image processing and analysis methods, it has some clear advantages. In recent years it has been applied ever more widely and now reaches almost every aspect of image processing, for example noise filtering, edge detection, thinning, compression coding, feature extraction and shape analysis. Some reports also mention extracting features with morphological methods.
Compared with the Fourier transform and the Gabor transform, the wavelet transform is a localized time (space)-frequency analysis. Through dilation and translation it refines a signal (function) progressively at multiple scales, ultimately achieving fine time resolution at high frequencies and fine frequency resolution at low frequencies. It adapts automatically to the requirements of time-frequency analysis and can therefore focus on any detail of the signal, which solves a difficulty of the Fourier transform. Multi-scale wavelet decomposition provides both spatial and frequency information about an image and has been used in CBIR. In a multi-scale decomposition each level has four subbands: one low-frequency part and three high-frequency parts (vertical high frequency, horizontal high frequency, and high frequency in both the horizontal and vertical directions), and each subband corresponds to a coefficient matrix. A 3-level Daubechies wavelet decomposition produces 10 subbands (three high-frequency subbands per level plus the final low-frequency subband); computing the energy, mean and variance of each yields a 30-dimensional feature. These features describe the texture and shape of a trademark accurately.
Texture can be regarded as an approximately repetitive distribution of similar shapes. The difficulty in describing texture is that it is closely bound up with object shape; the endless variety of object shapes and their nested distributions make texture classification very difficult. In the early 1970s Haralick et al. proposed the co-occurrence matrix representation of texture features: a co-occurrence matrix is first constructed from the direction and distance between pixels, and meaningful statistics are then extracted from the matrix as the texture representation. Tamura et al. proposed a texture representation from the standpoint of the psychology of vision, in which every texture attribute has an intuitive visual meaning. This makes the Tamura texture representation very attractive in image retrieval, since it can support a friendlier user interface. The QBIC and MARS systems refined this texture representation further.
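The co-occurrence idea can be made concrete with a small Python sketch. It assumes an image whose pixel values are already quantized to integer grey levels `0 .. levels-1`, and it counts ordered pixel pairs separated by a fixed offset `(dx, dy)`, which encodes the direction and distance mentioned above. Contrast is shown as one example of a statistic drawn from the matrix; the function names are hypothetical.

```python
def cooccurrence_matrix(image, levels, dx=1, dy=0):
    """Count how often grey level i has grey level j at offset (dx, dy)."""
    height, width = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for y in range(height):
        for x in range(width):
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height:
                matrix[image[y][x]][image[ny][nx]] += 1
    return matrix


def contrast(matrix):
    """Contrast statistic: sum of (i - j)^2 weighted by the pair probabilities."""
    total = sum(sum(row) for row in matrix)
    weighted = sum((i - j) ** 2 * count
                   for i, row in enumerate(matrix)
                   for j, count in enumerate(row))
    return weighted / total


glcm = cooccurrence_matrix([[0, 0, 1], [0, 0, 1]], levels=2)
```

Different offsets give different matrices, so a texture descriptor of this kind typically collects statistics over several directions and distances.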
The features described above are common in image retrieval systems and have achieved good results, but they also have a defect: they do not bring out the local information of an image well. To describe trademark images better, the present invention introduces two new features in addition to these common ones. Although the two features are common in character recognition, no one has yet tried them in image retrieval; in the implementation, the original feature extraction methods were modified somewhat to suit trademark images better. The principle and implementation of the two features are described below.
Directional line element feature: the effectiveness of the directional line element feature has been verified in character recognition. In this system a trademark image, like a character image, is a binary image with a definite shape, so introducing the high-performance directional line element feature can be expected to work well. A directional line element is formed by two adjacent black pixels in the horizontal, vertical, +45° or -45° direction. It is determined within a 3×3 neighborhood: taking a black pixel as the center, the distribution of black pixels among its 8 neighbors is examined, and if it matches any of the cases in Fig. 6, a value is added to the directional line element count for the corresponding direction. Image edges are fairly smooth, and right or acute angles rarely occur, whereas noise often appears in that form; the directional line elements are therefore limited to the 12 connection cases shown in Fig. 6. Because the directional line element feature is used mainly to capture the local features of an image, the image is divided into blocks and the directional line element values are accumulated for each block separately.
Zoning feature: Kimura and Shridhar applied zoning to contour curves. The image is first divided into several blocks of equal size. Within each block the contour is decomposed into pairs of adjacent pixels, which fall into the following directions: horizontal (0°), vertical (90°) and the two diagonals (45°, 135°). The number of pixels in each direction is counted separately.
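A minimal Python sketch of such a zoning feature is given below. It assumes the input is already a binary contour image (a list of rows of 0/1 values), counts each pair of adjacent contour pixels once by looking only at the four forward neighbors, and credits the pair to the block containing its first pixel. These conventions and the name `zoning_feature` are illustrative assumptions rather than the exact procedure of Kimura and Shridhar.

```python
# Forward neighbour offsets (dx, dy) with y pointing down, one per direction:
# 0 degrees, 45 degrees, 90 degrees, 135 degrees.
DIRECTIONS = [(1, 0), (-1, 1), (0, 1), (1, 1)]


def zoning_feature(image, grid_rows, grid_cols):
    """Return per-block counts of adjacent contour pixel pairs in four directions."""
    height, width = len(image), len(image[0])
    feature = [0] * (grid_rows * grid_cols * 4)
    for y in range(height):
        for x in range(width):
            if not image[y][x]:
                continue
            # Index of the block that contains pixel (x, y).
            block = (y * grid_rows // height) * grid_cols + (x * grid_cols // width)
            for k, (dx, dy) in enumerate(DIRECTIONS):
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height and image[ny][nx]:
                    feature[block * 4 + k] += 1
    return feature


contour = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
]
feature = zoning_feature(contour, grid_rows=2, grid_cols=2)
```

With a 2×2 grid the feature has 16 entries, four direction counts per block; in the example the top-left block holds one horizontal pair, the top-right block one 45° pair and the bottom-left block one vertical pair.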
After preprocessing and feature extraction comes the fifth step: dimensionality reduction. Dimensionality reduction is an important stage in image retrieval over large sample sets: it greatly improves retrieval speed and reduces storage, and a good dimensionality reduction strategy can also improve retrieval quality. Trademark retrieval is in essence an unsupervised recognition problem, and problems of this kind are usually handled with PCA (principal component analysis). PCA removes the correlation between the components of the original vector, so that coordinate axes carrying little information can be dropped to reduce the dimensionality of the feature space. Although PCA guarantees that the overall entropy after the transform is minimal, reconstruction error remains. M. E. Tipping assumed that the reconstruction error approximately follows a Gaussian distribution, proposed probabilistic PCA (PPCA), introduced the reconstruction error into the reconstruction formula, and established the corresponding model. The optimal reconstruction error variance is then computed by maximum likelihood estimation, and on this basis the optimized dimensionality reduction matrix is obtained. PPCA has achieved better results than PCA in digit recognition, and it is introduced here into trademark retrieval.
Once all this processing is complete, the sixth step, matching, begins. Matching means that, after the features of the query image have been obtained, the similarity between the query image and each image in the database is measured under a chosen metric, the images are sorted by similarity, and a group of similar images is returned. The metric used in this invention is the Euclidean distance. For image retrieval over very large sample sets, retrieval speed is a key technical indicator of the system. Unlike existing trademark retrieval systems, this invention uses a layered matching strategy: the first layer uses only a few feature dimensions with strong discriminative power and screens out candidates that are dissimilar to the query image (those at a large distance). The second layer increases the number of feature dimensions used for classification and again removes dissimilar candidates from those remaining. This is repeated until the number of remaining candidates meets the requirement. The final output is sorted with quicksort. The key to the fast search strategy is the candidate screening rule; a good rule greatly improves retrieval speed. The present invention uses a histogram filtering scheme, whose algorithm is as follows:
1. Compute the distances from the image to be retrieved to all samples in the database, i.e. $d_{r_1}(x), d_{r_2}(x), \ldots, d_{r_N}(x)$.
2. Define two positive integers D and R, and introduce an array {Q(i)} of length N. D and R are normalization parameters: R is the number of histogram levels, and D is chosen so that Q(i) ≤ R:
$Q(i) = d_{r_i}(x)/D, \quad i = 1, 2, \ldots, N.$
3. Set up a histogram array Num[1...R], initialize the distance histogram to zero, and fill it:
For i from 1 to N
Num[Q(i)]=Num[Q(i)]+1;
4. Accumulate the histogram and find a reasonable threshold T according to the output requirement, where S is the required number of output candidates:
For i from 2 to R
Num[i]=Num[i]+Num[i-1]
If Num[i]≥S
{
T=i×D,
and then exit.
}
With this fast search algorithm, a retrieval over 300,000 samples takes less than two seconds; if the candidates are not displayed, it takes less than 0.5 seconds.
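Steps 1 to 4 and the layered matching strategy can be sketched together in Python. Several details are assumptions made for the sketch: distances are mapped to integer bins by rounding d/D up and clamping to 1..R, the cumulative test also covers the first bin, `layer_dims` lists how many leading feature dimensions each layer uses, and Python's built-in `sorted` stands in for the quicksort mentioned in the text. All function and parameter names are hypothetical.

```python
import math


def histogram_threshold(distances, D, R, S):
    """Return the smallest T = i * D whose cumulative histogram count reaches S."""
    num = [0] * (R + 1)                           # num[1..R]; index 0 is unused
    for d in distances:
        q = min(R, max(1, math.ceil(d / D)))      # assumed rounding rule for Q(i)
        num[q] += 1
    total = 0
    for i in range(1, R + 1):                     # accumulate until S samples are covered
        total += num[i]
        if total >= S:
            return i * D
    return R * D                                  # fewer than S samples overall


def layered_search(query, database, layer_dims, D, R, S):
    """Coarse-to-fine matching: each layer uses more dimensions on fewer candidates."""
    candidates = list(range(len(database)))
    kept = [(0.0, i) for i in candidates]
    for dims in layer_dims:
        dists = [math.dist(query[:dims], database[i][:dims]) for i in candidates]
        threshold = histogram_threshold(dists, D, R, S)
        kept = [(d, i) for d, i in zip(dists, candidates) if d <= threshold]
        candidates = [i for _, i in kept]
    return [i for _, i in sorted(kept)]           # final ranking by distance


database = [(0, 0, 0), (1, 0, 0), (1, 5, 5), (9, 9, 9), (0.5, 0, 1)]
ranking = layered_search((0, 0, 0), database, layer_dims=[1, 3], D=1.0, R=20, S=2)
```

The threshold is found in one pass over the distances plus one pass over R bins, without sorting all N distances, which is where a scheme of this kind saves time on large sample sets.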
Seventh step: relevance feedback. Retrieval based only on low-level image features rarely gives satisfactory results, mainly because there is a large gap between low-level image features and high-level semantics. To address this, better and more effective image representations need to be developed on the one hand; on the other hand, the association between low-level features and high-level semantics can be captured and established through interaction, which is the technique known as relevance feedback. Relevance feedback was first used in traditional text retrieval systems. Its basic idea is that during retrieval the system returns results according to the user's query, the user evaluates and marks the results and feeds this information back to the system, and the system learns from the feedback and returns new query results, so that the results better satisfy the user's needs. Whether two images are similar is to a large extent subjective, and different people have different standards of similarity, especially for patterns as complex as trademarks. To make use of the user's subjective judgment, relevance feedback can be implemented in the system: the user selects from the retrieval results the patterns considered relevant or irrelevant, and the system automatically updates the features and other weights and queries again. Relevance feedback techniques in content-based retrieval fall broadly into four types: parameter adjustment methods, clustering methods, probabilistic learning methods and neural network methods. For trademark images, the present invention proposes a simple and effective parameter adjustment method that modifies the weights of the feature points:
where Q is the original query point, $Q^{+}$ is the set of positive (relevant) samples, $n^{+}$ is the number of positive samples, $Q^{-}$ is the set of negative (irrelevant) samples, $n^{-}$ is the number of negative samples, and Q' is the query point feature after the update. Relevance feedback can greatly improve the precision of trademark retrieval and yields results that satisfy the user.
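The update formula itself is not reproduced in the text above. The quantities it lists (the original query, the positive and negative sample sets and their sizes) are the ones used by a Rocchio-style query update, so the Python sketch below assumes that form: the query is moved toward the mean of the positive samples and away from the mean of the negative samples. The weights `alpha`, `beta` and `gamma` are hypothetical parameters, and this is not necessarily the formula of the invention.

```python
def update_query(q, positives, negatives, alpha=1.0, beta=1.0, gamma=1.0):
    """Assumed Rocchio-style update: Q' = alpha*Q + beta*mean(Q+) - gamma*mean(Q-)."""
    dim = len(q)

    def mean(vectors):
        if not vectors:                       # an empty feedback set contributes nothing
            return [0.0] * dim
        return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]

    pos_mean, neg_mean = mean(positives), mean(negatives)
    return [alpha * q[k] + beta * pos_mean[k] - gamma * neg_mean[k] for k in range(dim)]


new_query = update_query([1.0, 1.0],
                         positives=[[2.0, 0.0], [4.0, 2.0]],
                         negatives=[[1.0, 1.0]])
```

After each round of user marking, the updated query point would be used to run the matching step again.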