
CN112560902A - Book identification method and system based on spine visual information - Google Patents


Info

Publication number
CN112560902A
CN112560902A
Authority
CN
China
Prior art keywords
spine
book
picture
visual
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011383651.5A
Other languages
Chinese (zh)
Inventor
孙坦
周硕
柴秀娟
张文蓉
鲜国建
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Agricultural Information Institute of CAAS
Original Assignee
Agricultural Information Institute of CAAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Agricultural Information Institute of CAAS filed Critical Agricultural Information Institute of CAAS
Priority to CN202011383651.5A priority Critical patent/CN112560902A/en
Publication of CN112560902A publication Critical patent/CN112560902A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract


The invention provides a book identification method and system based on the visual information of the book spine, comprising: collecting pictures of the spines of books on library shelves and manually labeling them to construct spine segmentation and spine classification data sets; constructing a convolutional neural network for deep feature extraction from spine pictures and training it on the training data to obtain a feature extraction model; and photographing the spine side of books on a shelf, first performing instance segmentation on the spines, then using the trained model to obtain the visual feature vector corresponding to each spine picture, and matching it against the holdings database to identify the category of the book to which the spine corresponds. The invention recognizes spine pictures with a deep learning algorithm, uses all the visual information of the spine target, is not limited by the dictionary set on which text recognition methods depend, supports newly added books in the collection, and achieves higher accuracy with better robustness and extensibility; spine pictures of a series of books on a shelf can be identified in batches.


Description

Book identification method and system based on spine visual information
Technical Field
The invention relates to the field of book information management, in particular to a book identification method and system based on spine visual information.
Background
At present, book information management still relies on manual arrangement based on human-eye identification; on radio-frequency identification technology, with customized electronic tags (RFID) and non-contact signal receivers; or on character recognition technology, using picture acquisition devices with optical character recognition algorithms or deep-learning-based character recognition algorithms.
Based on human-eye identification, books on the shelf are taken, placed, distinguished, and classified manually. Based on radio frequency identification technology: an electronic tag (RFID) is installed on each book and its information is entered; when the information needs to be identified, it is collected with non-contact equipment. Based on character recognition technology: this includes methods based on Optical Character Recognition (OCR) and on deep learning. These methods mainly perform character recognition on the cover, spine, or call-number label of a book and use the result for text retrieval in the library database. For example, book sorting systems, book receiving platforms (CN201610632579.2), library book-fetching robots (CN104552230A), and the automatic collection method and system for digital resources of publications (CN104424271B) are based on optical character recognition, but they do not address recognition of book spines and can only classify books by recognizing covers or contents. In actual use, books are densely placed on the bookshelf with only the spines exposed, so the prior art struggles to identify book categories in pictures containing multiple spines.
The prior art also has the following technical defects. Human-eye identification consumes enormous time and labor cost, severely limiting working efficiency. Radio-frequency-identification-based technologies require heavy up-front construction, rely on proprietary equipment and systems, and are costly. Character-recognition-based methods are sensitive to the thickness and wear of books and to the diversity of artistic fonts, and are therefore unstable; furthermore, character classification is inherently restricted to the range of the dictionary set, i.e., unknown languages and fonts outside the dictionary cannot be identified, so the approach is not extensible.
Disclosure of Invention
The invention provides a spine recognition method which is low in cost, high in speed, high in precision and automatic, and solves the problem of book recognition on a library shelf and similar scenes.
Aiming at the defects of the prior art, the invention provides a book identification method based on spine visual information, which comprises the following steps:
step 1, obtaining book spine pictures labeled with spine segmentation as a training set, training a deep convolutional neural network model for segmenting the spine through the training set to obtain a spine segmentation model, and performing instance segmentation on the collected pictures of books on the shelf with the spine segmentation model to obtain a plurality of spine pictures;
step 2, marking book categories for each spine picture to construct a spine classification data set, training a deep convolutional neural network model for spine classification through the spine classification data set to obtain a spine feature extraction model, extracting spine visual features of each book in a book database by using the spine feature extraction model, and integrating the spine visual features to construct a spine visual database;
step 3, inputting the to-be-recognized spine picture containing a plurality of spines into the spine segmentation model for instance segmentation, inputting the segmentation result into the spine feature extraction model to obtain the visual feature vectors of the spines in the to-be-recognized spine picture, and matching the visual feature vectors with the database to recognize the book categories of the spines in the to-be-recognized spine picture.
The book identification method based on spine visual information, wherein step 1 comprises a data set construction step: photographing the books on the shelf from multiple angles with a picture acquisition device, and, for each spine region in the captured pictures, determining four coordinate points (x_N, y_N)_i, N ∈ [1,4], that form a closed quadrangle b_i, which is box-selected to label the spine for segmentation.
The book identification method based on spine visual information, wherein step 2 comprises a book category labeling step: obtaining all spine regions B_i in the spine pictures; computing, for each spine region B_i, the four vertices (X_N, Y_N)_i, N ∈ [1,4], of its minimum circumscribed rectangle R_i and the inclination angle θ_i of the long side of R_i; rotating the original image by θ_i via an affine transformation and then cropping according to (X_N, Y_N)_i, N ∈ [1,4], to obtain a regularized spine picture BE_i; and manually labeling each spine picture BE_i with a category label, wherein spine pictures of the same book have the same label.
The book identification method based on spine visual information, wherein the construction of the deep convolutional neural network model for spine classification in step 2 comprises: constructing a multi-layer deep convolutional neural network from residual modules as the feature extraction network m_2, and appending to m_2 a fully connected classification layer (classifier) that uses an additive angular margin loss function, thereby obtaining the structure of the deep convolutional neural network model for spine classification;
the step 2 includes training a model M according to a paradigm of a classification task using the spine classification dataset2=m2+ classifier: inputting a spine picture which is zoomed into a fixed size, training a label, M, to which the output spine picture belongs2Extracting the features in the model into a network m after training2Output feature map FiAs a visual feature vector for the spine.
The book identification method based on spine visual information, wherein step 3 comprises: feeding the spine picture to be identified into the spine segmentation model to obtain the spine pictures BE_i of all books in the picture; during identification, measuring with cosine similarity the similarity between two spine visual representation vectors F_a = [a_1, a_2, …, a_512] and F_b = [b_1, b_2, …, b_512]; the spine feature extraction model m_2 computes the visual representation F_i of each spine picture BE_i and performs a nearest-neighbor search against the spine visual database to obtain several spine category ids with the highest similarity to the target spine picture, the category id with the highest similarity serving as the final identification result.
The invention also provides a book identification system based on spine visual information, which comprises:
the system comprises a first training module, a second training module and a third training module, wherein the first training module is used for acquiring book spine pictures marked with book spine segmentation as a training set, training a deep convolution neural network model for segmenting book spines through the training set to obtain a book spine segmentation model, and using the book spine segmentation model to perform example segmentation on collected book pictures on a shelf to obtain a plurality of book spine pictures;
the second training module is used for marking book categories for each spine picture, constructing a spine classification data set, training a deep convolutional neural network model for spine classification through the spine classification data set to obtain a spine feature extraction model, extracting the spine visual features of each book in a book database by using the spine feature extraction model, and integrating the spine visual features to construct a spine visual database;
and the identification module is used for inputting the to-be-identified book spine picture containing a plurality of book spines into the book spine segmentation model for instance segmentation, inputting the segmentation result into the book spine feature extraction model to obtain the visual feature vector of each book spine in the to-be-identified book spine picture, and matching the visual feature vector with the database to identify the book category of each book spine in the to-be-identified book spine picture.
The book identification system based on spine visual information, wherein the first training module comprises: photographing the books on the shelf from multiple angles with a picture acquisition device, and, for each spine region in the captured pictures, determining four coordinate points (x_N, y_N)_i, N ∈ [1,4], that form a closed quadrangle b_i, which is box-selected to label the spine segmentation.
The book identification system based on spine visual information, wherein the second training module comprises: obtaining all spine regions B_i in the spine pictures; computing, for each spine region B_i, the four vertices (X_N, Y_N)_i, N ∈ [1,4], of its minimum circumscribed rectangle R_i and the inclination angle θ_i of the long side of R_i; rotating the original image by θ_i via an affine transformation and then cropping according to (X_N, Y_N)_i, N ∈ [1,4], to obtain a regularized spine picture BE_i; and manually labeling each spine picture BE_i with a category label, wherein spine pictures of the same book have the same label.
The book identification system based on spine visual information, wherein the construction process of the deep convolutional neural network model for spine classification in the second training module comprises: constructing a multi-layer deep convolutional neural network from residual modules as the feature extraction network m_2, and appending to m_2 a fully connected classification layer (classifier) that uses an additive angular margin loss function, thereby obtaining the structure of the deep convolutional neural network model for spine classification;
the second training module includes: training a model M according to a paradigm of classification tasks using the spine classification dataset2=m2+ classifier: inputting a spine picture which is zoomed into a fixed size, training a label, M, to which the output spine picture belongs2Extracting the features in the model into a network m after training2Output feature map FiAs a visual feature vector for the spine.
The book identification system based on spine visual information, wherein the identification module comprises: feeding the spine picture to be identified into the spine segmentation model to obtain the spine pictures BE_i of all books in the picture; during identification, measuring the similarity between two spine visual representation vectors with cosine similarity; the spine feature extraction model m_2 computes the visual representation F_i of each spine picture BE_i and performs a nearest-neighbor search against the spine visual database to obtain several spine category ids with the highest similarity to the target spine picture, the category id with the highest similarity serving as the final identification result.
According to the scheme, the invention has the advantages that:
the book identification is a core step of most book management work, the technology of the application can automate the step under the conditions of low cost and high precision, thereby greatly reducing the manpower and finally achieving the purpose of replacing manual book arrangement by a machine. The method identifies the spine pictures of the book based on the deep learning algorithm, does not need to configure complicated hardware facilities, and ensures low cost; all visual information of the spine target is utilized, the method is not limited by a dictionary set on which a character recognition method depends, newly added books are supported in the collection of the books, and the method has higher accuracy rate and better robustness and expandability; according to different application requirements, the spine pictures of a single spine or a series of books on the shelf can be identified individually or in batches, and the high efficiency of book identification is ensured.
Drawings
FIG. 1 is a flow chart of the technical solution;
FIG. 2 is an example of a multi-view spine information collection picture;
FIG. 3 is a manually labeled spine region (white quadrangle is labeled box);
FIG. 4 is an example of a spine segmentation, wherein different color masks represent different spine regions;
FIG. 5 is an example of spine picture extraction;
fig. 6 is a spine feature extraction model.
Detailed Description
Aiming at the book identification problem on a library shelf or other scenes, spine pictures are identified so as to determine the category of the spine pictures. The method mainly comprises the following steps: 1) collecting book spine pictures of books on a library shelf, and manually marking the pictures to construct a spine segmentation and spine classification data set; 2) constructing a convolutional neural network for extracting the depth features of the spine image, and training by using training data to obtain a feature extraction model; 3) in the testing process, a picture of one side of the spine of a book on the shelf can be shot, the spine is subjected to instance segmentation, then a trained model is utilized to obtain visual feature vectors corresponding to the spine picture, and then the spine picture is matched with a library database to identify the class of the book corresponding to the spine.
In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below. In order to achieve the above object, the present invention provides a spine recognition method based on deep convolutional neural network as shown in fig. 1, including the following steps:
1. and training a spine segmentation model. First, in the real environment of a library, a large number of books on shelves are collected. Then, manually marking the partially collected book pictures on the rack to construct a spine segmentation data set; as an example segmentation task, a deep convolutional neural network model is designed that implements the segmentation of the spine, and the spine segmentation model is trained end-to-end using the spine segmentation dataset.
2. Training the spine classification model. The trained spine segmentation model performs instance segmentation on the collected shelf pictures. The class id of each segmented spine picture is labeled manually to construct a spine classification data set, and a deep convolutional neural network model for spine classification is designed. As a classification task, the model is trained end-to-end with the spine classification data set, and the spine feature extraction network is derived from the model.
3. Spine recognition. First, the trained spine feature extraction model computes the spine visual representation of each book in the library database, and these representations are added to and stored in the database. When identifying books on a shelf, a picture is taken of the spine side of a row of target books; the spine segmentation model automatically segments all spine regions, the spine feature extraction model computes the visual representation of each region, and finally a nearest-neighbor search over the library database entries with these representations determines the library record corresponding to each target spine.
The invention is a software-algorithm solution to the spine classification and book identification problems; it requires no installation or configuration of a complex hardware system, and it replaces manual labor in the key steps of book identification, greatly reducing labor cost. During identification, all visual characteristics of the spine region of the target picture are used, not merely the character information, so the method can recognize spines in any language or artistic design and better withstand factors such as ambient lighting changes and book wear. Because the recognition result is determined by feature vector matching, the dependence of character recognition methods on a dictionary set is removed, and newly added library books are conveniently supported.
Training of spine instance segmentation model
1) Constructing the spine segmentation data set. In a real library scene, books on shelves are photographed with an RGB picture acquisition device. To obtain different views of the same spine after segmentation, each shelf of books is photographed from three different angles (as shown in fig. 2), keeping most of each book within the frame while ensuring the picture is clear. In this embodiment, about 300 books on shelves were collected, and the original picture size is 1080 × 1920 pixels. 90 shelf pictures are manually labeled: for each spine region in a picture, four coordinate points (x_N, y_N)_i, N ∈ [1,4], are determined to form a closed quadrangle b_i, which is box-selected (as in fig. 3) to construct the spine segmentation data set. 80% of the data are used for training and 20% for testing.
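For concreteness, the four-point spine annotations (x_N, y_N)_i could be serialized in a COCO-style polygon record, a common convention for instance segmentation data sets. The sketch below is illustrative only; the patent does not specify a storage format, and all field names are assumptions.

```python
# Sketch: store one four-point spine annotation (x_N, y_N)_i as a
# COCO-style polygon record. Field layout is illustrative, not the patent's.

def make_spine_annotation(image_id, spine_id, quad):
    """quad: four (x, y) corner points of one spine quadrangle b_i."""
    assert len(quad) == 4
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    # Flatten to the COCO polygon layout [x1, y1, x2, y2, ...].
    polygon = [coord for point in quad for coord in point]
    return {
        "image_id": image_id,
        "id": spine_id,
        "category_id": 1,          # single foreground class: "spine"
        "segmentation": [polygon],
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        "iscrowd": 0,
    }

ann = make_spine_annotation(0, 1, [(10, 5), (40, 8), (38, 300), (8, 297)])
```

Records like this can then be aggregated per image into the usual COCO `annotations` list for training a segmentation model.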
Fig. 3 shows the manually labeled spine regions (the white quadrangles are the label boxes).
2) Training the spine segmentation model. Instance segmentation, a task in computer vision, not only detects the position of an object in a picture but also segments it from the background at the pixel level. The spine segmentation task can be realized with a mature instance segmentation model (e.g., the Mask R-CNN framework). The spine segmentation model is trained end-to-end with the spine segmentation data set: the original shelf pictures and the corresponding spine box annotations are input, and after training the model segments and outputs all spine regions (as shown in fig. 4).
3) Other possible embodiments. In this step, the shelf pictures may be collected from a different number of viewing angles when photographing the same bookshelf, possibly in an archive or other similar scene; the spine instance segmentation model for extracting spine regions may also follow other architectures, such as PolarMask, SOLO, or BlendMask.
Training of spine classification models
1) Acquiring spine pictures and constructing the spine classification data set. After the spine segmentation model M_1 is trained, instance segmentation is performed on all collected shelf pictures to obtain every spine region B_i. Because each segmented region B_i output by the model is an irregular area made up of the pixels classified as book, the four vertices (X_N, Y_N)_i, N ∈ [1,4], of its minimum circumscribed rectangle R_i and the inclination angle θ_i of the long side of R_i are computed; the original image is rotated by θ_i via an affine transformation and then cropped according to (X_N, Y_N)_i, N ∈ [1,4], to obtain a regularized spine picture BE_i (see fig. 5). The spine pictures are manually labeled with category labels so that spine pictures of the same book share the same label.
2) Extracting the visual representation of the spine picture. An 18-layer deep convolutional neural network built from residual modules serves as the feature extraction network m_2, with a fully connected classification layer (classifier) using the additive angular margin loss function appended at the end (see fig. 6 and Equation 1). Using the spine classification data set, the model M_2 = m_2 + classifier is trained following the classification-task paradigm: the input is a spine picture scaled to a fixed size (800 × 80), and the training target is the correct label (i.e., the class id) of the spine picture. After M_2 is trained, the feature map F_i output by m_2 is used as the visual representation of the spine.
L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{s\cos(\theta_{y_i}+m)}}{e^{s\cos(\theta_{y_i}+m)}+\sum_{j=1,\,j\neq y_i}^{n}e^{s\cos\theta_j}}  (Equation 1)
where N is the number of samples in the mini-batch, s and m are hyperparameters of the method, y_i is the ground-truth category of sample i, n is the number of categories, and θ_j is the angle between the j-th class weight and the feature vector during model computation.
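Equation 1 is an additive angular margin (ArcFace-style) loss. A plain-NumPy sketch, written directly from the symbols above, is given below; the function name and the choice to take precomputed angles as input are my own.

```python
# Sketch of the additive angular margin loss of Equation 1 in plain NumPy.
# theta holds the angles theta_j between each sample's feature and every
# class weight; s and m are the hyperparameters named in the text.
import numpy as np

def additive_angular_margin_loss(theta, labels, s=64.0, m=0.5):
    """theta: (N, n) angles in radians; labels: (N,) ground-truth class ids."""
    N = theta.shape[0]
    logits = s * np.cos(theta)                      # s * cos(theta_j)
    rows = np.arange(N)
    # Add the angular margin m only to the target-class angle theta_{y_i}.
    logits[rows, labels] = s * np.cos(theta[rows, labels] + m)
    # Standard softmax cross-entropy over the margin-adjusted logits.
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[rows, labels].mean()
```

Because the margin m only shrinks the target-class logit, the loss with m > 0 is strictly larger than the plain softmax loss on the same angles, which is what forces the tighter angular clustering the text relies on.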
3) Other possible embodiments. In this step, the spine classification model may be built from more layers of residual modules, or from other classical feature extraction networks such as VGG or Inception, or from a self-designed deep convolutional network; the dimensionality of the feature vector ultimately taken for a single spine picture may also vary.
Book identification
1) The feature extraction network m_2 computes visual representations of all spines in the library; in this example, the visual representation F_i of each book is a 512-dimensional vector. All vectors are stored in a single file Dict and saved in the library database so that it can be read in once during retrieval.
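Storing every representation vector F_i in one file, as described for Dict, might be done with NumPy's `savez`; the file layout and function names below are illustrative assumptions.

```python
# Sketch: persist the library's spine representations F_i in a single file
# (the "Dict" in the text) so retrieval can load them all at once.
import numpy as np

def save_gallery(path, features):
    """features: {book_id (str): 512-d np.ndarray}."""
    np.savez(path, **features)

def load_gallery(path):
    data = np.load(path)
    return {book_id: data[book_id] for book_id in data.files}
```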
2) To identify the class ids of a row of target books, a picture of the spine side is first taken and fed into the spine segmentation model M_1 to obtain the spine pictures BE_i of all books in the picture. During identification, cosine similarity (Equation 2) measures the similarity between two spine visual representation vectors F_a = [a_1, a_2, …, a_512] and F_b = [b_1, b_2, …, b_512], where F_a is the representation of a spine in the picture to be recognized and F_b is a representation from the spine visual database. The spine feature extraction model m_2 computes the visual representation F_i of each spine picture BE_i and performs a nearest-neighbor search against Dict in the library database to obtain the five spine class ids (top-5) with the highest similarity to the target spine picture; the id with the highest similarity is taken as the final identification result.
\mathrm{sim}(F_a,F_b)=\frac{F_a\cdot F_b}{\|F_a\|\,\|F_b\|}=\frac{\sum_{k=1}^{512}a_k b_k}{\sqrt{\sum_{k=1}^{512}a_k^2}\,\sqrt{\sum_{k=1}^{512}b_k^2}}  (Equation 2)
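The top-5 retrieval against the gallery can be sketched in a few lines of NumPy directly from the cosine similarity of Equation 2; this is a minimal illustration, and the patent does not prescribe an implementation.

```python
# Sketch: cosine-similarity nearest-neighbor search returning the five
# gallery ids most similar to a query spine vector F_a (Equation 2).
import numpy as np

def top5_spine_ids(query, gallery):
    """query: (512,) vector F_a; gallery: {book_id: (512,) vector F_b}."""
    ids = list(gallery)
    G = np.stack([gallery[i] for i in ids])         # (n_books, 512)
    sims = (G @ query) / (np.linalg.norm(G, axis=1) * np.linalg.norm(query))
    order = np.argsort(-sims)[:5]                   # highest similarity first
    return [ids[k] for k in order]
```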
3) Other possible embodiments. Other loss functions may be used to train the spine classification network. When building the library database with the feature extraction network, one file may instead be stored per book's representation vector, with the files read in and matched in a loop during retrieval. For the feature-vector nearest-neighbor search, other criteria may evaluate the similarity between vectors, such as Euclidean distance or other distance measures.
In this embodiment, a simulated target database (probe) containing 5580 spine pictures to be recognized and a test database (gallery) containing 3700 collected spine pictures were constructed. Each spine picture in the probe set is traversed and a nearest-neighbor search is performed against the visual representations Dict of the gallery, taking the most similar image as the final class id result. Statistical analysis shows that the book class id identification accuracy reaches 99.32%. Most matching errors involve books of the same series whose spines are too similar; since books of the same series are generally shelved in the same area, the shelf-position judgment accuracy reaches 99.93%, meeting the requirements of shelving and retrieving books.
The following are system examples corresponding to the above method examples, and this embodiment can be implemented in cooperation with the above embodiments. The related technical details mentioned in the above embodiments are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related-art details mentioned in the present embodiment can also be applied to the above-described embodiments.
The invention also provides a book identification system based on spine visual information, which comprises:
the system comprises a first training module, a second training module and a third training module, wherein the first training module is used for acquiring book spine pictures marked with book spine segmentation as a training set, training a deep convolution neural network model for segmenting book spines through the training set to obtain a book spine segmentation model, and using the book spine segmentation model to perform example segmentation on collected book pictures on a shelf to obtain a plurality of book spine pictures;
the second training module is used for marking book categories for each spine picture, constructing a spine classification data set, training a deep convolutional neural network model for spine classification through the spine classification data set to obtain a spine feature extraction model, extracting the spine visual features of each book in a book database by using the spine feature extraction model, and integrating the spine visual features to construct a spine visual database;
and the identification module is used for inputting the to-be-identified book spine picture containing a plurality of book spines into the book spine segmentation model for instance segmentation, inputting the segmentation result into the book spine feature extraction model to obtain the visual feature vector of each book spine in the to-be-identified book spine picture, and matching the visual feature vector with the database to identify the book category of each book spine in the to-be-identified book spine picture.
In the book identification system based on spine visual information, the first training module includes: using an image capture device to photograph books on the shelf from multiple angles, and, in each captured image, determining four coordinate points (x_N, y_N)_i, N ∈ [1, 4], for each spine region to form a closed quadrilateral b_i that frames the spine, thereby annotating the spine segmentation.
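For illustration, a minimal Python sketch (the function and field names are illustrative, not from the patent) of packaging the four annotated corner points (x_N, y_N)_i as one closed-quadrilateral spine label b_i, using the shoelace formula to reject degenerate annotations:

```python
def shoelace_area(quad):
    """Signed area of a closed polygon given as [(x1, y1), ..., (x4, y4)]."""
    area = 0.0
    n = len(quad)
    for k in range(n):
        x1, y1 = quad[k]
        x2, y2 = quad[(k + 1) % n]  # wrap around to close the quadrilateral
        area += x1 * y2 - x2 * y1
    return area / 2.0

def make_spine_annotation(points):
    """Package four (x, y) corner points as one spine-segmentation label."""
    assert len(points) == 4, "a spine quadrilateral needs exactly 4 points"
    assert abs(shoelace_area(points)) > 0, "degenerate (zero-area) quadrilateral"
    return {"quad": points}

# Annotate one near-vertical spine region in a shelf photo.
ann = make_spine_annotation([(10, 5), (30, 5), (30, 200), (10, 200)])
```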
In the book identification system based on spine visual information, the second training module includes: obtaining all spine regions B_i in the spine pictures; computing, for each spine region B_i, its minimum bounding rectangle R_i, the four vertices (X_N, Y_N)_i, N ∈ [1, 4], of R_i, and the inclination angle θ_i of the long side of R_i; applying an affine transformation that rotates the original image by θ_i, then cropping according to (X_N, Y_N)_i to obtain a rectified spine picture BE_i; and manually labeling each spine picture BE_i with a category label, where spine pictures of the same book share the same label.
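The geometry of this rectification step can be sketched with plain numpy on the rectangle's vertices (a real pipeline would apply the same rotation to the pixels, e.g. with OpenCV's minAreaRect and warpAffine; the function names below are illustrative):

```python
import math

import numpy as np

def long_side_angle(rect):
    """Inclination angle theta_i (radians) of the long side of rectangle R_i,
    given its four vertices in order."""
    rect = np.asarray(rect, dtype=float)
    e1, e2 = rect[1] - rect[0], rect[2] - rect[1]
    long_edge = e1 if np.linalg.norm(e1) >= np.linalg.norm(e2) else e2
    return math.atan2(long_edge[1], long_edge[0])

def rotate_points(points, theta):
    """Affine rotation by -theta: afterwards the long side lies horizontal,
    so the spine can be cropped with an axis-aligned box."""
    c, s = math.cos(-theta), math.sin(-theta)
    rot = np.array([[c, -s], [s, c]])
    return np.asarray(points, dtype=float) @ rot.T

# A 100x20 spine rectangle tilted by 30 degrees becomes axis-aligned.
theta = math.radians(30.0)
c, s = math.cos(theta), math.sin(theta)
tilted = [(0, 0), (100 * c, 100 * s),
          (100 * c - 20 * s, 100 * s + 20 * c), (-20 * s, 20 * c)]
upright = rotate_points(tilted, long_side_angle(tilted))
```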
In the book identification system based on spine visual information, the construction process of the deep convolutional neural network model for spine classification in the second training module includes: building a multi-layer deep convolutional neural network from residual modules as the feature extraction network m_2, and appending to the end of m_2 a fully connected classification layer (classifier) that uses an additive angular margin loss function, yielding the structure of the deep convolutional neural network model for spine classification;
the second training module includes: training a model M_2 = m_2 + classifier on the spine classification data set following the standard classification paradigm: the input is a spine picture scaled to a fixed size, and the training target is the label of the input spine picture; after training, the feature map F_i output by the feature extraction network m_2 inside M_2 serves as the visual feature vector of the spine.
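The additive angular margin loss can be illustrated with a small numpy sketch of its logit adjustment (an ArcFace-style margin; the scale s and margin m values here are illustrative defaults, not taken from the patent):

```python
import numpy as np

def additive_angular_margin_logits(features, class_weights, labels, s=64.0, m=0.5):
    """Compute classifier logits where the true class's angle theta is widened
    to theta + m before re-taking the cosine, then scaled by s. Softmax
    cross-entropy over these logits gives the training loss."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    cos = np.clip(f @ w.T, -1.0, 1.0)        # cosine to every class centre
    rows = np.arange(len(labels))
    theta = np.arccos(cos[rows, labels])     # angle to the true class
    logits = cos.copy()
    logits[rows, labels] = np.cos(theta + m)  # additive angular margin
    return s * logits

# One feature perfectly aligned with class 0: its logit is penalised
# from s*cos(0) down to s*cos(m), forcing a tighter angular cluster.
feats = np.array([[1.0, 0.0]])
weights = np.array([[1.0, 0.0], [0.0, 1.0]])
out = additive_angular_margin_logits(feats, weights, np.array([0]))
```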
In the book identification system based on spine visual information, the identification module includes: feeding the to-be-identified spine picture into the spine segmentation model to obtain the spine pictures BE_i of all books in it; measuring the similarity between two spine visual feature vectors with cosine similarity during identification; computing the visual representation F_i of each spine picture BE_i with the spine feature extraction model m_2 and performing a nearest-neighbor search against the spine visual database to obtain the several spine category IDs in the database most similar to the target spine picture, the category ID with the highest similarity being taken as the final identification result.
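The matching step reduces to a cosine-similarity nearest-neighbor search over the spine visual database; a minimal numpy sketch (function and variable names are illustrative, not from the patent):

```python
import numpy as np

def top_k_categories(query_vec, db_vecs, db_ids, k=5):
    """Return the k category ids whose database vectors have the highest
    cosine similarity to the query spine's visual feature vector."""
    q = query_vec / np.linalg.norm(query_vec)
    db = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    sims = db @ q                  # cosine similarity to every database entry
    order = np.argsort(-sims)[:k]  # most similar first
    return [db_ids[j] for j in order]

# Toy 2-D "spine visual database" with three known books.
db = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
ids = ["book-A", "book-B", "book-C"]
best = top_k_categories(np.array([0.9, 0.1]), db, ids, k=2)
```

The first returned id is the final identification result; returning k > 1 candidates also supports manual review of near-duplicate spines.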
Specific application scenarios of the invention include the following:
1. When a reader borrows a specific book, even after retrieving the bookshelf position, the reader must still search for the target book among the many compartments of the shelf. This application helps readers quickly identify the target book among cluttered compartments.
2. After readers return books, the books must be reshelved for the next borrower, and each returned book must be placed back in its correct shelf position. This application can photograph a whole row of books at once, identify them, and directly output the correct bookshelf position of every book.
3. Because readers may misshelve books after reading them, or for other reasons, the library must check during routine inspection whether each book is in its correct shelf position. This workload is enormous and nearly impossible to handle manually; this application enables fast and accurate book inspection.
4. By deploying the algorithm of this application on a mobile robot platform equipped with a robotic arm, the entire library-management workflow can be made unmanned, from borrowing to returning and from inspection to shelving. The technique gives the robot accurate perception of books; combined with the manipulation capability of the arm, machines can truly replace manual labor.

Claims (10)

1. A book identification method based on spine visual information is characterized by comprising the following steps:
step 1, obtaining spine pictures annotated with spine segmentation as a training set, training a deep convolutional neural network model for spine segmentation on the training set to obtain a spine segmentation model, and performing instance segmentation on collected pictures of books on the shelf with the spine segmentation model to obtain a plurality of spine pictures;
step 2, marking book categories for each spine picture to construct a spine classification data set, training a deep convolutional neural network model for spine classification through the spine classification data set to obtain a spine feature extraction model, extracting spine visual features of each book in a book database by using the spine feature extraction model, and integrating the spine visual features to construct a spine visual database;
step 3, inputting the to-be-identified spine picture containing a plurality of spines into the spine segmentation model for instance segmentation, inputting the segmentation results into the spine feature extraction model to obtain the visual feature vectors of the spines in the to-be-identified spine picture, and matching the visual feature vectors against the database to identify the book category of each spine in the to-be-identified spine picture.
2. The book identification method based on spine visual information as claimed in claim 1, wherein step 1 comprises a data set construction step: using an image capture device to photograph books on the shelf from multiple angles, and determining four coordinate points (x_N, y_N)_i, N ∈ [1, 4], for each spine region in the captured images to form a closed quadrilateral b_i that frames the spine, thereby annotating the spine segmentation.
3. The book identification method based on spine visual information as claimed in claim 1, wherein step 2 comprises a book category labeling step: obtaining all spine regions B_i in the spine pictures; computing, for each spine region B_i, its minimum bounding rectangle R_i, the four vertices (X_N, Y_N)_i, N ∈ [1, 4], of R_i, and the inclination angle θ_i of the long side of R_i; applying an affine transformation that rotates the original image by θ_i, then cropping according to (X_N, Y_N)_i to obtain a rectified spine picture BE_i; and manually labeling each spine picture BE_i with a category label, where spine pictures of the same book share the same label.
4. The book identification method based on spine visual information as claimed in claim 1, wherein the construction method of the deep convolutional neural network model for spine classification in step 2 comprises: building a multi-layer deep convolutional neural network from residual modules as the feature extraction network m_2, and appending to the end of m_2 a fully connected classification layer (classifier) that uses an additive angular margin loss function, yielding the structure of the deep convolutional neural network model for spine classification;
step 2 further comprises training a model M_2 = m_2 + classifier on the spine classification data set following the standard classification paradigm: the input is a spine picture scaled to a fixed size, and the training target is the label of the input spine picture; after training, the feature map F_i output by the feature extraction network m_2 inside M_2 serves as the visual feature vector of the spine.
5. The book identification method based on spine visual information as claimed in claim 4, wherein step 3 comprises: feeding the to-be-identified spine picture into the spine segmentation model to obtain the spine pictures BE_i of all books in it; measuring, during identification, the similarity between two spine visual feature vectors F_a = [a_1, a_2, …, a_512] and F_b = [b_1, b_2, …, b_512] with cosine similarity; computing the visual representation F_i of each spine picture BE_i with the spine feature extraction model m_2 and performing a nearest-neighbor search against the spine visual database to obtain the several spine category IDs in the database most similar to the target spine picture, the category ID with the highest similarity being taken as the final identification result.
6. A book identification system based on spine visual information, comprising:
the system comprises a first training module, a second training module and a third training module, wherein the first training module is used for acquiring book spine pictures marked with book spine segmentation as a training set, training a deep convolution neural network model for segmenting book spines through the training set to obtain a book spine segmentation model, and using the book spine segmentation model to perform example segmentation on collected book pictures on a shelf to obtain a plurality of book spine pictures;
the second training module is used for labeling each spine picture with a book category to construct a spine classification data set, training a deep convolutional neural network model for spine classification on this data set to obtain a spine feature extraction model, extracting the spine visual features of each book in a book database with the spine feature extraction model, and integrating these features into a spine visual database;
and the identification module is used for inputting a to-be-identified spine picture containing a plurality of spines into the spine segmentation model for instance segmentation, feeding the segmentation results into the spine feature extraction model to obtain the visual feature vector of each spine in the picture, and matching these visual feature vectors against the spine visual database to identify the book category of each spine in the to-be-identified spine picture.
7. The book identification system based on spine visual information as claimed in claim 6, wherein the first training module includes: using an image capture device to photograph books on the shelf from multiple angles, and determining four coordinate points (x_N, y_N)_i, N ∈ [1, 4], for each spine region in the captured images to form a closed quadrilateral b_i that frames the spine, thereby annotating the spine segmentation.
8. The book identification system based on spine visual information as claimed in claim 6, wherein the second training module includes: obtaining all spine regions B_i in the spine pictures; computing, for each spine region B_i, its minimum bounding rectangle R_i, the four vertices (X_N, Y_N)_i, N ∈ [1, 4], of R_i, and the inclination angle θ_i of the long side of R_i; applying an affine transformation that rotates the original image by θ_i, then cropping according to (X_N, Y_N)_i to obtain a rectified spine picture BE_i; and manually labeling each spine picture BE_i with a category label, where spine pictures of the same book share the same label.
9. The book identification system based on spine visual information as claimed in claim 6, wherein the building process of the deep convolutional neural network model for spine classification in the second training module comprises: building a multi-layer deep convolutional neural network from residual modules as the feature extraction network m_2, and appending to the end of m_2 a fully connected classification layer (classifier) that uses an additive angular margin loss function, yielding the structure of the deep convolutional neural network model for spine classification;
the second training module further includes: training a model M_2 = m_2 + classifier on the spine classification data set following the standard classification paradigm: the input is a spine picture scaled to a fixed size, and the training target is the label of the input spine picture; after training, the feature map F_i output by the feature extraction network m_2 inside M_2 serves as the visual feature vector of the spine.
10. The book identification system based on spine visual information as claimed in claim 9, wherein the identification module includes: feeding the to-be-identified spine picture into the spine segmentation model to obtain the spine pictures BE_i of all books in it; measuring the similarity between two spine visual feature vectors with cosine similarity during identification; computing the visual representation F_i of each spine picture BE_i with the spine feature extraction model m_2 and performing a nearest-neighbor search against the spine visual database to obtain the several spine category IDs in the database most similar to the target spine picture, the category ID with the highest similarity being taken as the final identification result.
CN202011383651.5A 2020-12-01 2020-12-01 Book identification method and system based on spine visual information Pending CN112560902A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011383651.5A CN112560902A (en) 2020-12-01 2020-12-01 Book identification method and system based on spine visual information


Publications (1)

Publication Number Publication Date
CN112560902A true CN112560902A (en) 2021-03-26

Family

ID=75046013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011383651.5A Pending CN112560902A (en) 2020-12-01 2020-12-01 Book identification method and system based on spine visual information

Country Status (1)

Country Link
CN (1) CN112560902A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113487598A (en) * 2021-07-26 2021-10-08 中国科学院国家空间科学中心 Bookbinding error detection system based on computer vision
CN114882483A (en) * 2022-04-01 2022-08-09 南京大学 A book inventory method based on computer vision
CN115471830A (en) * 2021-06-10 2022-12-13 南京大学 A Computer Vision-Based Library Collection Method
CN116469121A (en) * 2023-05-25 2023-07-21 深圳市星桐科技有限公司 Learning object recognition method, device, equipment and storage medium
CN117591695A (en) * 2023-11-27 2024-02-23 深圳市海恒智能股份有限公司 An intelligent book retrieval system based on visual representation

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104966081A (en) * 2015-06-04 2015-10-07 广州美读信息技术有限公司 Spine image recognition method
US20150371085A1 (en) * 2014-06-19 2015-12-24 Bitlit Media Inc. Method and system for identifying books on a bookshelf
CN108664996A (en) * 2018-04-19 2018-10-16 厦门大学 A kind of ancient writing recognition methods and system based on deep learning
CN110929746A (en) * 2019-05-24 2020-03-27 南京大学 Electronic file title positioning, extracting and classifying method based on deep neural network
CN111368856A (en) * 2020-03-16 2020-07-03 广东技术师范大学 Spine extraction method and device of book checking system based on vision
CN111460185A (en) * 2020-03-30 2020-07-28 小船出海教育科技(北京)有限公司 Book searching method, device and system
CN111667639A (en) * 2020-05-28 2020-09-15 北京每日优鲜电子商务有限公司 Book return service realization method and device and intelligent book cabinet


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JIANKANG DENG et al.: "ArcFace: Additive Angular Margin Loss for Deep Face Recognition", arXiv *
SHUO ZHOU et al.: "Library on-shelf book segmentation and recognition based on deep visual features", Information Processing and Management 59 (2022) *
CUI Chen: "Research and Implementation of Key Technologies for Image-Based Book Spine Detection and Recognition", China Master's Theses Full-text Database, Information Science and Technology *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210326