Disclosure of Invention
The invention provides a low-cost, fast, accurate, and automatic spine recognition method, which solves the problem of book recognition on library shelves and in similar scenes.
Aiming at the defects of the prior art, the invention provides a book identification method based on spine visual information, which comprises the following steps:
step 1, obtaining book spine pictures marked with spine segmentation as a training set, training a deep convolutional neural network model for segmenting the spine through the training set to obtain a spine segmentation model, and performing instance segmentation on the collected book pictures on a shelf by using the spine segmentation model to obtain a plurality of spine pictures;
step 2, marking book categories for each spine picture to construct a spine classification data set, training a deep convolutional neural network model for spine classification through the spine classification data set to obtain a spine feature extraction model, extracting spine visual features of each book in a book database by using the spine feature extraction model, and integrating the spine visual features to construct a spine visual database;
and step 3, inputting the to-be-recognized spine picture containing a plurality of spines into the spine segmentation model for instance segmentation, inputting the segmentation result into the spine feature extraction model to obtain the visual feature vectors of the spines in the to-be-recognized spine picture, and matching the visual feature vectors with the database to recognize the book categories of the spines in the to-be-recognized spine picture.
The book identification method based on the spine visual information comprises, in the step 1, a data set construction step: books on a shelf are photographed from multiple angles using a picture acquisition device, and for each spine area in the shooting result, four coordinate points (x_N, y_N)_i, N∈[1,4], are determined to form a closed quadrangle b_i, which is box-selected to mark the spine for segmentation.
The book identification method based on the spine visual information, wherein the step 2 comprises a book category labeling step: all spine regions B_i in the book spine pictures are obtained; the minimum circumscribed rectangle R_i of each spine region B_i is obtained, together with its four vertices (X_N, Y_N)_i, N∈[1,4], and the inclination angle θ_i of the long side of R_i; the original image is rotated by θ_i via an affine transformation and then cropped according to (X_N, Y_N)_i to obtain a regular spine picture BE_i; the spine pictures BE_i are manually labeled with category labels, and spine pictures of the same book have the same label.
The book identification method based on the spine visual information, wherein the construction method of the deep convolutional neural network model for spine classification in the step 2 comprises: a multi-layer deep convolutional neural network is constructed from residual modules as the feature extraction network m_2, and a fully connected classification layer (classifier) using an additive angular margin loss function is appended to the end of m_2 to obtain the structure of the deep convolutional neural network model for spine classification;
the step 2 includes training a model M_2 = m_2 + classifier according to the paradigm of a classification task using the spine classification data set: the input is a spine picture scaled to a fixed size, and the training target is the label to which the spine picture belongs; after M_2 is trained, the feature map F_i output by the feature extraction network m_2 in the model is taken as the visual feature vector of the spine.
The book identification method based on the spine visual information, wherein the step 3 comprises sending the spine picture to be identified into the spine segmentation model for processing to obtain the spine pictures BE_i of all books in the picture to be identified; in the identification process, cosine similarity is used to measure the similarity between two spine visual characterization vectors F_a = [a_1, a_2, …, a_512] and F_b = [b_1, b_2, …, b_512]; the spine feature extraction model m_2 calculates the visual characterization F_i of each spine picture BE_i and performs a nearest-neighbor search against the data in the spine visual database to obtain the several spine category ids with the highest similarity to the target spine picture in the spine visual database, the category id with the highest similarity being taken as the final identification result.
The invention also provides a book identification system based on the spine visual information, which comprises:
the system comprises a first training module, a second training module and an identification module, wherein the first training module is used for obtaining book spine pictures marked with book spine segmentation as a training set, training a deep convolutional neural network model for segmenting book spines through the training set to obtain a book spine segmentation model, and using the book spine segmentation model to perform instance segmentation on collected book pictures on a shelf to obtain a plurality of book spine pictures;
the second training module is used for marking book categories for each spine picture, constructing a spine classification data set, training a deep convolutional neural network model for spine classification through the spine classification data set to obtain a spine feature extraction model, extracting the spine visual features of each book in a book database by using the spine feature extraction model, and integrating the spine visual features to construct a spine visual database;
and the identification module is used for inputting the to-be-identified book spine picture containing a plurality of book spines into the book spine segmentation model for instance segmentation, inputting the segmentation result into the book spine feature extraction model to obtain the visual feature vector of each book spine in the to-be-identified book spine picture, and matching the visual feature vector with the database to identify the book category of each book spine in the to-be-identified book spine picture.
The book identification system based on the spine visual information, wherein the first training module comprises: books on the shelf are photographed from multiple angles using a picture acquisition device, and for each spine area in the shooting result, four coordinate points (x_N, y_N)_i, N∈[1,4], are determined to form a closed quadrangle b_i, which is box-selected to mark the spine segmentation.
The book identification system based on the spine visual information, wherein the second training module comprises: all spine regions B_i in the book spine pictures are obtained; the minimum circumscribed rectangle R_i of each spine region B_i is obtained, together with its four vertices (X_N, Y_N)_i, N∈[1,4], and the inclination angle θ_i of the long side of R_i; the original image is rotated by θ_i via an affine transformation and then cropped according to (X_N, Y_N)_i to obtain a regular spine picture BE_i; the spine pictures BE_i are manually labeled with category labels, and spine pictures of the same book have the same label.
The book identification system based on the spine visual information, wherein the construction process of the deep convolutional neural network model for spine classification in the second training module comprises: a multi-layer deep convolutional neural network is constructed from residual modules as the feature extraction network m_2, and a fully connected classification layer (classifier) using an additive angular margin loss function is appended to the end of m_2 to obtain the structure of the deep convolutional neural network model for spine classification;
the second training module includes: training a model M_2 = m_2 + classifier according to the paradigm of a classification task using the spine classification data set: the input is a spine picture scaled to a fixed size, and the training target is the label to which the spine picture belongs; after M_2 is trained, the feature map F_i output by the feature extraction network m_2 in the model is taken as the visual feature vector of the spine.
The book identification system based on the spine visual information, wherein the identification module is used for sending the spine picture to be identified into the spine segmentation model for processing to obtain the spine pictures BE_i of all books in the picture to be identified; in the identification process, cosine similarity is used to measure the similarity between two spine visual characterization vectors; the spine feature extraction model m_2 calculates the visual characterization F_i of each spine picture BE_i and performs a nearest-neighbor search against the data in the spine visual database to obtain the several spine category ids with the highest similarity to the target spine picture in the spine visual database, the category id with the highest similarity serving as the final identification result.
According to the scheme, the invention has the advantages that:
the book identification is a core step of most book management work, the technology of the application can automate the step under the conditions of low cost and high precision, thereby greatly reducing the manpower and finally achieving the purpose of replacing manual book arrangement by a machine. The method identifies the spine pictures of the book based on the deep learning algorithm, does not need to configure complicated hardware facilities, and ensures low cost; all visual information of the spine target is utilized, the method is not limited by a dictionary set on which a character recognition method depends, newly added books are supported in the collection of the books, and the method has higher accuracy rate and better robustness and expandability; according to different application requirements, the spine pictures of a single spine or a series of books on the shelf can be identified individually or in batches, and the high efficiency of book identification is ensured.
Detailed Description
Aiming at the book identification problem on a library shelf or other scenes, spine pictures are identified so as to determine the category of the spine pictures. The method mainly comprises the following steps: 1) collecting book spine pictures of books on a library shelf, and manually marking the pictures to construct a spine segmentation and spine classification data set; 2) constructing a convolutional neural network for extracting the depth features of the spine image, and training by using training data to obtain a feature extraction model; 3) in the testing process, a picture of one side of the spine of a book on the shelf can be shot, the spine is subjected to instance segmentation, then a trained model is utilized to obtain visual feature vectors corresponding to the spine picture, and then the spine picture is matched with a library database to identify the class of the book corresponding to the spine.
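The three stages above chain together as segment, then extract, then match. A minimal sketch of that pipeline is given below; `segment`, `extract`, and the toy database are hypothetical stand-ins for the trained models and library data, not names from the disclosure.

```python
import numpy as np

def recognize_row(image, segment, extract, database):
    """Pipeline sketch: instance-segment the spines in an on-shelf picture,
    extract one visual feature vector per spine, then match each vector
    against the library database by cosine similarity."""
    crops = segment(image)                      # rectified spine crops
    results = []
    for crop in crops:
        f = extract(crop)                       # e.g. a 512-d characterization
        f = f / np.linalg.norm(f)
        best_id, best_sim = None, -1.0
        for book_id, g in database.items():
            sim = float(f @ (g / np.linalg.norm(g)))
            if sim > best_sim:
                best_id, best_sim = book_id, sim
        results.append(best_id)
    return results

# toy stand-ins for the trained models and database
rng = np.random.default_rng(0)
db = {"book_a": rng.normal(size=512), "book_b": rng.normal(size=512)}
segment = lambda img: [img]                     # pretend one spine was found
extract = lambda crop: db["book_b"] + 0.01 * rng.normal(size=512)
print(recognize_row(np.zeros((8, 8)), segment, extract, db))  # → ['book_b']
```

The noisy query vector still matches its source entry because cosine similarity is insensitive to the small perturbation relative to two independent 512-dimensional vectors.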
In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below. In order to achieve the above object, the present invention provides a spine recognition method based on deep convolutional neural network as shown in fig. 1, including the following steps:
1. Training the spine segmentation model. First, in the real environment of a library, a large number of pictures of books on shelves are collected. Then, a portion of the collected on-shelf book pictures are manually annotated to construct the spine segmentation data set; as an instance segmentation task, a deep convolutional neural network model that segments the spine is designed, and the spine segmentation model is trained end-to-end using the spine segmentation data set.
2. Training the spine classification model. Instance segmentation is performed on the collected on-shelf book pictures using the trained spine segmentation model. The segmented spine pictures are manually annotated with class ids to construct the spine classification data set; a deep convolutional neural network model for spine classification is designed. As a classification task, the model is trained end-to-end using the spine classification data set, and the spine feature extraction network is derived from the trained model.
3. Spine recognition. First, the trained spine feature extraction model computes the spine visual representation of each book in the library database, and the representations are added to and stored in the database. When identifying books on a shelf, a picture is taken of the spine side of a row of target books, the spine segmentation model automatically segments all spine regions, the spine feature extraction model calculates the visual representation of each spine region, and finally a nearest-neighbor search over the library database entries with the spine visual representations determines the library information corresponding to each target spine.
The invention is a software algorithm solution to the spine classification and book identification problems: it requires no installation or configuration of a complex hardware system, and it replaces manpower in the key steps of book identification, greatly reducing labor cost. In the identification process, all visual features of the spine region of the target picture are utilized, rather than character information alone, so the method can identify spines in any language or artistic design and better resists factors such as ambient light changes and book wear. Because feature vector matching determines the recognition result, the dependency of character recognition methods on a dictionary set is eliminated, and newly added library books are conveniently supported.
Training of spine instance segmentation model
1) Constructing the spine segmentation data set. In a real library scene, books on the shelf are photographed with an RGB picture acquisition device. To obtain different pictures of the same spine after segmentation, each book on the bookshelf is photographed from three different angles (as shown in fig. 2), keeping as much of each book as possible within the shooting range while ensuring the picture is clear. In this embodiment, about 300 on-shelf books are collected, and the original picture resolution is 1080 × 1920. 90 on-shelf book pictures are manually annotated: for each spine region in a picture, four coordinate points (x_N, y_N)_i, N∈[1,4], are determined to form a closed quadrangle b_i, which is box-annotated (as in fig. 3) to construct the spine segmentation data set. 80% of the pictures are used as the training set and 20% as the test set.
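One plain-Python way to represent a four-point annotation b_i is sketched below; the record fields and file name are illustrative, not a format prescribed by the disclosure. The shoelace formula gives the area of the closed quadrangle, a useful sanity check that the four points are ordered and non-degenerate.

```python
def shoelace_area(points):
    """Area of a closed polygon given as [(x, y), ...] via the shoelace formula."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

# one annotation record b_i: four (x_N, y_N) points, N in [1, 4]
annotation = {
    "image": "shelf_001.jpg",          # hypothetical file name
    "spine_polygon": [(120, 40), (160, 42), (158, 900), (118, 898)],
}
assert shoelace_area(annotation["spine_polygon"]) > 0  # non-degenerate quadrangle
```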
Fig. 3: manually labeled spine regions (white quadrangles are the annotation boxes).
2) Training the spine segmentation model. The instance segmentation task in computer vision is to not only detect the position of an object in a picture but also segment the object from the background at the pixel level. The spine segmentation task can be realized with a mature instance segmentation model (such as the Mask R-CNN framework). The spine segmentation model is trained end-to-end using the spine segmentation data set: the original on-shelf book images and the corresponding spine box annotations are input, and after training the model segments and outputs all spine regions (as shown in fig. 4).
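Instance-segmentation training in a Mask R-CNN-style framework consumes per-pixel masks, so the quadrangle labels must first be rasterized. A minimal NumPy sketch using even-odd ray casting at pixel centers is shown below; `quad_to_mask` is an illustrative helper, not code from the disclosure.

```python
import numpy as np

def quad_to_mask(quad, height, width):
    """Rasterize a closed polygon [(x, y), ...] into a boolean pixel mask
    using the even-odd (ray-casting) rule, evaluated at pixel centers."""
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    inside = np.zeros((height, width), dtype=bool)
    n = len(quad)
    for i in range(n):
        x1, y1 = quad[i]
        x2, y2 = quad[(i + 1) % n]
        # toggle pixels whose leftward horizontal ray crosses this edge
        crosses = (y1 > ys) != (y2 > ys)
        x_at_y = (x2 - x1) * (ys - y1) / (y2 - y1 + 1e-12) + x1
        inside ^= crosses & (xs < x_at_y)
    return inside

# an axis-aligned quadrangle label on a small 12x12 test image
mask = quad_to_mask([(2, 2), (9, 2), (9, 9), (2, 9)], 12, 12)
```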
3) Other possible embodiments. In this step, the on-shelf book pictures may be collected in an archive or other similar scene, and the same bookshelf may be photographed from a different number of viewing angles; to extract the spine regions from the on-shelf book pictures, the spine instance segmentation model may also be implemented with other architectures, such as PolarMask, SOLO, BlendMask, and the like.
Training of spine classification models
1) Acquiring spine pictures and constructing the spine classification data set. After the spine segmentation model M_1 is trained, instance segmentation is performed on all collected on-shelf book pictures to obtain all spine regions B_i in the pictures. Because each model output B_i is an irregular region composed of the pixels classified as book in the picture, the minimum circumscribed rectangle R_i of B_i is computed, together with its four vertices (X_N, Y_N)_i, N∈[1,4], and the inclination angle θ_i of the long side of R_i; the original image is rotated by θ_i via an affine transformation and then cropped according to (X_N, Y_N)_i to obtain a regular spine picture BE_i (see fig. 5). The spine pictures are manually labeled with category labels, so that spine pictures of the same book have the same label.
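The geometric core of the rectification step is a 2-D rotation by the inclination angle θ_i about the rectangle's center; an image library would apply the same matrix to the pixels before cropping at the rotated vertices. A sketch with a hypothetical helper, using an upright 40 × 300 rectangle tilted by 30° as the test case:

```python
import numpy as np

def rotate_points(points, center, theta):
    """Rotate (x, y) points by theta radians about center."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    pts = np.asarray(points, dtype=float) - center
    return pts @ rot.T + center

# a spine's minimum circumscribed rectangle R_i, tilted by theta_i = 30 degrees
theta_i = np.deg2rad(30.0)
upright = np.array([(0.0, 0.0), (40.0, 0.0), (40.0, 300.0), (0.0, 300.0)])
center = upright.mean(axis=0)
tilted = rotate_points(upright, center, theta_i)

# rotating back by -theta_i makes the rectangle axis-aligned again,
# after which cropping reduces to slicing its bounding box
recovered = rotate_points(tilted, center, -theta_i)
```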
2) Extracting the visual representation of spine pictures. An 18-layer deep convolutional neural network is constructed from residual modules as the feature extraction network m_2, and a fully connected classification layer (classifier) using an additive angular margin loss function (see formula 1) is appended at the end (see fig. 6). The model M_2 = m_2 + classifier is trained according to the classification task paradigm using the spine classification data set: the input is a spine picture scaled to a fixed size (800 × 80), and the training target is the correct label (i.e., the class id) to which the spine picture belongs. After M_2 is trained, the feature map F_i output by m_2 in the model is taken as the visual representation of the spine.
where N is the number of samples in the mini-batch, s and m are hyper-parameters of the method, y_i is the ground-truth category of sample i, n is the number of categories, and θ is the angle between the weight vector and the feature vector in the model computation.
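In the notation of the symbol list above, the additive angular margin loss referred to as formula 1 follows the standard ArcFace construction: the target-class logit cos θ_{y_i} is replaced by cos(θ_{y_i} + m) before scaling by s and applying softmax cross-entropy. A NumPy sketch is given below (illustrative, not the disclosure's code; the small s in the example is chosen only to keep the toy losses away from floating-point underflow):

```python
import numpy as np

def additive_angular_margin_loss(features, weights, labels, s=64.0, m=0.5):
    """ArcFace-style loss: L2-normalize features and class weights,
    add margin m to the target-class angle, scale by s, then
    average softmax cross-entropy over the mini-batch."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = np.clip(f @ w, -1.0, 1.0)                 # cos(theta), shape (N, n)
    idx = np.arange(len(labels))
    logits = cos.copy()
    logits[idx, labels] = np.cos(np.arccos(cos[idx, labels]) + m)
    logits *= s
    z = logits - logits.max(axis=1, keepdims=True)  # stable log-softmax
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_p[idx, labels].mean()

# with features perfectly aligned to their class weights, the margin
# still leaves a positive loss, which shrinks as m -> 0
features = np.eye(4)                 # 4 samples, 4-dim features
weights = np.eye(4)                  # columns are the 4 class weight vectors
labels = np.array([0, 1, 2, 3])
loss_m = additive_angular_margin_loss(features, weights, labels, s=4.0, m=0.5)
loss_0 = additive_angular_margin_loss(features, weights, labels, s=4.0, m=0.0)
```

The margin penalizes even perfectly classified samples, which is what pushes same-class features into a tighter angular cluster than plain softmax would.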
3) Other possible embodiments. In this step, the spine classification model may be composed of more layers of residual modules, or constructed from other classical feature extraction networks such as VGG or Inception, or from other self-designed deep convolutional networks; the dimension of the feature vector finally taken for a single spine picture may also vary.
Book identification
1) The feature extraction network m_2 calculates the visual representations of all spines in the library; in this example, the visual representation F_i of each book is a 512-dimensional vector. All vectors are stored in a single file Dict and saved in the library database so that they can be read in at once during retrieval.
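Storing all 512-dimensional representations in one file can be sketched as below; `np.savez` and the isbn-style ids are illustrative choices, since the disclosure only requires a single file Dict holding every vector for one-shot loading.

```python
import os
import tempfile
import numpy as np

def build_spine_dict(spine_features, path):
    """Save {book_id: 512-d characterization} as one .npz file for
    one-shot loading at retrieval time."""
    np.savez(path, **{str(k): v for k, v in spine_features.items()})

def load_spine_dict(path):
    """Read the whole spine dictionary back into memory at once."""
    with np.load(path) as data:
        return {k: data[k] for k in data.files}

rng = np.random.default_rng(1)
features = {"isbn_0001": rng.normal(size=512), "isbn_0002": rng.normal(size=512)}
path = os.path.join(tempfile.gettempdir(), "spine_dict.npz")
build_spine_dict(features, path)
restored = load_spine_dict(path)
```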
2) To identify the class ids of a row of target books, a picture is first taken of the spine side and fed into the spine segmentation model M_1 for processing to obtain the spine pictures BE_i of all books in the picture. In the identification process, cosine similarity (formula 2) is used to measure the similarity between two spine visual characterization vectors F_a = [a_1, a_2, …, a_512] and F_b = [b_1, b_2, …, b_512], where F_a is the visual characterization vector of a spine in the picture to be recognized and F_b is a visual characterization vector in the spine visual database. The spine feature extraction model m_2 calculates the visual characterization F_i of each spine picture BE_i and performs a nearest-neighbor search against Dict in the library database to obtain the 5 spine category ids (top5) with the highest similarity to the target spine picture in the database; the id with the highest similarity is taken as the final identification result.
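Cosine similarity (formula 2) and the top-5 nearest-neighbor lookup can be sketched as follows; the code and the toy database are illustrative, with `k=5` matching the top5 retrieval described above.

```python
import numpy as np

def cosine_similarity(fa, fb):
    """cos(F_a, F_b) = F_a . F_b / (|F_a| |F_b|)  -- formula 2."""
    return float(fa @ fb / (np.linalg.norm(fa) * np.linalg.norm(fb)))

def top_k_ids(query, spine_dict, k=5):
    """Return the k category ids most similar to the query vector."""
    scored = sorted(spine_dict.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [book_id for book_id, _ in scored[:k]]

rng = np.random.default_rng(2)
db = {f"id_{i}": rng.normal(size=512) for i in range(10)}
query = 0.9 * db["id_7"] + 0.05 * rng.normal(size=512)  # noisy view of id_7
print(top_k_ids(query, db)[0])  # → id_7
```

Because cosine similarity ignores vector magnitude, the 0.9 rescaling has no effect on the ranking; only the small additive noise could perturb it, and it is far too weak relative to two independent 512-dimensional vectors.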
3) Other possible embodiments. Other loss functions may be used when training the spine classification network; when the feature extraction network computes the library database, one file may instead be stored per book's visual representation vector, with the files read in and matched in a loop during retrieval; when performing the feature-vector nearest-neighbor search, other criteria may be used to evaluate the similarity between vectors, such as Euclidean distance or other distance measures.
In this embodiment, a target database (probe) containing 5580 spine pictures to be recognized and a test database (gallery) containing 3700 collected spine pictures are constructed by simulation. The spine pictures in probe are traversed, a nearest-neighbor search is performed against the visual representation Dict of gallery, and the entry with the greatest similarity is taken as the final class id identification result. Statistical analysis shows that the book category id identification accuracy reaches 99.32%. Most matching errors occur because books in the same series have overly similar spines; considering that books of the same series are generally shelved in the same area, the bookshelf position judgment accuracy reaches 99.93%, which is sufficient for shelving and unshelving requirements.
The following are system examples corresponding to the above method examples, and this embodiment can be implemented in cooperation with the above embodiments. The related technical details mentioned in the above embodiments are still valid in this embodiment, and are not described herein again in order to reduce repetition. Accordingly, the related-art details mentioned in the present embodiment can also be applied to the above-described embodiments.
The invention also provides a book identification system based on the spine visual information, which comprises the following steps:
the system comprises a first training module, a second training module and an identification module, wherein the first training module is used for obtaining book spine pictures marked with book spine segmentation as a training set, training a deep convolutional neural network model for segmenting book spines through the training set to obtain a book spine segmentation model, and using the book spine segmentation model to perform instance segmentation on collected book pictures on a shelf to obtain a plurality of book spine pictures;
the second training module is used for marking book categories for each spine picture, constructing a spine classification data set, training a deep convolutional neural network model for spine classification through the spine classification data set to obtain a spine feature extraction model, extracting the spine visual features of each book in a book database by using the spine feature extraction model, and integrating the spine visual features to construct a spine visual database;
and the identification module is used for inputting the to-be-identified book spine picture containing a plurality of book spines into the book spine segmentation model for instance segmentation, inputting the segmentation result into the book spine feature extraction model to obtain the visual feature vector of each book spine in the to-be-identified book spine picture, and matching the visual feature vector with the database to identify the book category of each book spine in the to-be-identified book spine picture.
The book identification system based on the spine visual information, wherein the first training module comprises: books on the shelf are photographed from multiple angles using a picture acquisition device, and for each spine area in the shooting result, four coordinate points (x_N, y_N)_i, N∈[1,4], are determined to form a closed quadrangle b_i, which is box-selected to mark the spine segmentation.
The book identification system based on the spine visual information, wherein the second training module comprises: all spine regions B_i in the book spine pictures are obtained; the minimum circumscribed rectangle R_i of each spine region B_i is obtained, together with its four vertices (X_N, Y_N)_i, N∈[1,4], and the inclination angle θ_i of the long side of R_i; the original image is rotated by θ_i via an affine transformation and then cropped according to (X_N, Y_N)_i to obtain a regular spine picture BE_i; the spine pictures BE_i are manually labeled with category labels, and spine pictures of the same book have the same label.
The book identification system based on the spine visual information, wherein the construction process of the deep convolutional neural network model for spine classification in the second training module comprises: a multi-layer deep convolutional neural network is constructed from residual modules as the feature extraction network m_2, and a fully connected classification layer (classifier) using an additive angular margin loss function is appended to the end of m_2 to obtain the structure of the deep convolutional neural network model for spine classification;
the second training module includes: training a model M_2 = m_2 + classifier according to the paradigm of a classification task using the spine classification data set: the input is a spine picture scaled to a fixed size, and the training target is the label to which the spine picture belongs; after M_2 is trained, the feature map F_i output by the feature extraction network m_2 in the model is taken as the visual feature vector of the spine.
The book identification system based on the spine visual information, wherein the identification module is used for sending the spine picture to be identified into the spine segmentation model for processing to obtain the spine pictures BE_i of all books in the picture to be identified; in the identification process, cosine similarity is used to measure the similarity between two spine visual characterization vectors; the spine feature extraction model m_2 calculates the visual characterization F_i of each spine picture BE_i and performs a nearest-neighbor search against the data in the spine visual database to obtain the several spine category ids with the highest similarity to the target spine picture in the spine visual database, the category id with the highest similarity serving as the final identification result.
The specific scenarios of the invention can be as follows:
1. When a reader borrows a specific book, even if the bookshelf position has been retrieved, the reader still has to search for the target book among the many compartments of the bookshelf. The application can help readers quickly identify the target book among crowded compartments.
2. After readers return books, the books need to be reshelved for the next borrowing, and each book's shelf position must be determined when it is put back. The application can photograph a whole row of books at once, identify them, and directly output the bookshelf position of every book.
3. Readers may misplace books after reading them, so during routine library inspection it must be checked whether each book is in the correct bookshelf position. The workload of this task is enormous and nearly impossible to complete manually; the application enables fast and accurate book inspection.
4. By deploying the algorithm of the application on a mobile robot platform equipped with a robotic arm, unmanned operation of the full library-management workflow can be realized, from borrowing to returning and from inspection to shelving. The technique of the application gives the robot accurate perception of books, and combined with the manipulation ability of the arm, machines can truly replace manual work.