
CN111008629A - Cortex-M3-based method for identifying number of tip - Google Patents


Info

Publication number
CN111008629A
CN111008629A
Authority
CN
China
Prior art keywords
enhancement
cortex
network
image
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911245954.8A
Other languages
Chinese (zh)
Inventor
刘柏罕
刘谋海
贺达江
米贤武
丁黎明
黄利军
陈灵曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huaihua University
Original Assignee
Huaihua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaihua University filed Critical Huaihua University
Priority to CN201911245954.8A priority Critical patent/CN111008629A/en
Publication of CN111008629A publication Critical patent/CN111008629A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/02Recognising information on displays, dials, clocks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract



The invention discloses a Cortex-M3-based meter-end digit recognition method, comprising a water meter reader installed on the upper end of the water meter for recognizing the water meter image. The water meter reader adopts a Cortex-M3 processor which, on the basis of a deep learning convolutional network algorithm, adopts a network model whose depth and width are optimized so that it can run on the Cortex-M3 processor, thereby building the meter-end digit recognition algorithm. The algorithm's RAM and ROM usage fits within the resources of the processor, so it can be deployed in a collector or a device terminal.


Description

Cortex-M3-based meter-end digit recognition method
Technical Field
The invention relates to the field of water meter reading, and in particular to a Cortex-M3-based meter-end digit recognition method.
Background
To solve the problem of remotely reading legacy water meters, a device that photographs the reading is adopted, so that the base meter does not need to be replaced. Remote reading of a traditional remote water meter is relatively expensive to install and consumes time and manpower, and uploading the image data to the cloud consumes both mobile-data cost and time.
The traditional approach to digit recognition uses an FPGA or DSP for the computation, which is costly. Deep learning is a branch of machine learning: an algorithm that performs representation learning on data with an artificial neural network as its framework. Its benefit is replacing manual feature engineering with efficient unsupervised or semi-supervised feature-learning and hierarchical feature-extraction algorithms. To date, several deep learning architectures have been proposed, such as deep neural networks, convolutional neural networks, deep belief networks, and recurrent neural networks. They have been applied to computer vision, speech recognition, natural language processing, audio recognition, and bioinformatics, and have achieved excellent results.
The efficiency of deep learning comes at the cost of heavy computation and memory. Taking the efficient and well-known ShuffleNet as an example, it still requires 5 MB of RAM and 140 MFLOPS. A Cortex-M3 microcontroller cannot currently supply that much RAM or computation.
Disclosure of Invention
The technical problem the invention aims to solve: a Cortex-M3-based meter-end digit recognition method that provides a deep learning network model, can greatly reduce the cost of a collector or meter reader, and greatly helps the design and popularization of water-meter, gas-meter, and heat-meter data collection.
The invention is realized by the following technical scheme: a Cortex-M3-based meter-end digit recognition method comprises a water meter reader installed at the upper end of a water meter and used to recognize water meter images. The water meter reader adopts a Cortex-M3 processor which, on the basis of a deep learning convolutional network algorithm, adopts a network model whose depth and width are optimized so that it can run on the Cortex-M3 processor, thereby building the meter-end digit recognition algorithm.
Preferably, the network model is a small network designed for 128 KB of RAM and is named DigitalNet; it mainly uses group convolution and channel shuffling.
As a preferable technical solution, DigitalNet has a basic unit that is improved from a residual unit: the dense 1x1 convolution is replaced by a 1x1 group convolution, a channel shuffle operation is added after the first 1x1 convolution, and a 3x3 depthwise convolution follows;
for the residual unit, a 3x3 avg pool with stride 2 is applied to the original input, producing a feature map of the same size as the output, which is then concatenated with the output instead of added.
As a preferred technical solution, the whole network of DigitalNet is composed of the three basic units. The input image size is 64 × 48 and there are three stages. Each stage has two parts: the first halves the feature map with a stride of 2; the second contains residual modules with group convolution, which can be stacked repeatedly, accuracy increasing with depth. The final modules are a global max pool and a fully connected layer, which output the final recognition result.
As a preferable technical scheme, the network training method of the DigitalNet model adopts data enhancement; the enhancement is performed online and combines five methods: Gaussian enhancement, erosion enhancement, dilation enhancement, scaling enhancement, and mixup enhancement.
As a preferred technical solution, Gaussian enhancement: the template coefficients of a Gaussian filter decrease with distance from the template center, so a Gaussian filter blurs the image less than a mean filter; Gaussian enhancement is used to improve the algorithm's tolerance of different fonts,
G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
Erosion enhancement: scan each pixel of the image with a 3x3 structuring element and AND the element with the binary image it covers; if all covered pixels are 1, the corresponding pixel of the result is 1, otherwise 0; erosion enhancement increases the model's tolerance of font-weight variation;
dilation enhancement: scan each pixel of the image with a 3x3 structuring element and AND the element with the binary image it covers; if all covered pixels are 0, the corresponding pixel of the result is 0, otherwise 1; dilation enhancement increases the model's tolerance of font-weight variation;
scaling enhancement: scale the image using bilinear interpolation; scaling enhancement increases the model's tolerance of digit-size variation;
mixup enhancement: mixup regularizes the network by encouraging simple linear behavior between training samples; the mixup formulas are as follows:
x̃ = λxᵢ + (1 − λ)xⱼ
ỹ = λyᵢ + (1 − λ)yⱼ,  where λ ~ Beta(α, α)
the invention has the beneficial effects that: the invention innovatively provides a deep learning network model, can greatly reduce the cost of a collector or a meter reader, and is greatly helpful for designing and popularizing water meter, gas meter and heat meter collection.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a diagram of a generic convolution;
FIG. 2 is a schematic diagram of group convolution;
FIG. 3 is a schematic diagram of channel shuffling;
FIG. 4 is a diagram of the basic cell architecture of DigitalNet;
FIG. 5 is a schematic diagram of a confusable image;
FIG. 6 is a digit sample diagram.
Detailed Description
All of the features disclosed in this specification, or all of the steps in any method or process so disclosed, may be combined in any combination, except combinations of features and/or steps that are mutually exclusive.
Any feature disclosed in this specification (including any accompanying claims, abstract and drawings), may be replaced by alternative features serving equivalent or similar purposes, unless expressly stated otherwise. That is, unless expressly stated otherwise, each feature is only an example of a generic series of equivalent or similar features.
In the description of the present invention, it is to be understood that the terms "one end", "the other end", "outside", "upper", "inside", "horizontal", "coaxial", "central", "end", "length", "outer end", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, should not be construed as limiting the present invention.
Further, in the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
The use of terms such as "upper," "above," "lower," "below," and the like in describing relative spatial positions herein is for the purpose of facilitating description to describe one element or feature's relationship to another element or feature as illustrated in the figures. The spatially relative positional terms may be intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the figures. For example, if the device in the figures is turned over, elements described as "below" or "beneath" other elements or features would then be oriented "above" the other elements or features. Thus, the exemplary term "below" can encompass both an orientation of above and below. The device may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein interpreted accordingly.
In the present invention, unless otherwise explicitly specified or limited, the terms "disposed," "sleeved," "connected," "penetrating," "plugged," and the like are to be construed broadly, e.g., as a fixed connection, a detachable connection, or an integral part; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
As shown in FIG. 1, the Cortex-M3-based meter-end digit recognition method comprises a water meter reader installed at the upper end of a water meter and used to recognize water meter images. The reader adopts a Cortex-M3 processor which, on the basis of a deep learning convolutional network algorithm, adopts a network model built by optimizing the depth and width of the deep learning network so that it runs on the Cortex-M3 processor.
In view of the features of the Cortex-M3 device, the present invention designs a small network for 128 KB of RAM, named DigitalNet.
The design goal of DigitalNet is to achieve the best model accuracy with limited computational resources, which requires a good balance between speed and accuracy. At present there are two main approaches to compact CNN design: model-structure design and model compression. Here we use a purpose-built network structure to shrink and speed up the model, rather than compressing or distilling a large trained model. DigitalNet takes ShuffleNet as its blueprint and shares its characteristic of maintaining accuracy while greatly reducing the computation and size of the model. DigitalNet mainly uses group convolution and channel shuffling.
Ordinary convolution: if the input feature maps have size C × H × W and there are N convolution kernels, then there are N output feature maps, one per kernel. Each kernel has size C × K × K, so the N kernels have N × C × K × K parameters in total. The connection pattern between input and output maps is shown in FIG. 1: every output feature of the convolution depends on all input features.
Group convolution: as the name implies, the input feature maps are divided into groups and each group is convolved separately. Suppose the input feature maps have size C × H × W, the number of output feature maps is N, and the maps are divided into G groups. Each group then has C/G input maps and N/G output maps, each kernel has size (C/G) × K × K, the total number of kernels is still N (N/G per group), and a kernel is convolved only with the input maps of its own group. The total parameter count is N × (C/G) × K × K, i.e. 1/G of the original; the connection pattern is shown in FIG. 2.
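The parameter-count arithmetic above can be checked with a short sketch (not code from the patent); the channel count 240 and G = 3 below are illustrative values, not taken from the document:

```python
# Parameter-count comparison between ordinary and group convolution,
# following the formulas in the text (bias terms ignored).
def conv_params(c_in, n_out, k):
    """Ordinary convolution: N kernels of size C x K x K."""
    return n_out * c_in * k * k

def group_conv_params(c_in, n_out, k, g):
    """Group convolution: each kernel only sees C/G input maps."""
    assert c_in % g == 0 and n_out % g == 0
    return n_out * (c_in // g) * k * k

dense = conv_params(240, 240, 3)          # 240 * 240 * 3 * 3 = 518400
grouped = group_conv_params(240, 240, 3, 3)
print(dense, grouped, dense // grouped)   # 518400 172800 3, i.e. 1/G of the original
```

As expected, grouping with G = 3 cuts the parameter count to exactly one third.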
Channel shuffling: in small networks we can use group convolution instead of ordinary convolution, which greatly reduces model size and computation. However, if multiple group convolutions are stacked, each output channel derives from only a small fraction of the input channels, as shown in FIG. 3(a); this property reduces the information flow between channel groups and weakens the representation. If we instead feed each group convolution inputs drawn from different groups, as shown in FIG. 3(b), then the input and output channels become fully related. Concretely, we shuffle the output channels of the previous layer, as shown in FIG. 3(c), and then split them into groups for input to the next layer. In this way we can reduce the model and the computation while preserving accuracy.
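The shuffle step can be written as a reshape, transpose, and flatten; a minimal NumPy sketch (the array shapes are illustrative, not from the patent):

```python
import numpy as np

def channel_shuffle(x, groups):
    """Shuffle channels of an (N, C, H, W) tensor: view C as
    (groups, C // groups), swap the two group axes, flatten back."""
    n, c, h, w = x.shape
    assert c % groups == 0
    return (x.reshape(n, groups, c // groups, h, w)
             .transpose(0, 2, 1, 3, 4)
             .reshape(n, c, h, w))

x = np.arange(6).reshape(1, 6, 1, 1)       # channels [0, 1, 2, 3, 4, 5]
y = channel_shuffle(x, groups=2)
print(y.ravel().tolist())                  # [0, 3, 1, 4, 2, 5]
```

After the shuffle each group of the next layer receives channels from every group of the previous layer, which is exactly the full-relation effect of FIG. 3(b).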
The basic unit of DigitalNet is improved from a residual unit. As shown in FIG. 4(a), a residual unit comprises three layers: first a 1x1 convolution, then a 3x3 depthwise convolution acting as the bottleneck layer, followed by another 1x1 convolution, and finally a shortcut connection that adds the input directly to the output. The following modifications are made: the dense 1x1 convolution is replaced by a 1x1 group convolution, a channel shuffle operation is added after the first 1x1 convolution, and a 3x3 depthwise convolution follows; the modification is shown in FIG. 4(b).
For the residual unit, if the stride is 1 the input and output shapes match and can be added directly. When the stride is 2, the channel count grows and the feature-map size shrinks, so input and output no longer match; typically a 1x1 convolution would be used to map the input to the output shape. DigitalNet adopts a different strategy, shown in FIG. 4(c): a 3x3 avg pool with stride 2 is applied to the original input, producing a feature map of the same size as the output, which is then concatenated (concat) with the output instead of added. The purpose is mainly to reduce computation and parameter size.
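The shape bookkeeping of this stride-2 unit can be sketched as follows; the padding of 1 and the channel counts are assumptions for illustration, not values stated in the patent:

```python
def pool_out(size, k=3, s=2, pad=1):
    """Output spatial size of a k x k pool with stride s and padding pad."""
    return (size + 2 * pad - k) // s + 1

# Hypothetical channel counts: the avg-pool shortcut keeps its c_in
# channels, the conv branch produces c_branch; concat sums them,
# unlike addition, which would require equal channel counts.
c_in, c_branch = 24, 40
h, w = 32, 24
print(pool_out(h), pool_out(w), c_in + c_branch)   # 16 12 64
```

Because both branches halve the spatial size, the pooled shortcut and the conv output line up for a channel-wise concat at no parameter cost.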
The overall structure of DigitalNet is made up of the three basic units above and is shown in Table 1. The input image size is 64 × 48 and there are three stages. Each stage has two parts: the first halves the feature map with a stride of 2; the second contains residual modules with group convolution, which can be stacked repeatedly, accuracy increasing with depth. The final modules are a global max pool and a fully connected layer, which output the final recognition result.
Table 1: DigitalNet overall network architecture.
The network training method adopts data enhancement. Data enhancement is an effective way to expand the scale of a data sample: deep learning is a big-data method, and the larger and higher-quality the data, the better the model's recognition results and generalization ability.
However, data actually collected rarely covers all scenes, e.g. lighting conditions, the attitude of the object, and the shooting angle. We address this by enhancing the existing data. Data enhancement can be done offline or online; because online enhancement yields more data than offline enhancement and occupies no disk space, online enhancement is adopted, combining Gaussian, erosion, dilation, scaling, and mixup enhancement.
Gaussian enhancement: a Gaussian filter is a linear filter that effectively suppresses noise and smooths the image. Its template coefficients decrease with distance from the template center, so it blurs the image less than a mean filter. Gaussian enhancement is used to improve the algorithm's tolerance of different fonts.
G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))
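A sketch of the Gaussian template described above in NumPy; the 3x3 size and σ = 1 are illustrative choices, not parameters fixed by the patent:

```python
import numpy as np

def gaussian_kernel(size=3, sigma=1.0):
    """Normalized 2-D Gaussian template; coefficients shrink with
    distance from the center, unlike the uniform mean-filter template."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    return k / k.sum()

k = gaussian_kernel(3, 1.0)
assert k[1, 1] > k[0, 1] > k[0, 0]   # center > edge > corner
print(round(float(k.sum()), 6))      # 1.0
```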
Erosion enhancement: scan each pixel of the image with a 3x3 structuring element and AND the element with the binary image it covers; if all covered pixels are 1, the corresponding pixel of the result is 1, otherwise 0. Erosion enhancement increases the model's tolerance of font-weight variation.
Dilation enhancement: scan each pixel of the image with a 3x3 structuring element and AND the element with the binary image it covers; if all covered pixels are 0, the corresponding pixel of the result is 0, otherwise 1. Dilation enhancement increases the model's tolerance of font-weight variation.
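The two morphological scans above can be sketched with plain NumPy loops; a minimal illustration of the AND-based rule (a real pipeline would typically use an image-processing library):

```python
import numpy as np

def erode(img, k=3):
    """Binary erosion: a pixel stays 1 only if every pixel under the
    k x k structuring element is 1 (the AND rule in the text)."""
    h, w = img.shape
    p = k // 2
    padded = np.pad(img, p, constant_values=0)
    out = np.ones_like(img)
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def dilate(img, k=3):
    """Binary dilation: a pixel becomes 1 if any covered pixel is 1."""
    h, w = img.shape
    p = k // 2
    padded = np.pad(img, p, constant_values=0)
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

stroke = np.zeros((5, 5), dtype=np.int64)
stroke[1:4, 1:4] = 1                      # a 3x3 "thick" digit stroke
print(int(erode(stroke).sum()), int(dilate(stroke).sum()))  # 1 25
```

Erosion thins the stroke (only its center survives) and dilation thickens it, which is exactly the font-weight variation the two enhancements simulate.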
Scaling enhancement: scale the image using bilinear interpolation. Scaling enhancement mainly increases the model's tolerance of digit-size variation.
Mixup enhancement: mixup regularizes the network by encouraging simple linear behavior between training samples. The mixup formulas are as follows:
x̃ = λxᵢ + (1 − λ)xⱼ
ỹ = λyᵢ + (1 − λ)yⱼ,  where λ ~ Beta(α, α)
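A minimal mixup sketch under the usual assumption that λ is drawn from a Beta(α, α) distribution; α = 0.2 and the fixed seed are illustrative choices, not values given in the patent:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mixup: blend two samples and their one-hot labels with a
    Beta-drawn lambda, encouraging linear behavior between samples."""
    rng = rng if rng is not None else np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x1, y1 = np.ones(4), np.array([1.0, 0.0])
x2, y2 = np.zeros(4), np.array([0.0, 1.0])
x, y = mixup(x1, y1, x2, y2)
assert np.isclose(y.sum(), 1.0)      # mixed label is still a distribution
```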
the training method comprises the following steps:
the batch _ size of the training parameter is 1000, the learning rate is 0.1, and the training can be completed by running 40 epochs, which is about 20 hours.
Table 2: network model test results.
During training it was found that some digits are easy to recognize while others are easily misrecognized. As shown in FIG. 5, the first '8', disturbed by light, is easily recognized as '6'; the second '8.5', disturbed by light, as '9.5'; the third '6', disturbed by light, as '5'; and the fourth and fifth '1.5', disturbed by dust, as '7.5'.
To solve this problem, the loss is switched from the Softmax loss to Focal Loss, which automatically performs hard-example mining by adjusting each sample's weight in the loss according to its difficulty. The Focal Loss formula is as follows, with α = 0.25 and γ = 2.
FL(p_t) = −α (1 − p_t)^γ log(p_t)
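A sketch of the binary form of Focal Loss with the α = 0.25 and γ = 2 quoted above, showing how well-classified samples are down-weighted so training focuses on the hard, easily confused digits:

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t),
    where p_t is the predicted probability of the true class."""
    p_t = np.where(y == 1, p, 1 - p)
    return -alpha * (1 - p_t) ** gamma * np.log(p_t)

easy = focal_loss(np.array([0.9]), np.array([1]))   # confident, correct
hard = focal_loss(np.array([0.6]), np.array([1]))   # borderline
assert hard[0] > easy[0]   # hard examples dominate the loss
```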
As shown in FIG. 6, a standard data set of about 50 million pictures covering 20 commercially available meter models was collected and divided into a test set (60%), a training set (20%), and a validation set (20%).
With the validation set as the evaluation standard, the influence of augmentation and loss on recognition accuracy was tested separately. The error rate (Pw) is the proportion of samples that are misrecognized with a recognition confidence above 0.5; the success rate (Pr) is the proportion of samples that are correctly recognized with a confidence above 0.5.
We can see that data enhancement reduces the error rate and strengthens the model's generalization, but also lowers the recognition success rate. The automatic hard-example mining of Focal Loss reduces the error rate and raises the success rate at the same time, and DigitalNet's recognition accuracy meets practical application requirements.
The invention innovatively provides a deep learning network model, which can greatly reduce the cost of a collector or meter reader and greatly helps the design and popularization of water-meter, gas-meter, and heat-meter data collection.
The above description is only an embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any change or substitution that can be conceived without inventive effort shall fall within the scope of the present invention. Accordingly, the protection scope of the present invention shall be defined by the claims.

Claims (6)

1. A Cortex-M3-based meter-end digit recognition method, characterized in that it comprises a water meter reader installed at the upper end of a water meter for recognizing water meter images; the water meter reader adopts a Cortex-M3 processor which, on the basis of a deep learning convolutional network algorithm, adopts a network model whose depth and width are optimized so that it can run on the Cortex-M3 processor, thereby building the meter-end digit recognition algorithm.
2. The Cortex-M3-based meter-end digit recognition method according to claim 1, characterized in that the network model is a small network designed for 128 KB of RAM, named DigitalNet, which mainly uses group convolution and channel shuffling.
3. The Cortex-M3-based meter-end digit recognition method according to claim 2, characterized in that DigitalNet has a basic unit improved from a residual unit: the dense 1x1 convolution is replaced by a 1x1 group convolution, a channel shuffle operation is added after the first 1x1 convolution, and a 3x3 depthwise convolution follows; for the residual unit, a 3x3 avg pool with stride = 2 is applied to the original input, producing a feature map of the same size as the output, which is then concatenated with the output instead of added.
4. The Cortex-M3-based meter-end digit recognition method according to claim 1, characterized in that the overall network structure of DigitalNet is composed of three basic units; the input image size is 64*48 and there are three stages, each divided into two parts: the first halves the feature map with a stride of 2, and the second contains residual modules with group convolution that can be stacked repeatedly, accuracy increasing with depth; the final modules are a global max pool and a fully connected layer, which output the final recognition result.
5. The Cortex-M3-based meter-end digit recognition method according to claim 1, characterized in that the network training method of the DigitalNet model adopts online data enhancement, combining five methods: Gaussian enhancement, erosion enhancement, dilation enhancement, scaling enhancement, and mixup enhancement.
6. The Cortex-M3-based meter-end digit recognition method according to claim 5, characterized in that:
Gaussian enhancement: the template coefficients of a Gaussian filter decrease with distance from the template center, so a Gaussian filter blurs the image less than a mean filter; Gaussian enhancement is used to improve the algorithm's tolerance of different fonts,
G(x, y) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²));
erosion enhancement: scan each pixel of the image with a 3x3 structuring element and AND it with the binary image it covers; if all covered pixels are 1, the resulting pixel is 1, otherwise 0; erosion enhancement increases the model's tolerance of font-weight variation;
dilation enhancement: scan each pixel of the image with a 3x3 structuring element and AND it with the binary image it covers; if all covered pixels are 0, the resulting pixel is 0, otherwise 1; dilation enhancement increases the model's tolerance of font-weight variation;
scaling enhancement: scale the image using bilinear interpolation; scaling enhancement mainly increases the model's tolerance of digit-size variation;
mixup enhancement: mixup regularizes the network by encouraging simple linear behavior between training samples; the mixup formulas are as follows:
x̃ = λxᵢ + (1 − λ)xⱼ
ỹ = λyᵢ + (1 − λ)yⱼ,  where λ ~ Beta(α, α).
CN201911245954.8A 2019-12-07 2019-12-07 Cortex-M3-based method for identifying number of tip Pending CN111008629A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911245954.8A CN111008629A (en) 2019-12-07 2019-12-07 Cortex-M3-based method for identifying number of tip


Publications (1)

Publication Number Publication Date
CN111008629A true CN111008629A (en) 2020-04-14

Family

ID=70115552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911245954.8A Pending CN111008629A (en) 2019-12-07 2019-12-07 Cortex-M3-based method for identifying number of tip

Country Status (1)

Country Link
CN (1) CN111008629A (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105118262A (en) * 2015-08-25 2015-12-02 武汉理工大学 Community-oriented high-efficiency remote meter reading method and system
CN106228240A (en) * 2016-07-30 2016-12-14 复旦大学 Degree of depth convolutional neural networks implementation method based on FPGA
US20170316312A1 (en) * 2016-05-02 2017-11-02 Cavium, Inc. Systems and methods for deep learning processor
CN109360396A (en) * 2018-09-27 2019-02-19 长江大学 Remote meter reading method and system based on image recognition technology and NB-IoT technology
CN109447239A (en) * 2018-09-26 2019-03-08 华南理工大学 A kind of embedded convolutional neural networks accelerated method based on ARM

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
XIANGYU ZHANG et al.: "ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices", pages 2 - 4 *
SHAN Jianhua: "Deep Learning for Image Recognition: Core Technologies and Case Studies" (《深度学习之图像识别:核心技术与案例实战》), China Machine Press, pages: 98 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114283412A (en) * 2021-12-22 2022-04-05 上海蒙帕信息技术有限公司 Reading identification method and system of digital instrument panel
CN114283412B (en) * 2021-12-22 2025-09-19 上海蒙帕信息技术有限公司 Reading identification method and system of digital instrument panel

Similar Documents

Publication Publication Date Title
CN106534616B (en) A video image stabilization method and system based on feature matching and motion compensation
CN112200163B (en) Underwater benthic organism detection method and system
CN113763300B (en) A Multi-focus Image Fusion Method Combined with Depth Context and Convolutional Conditional Random Field
CN113033570A (en) Image semantic segmentation method for improving fusion of void volume and multilevel characteristic information
CN110796009A (en) Method and system for detecting marine vessel based on multi-scale convolution neural network model
CN112990336B (en) Deep three-dimensional point cloud classification network construction method based on competitive attention fusion
CN112418087B (en) A Neural Network-Based Fish Recognition Method in Underwater Video
CN112465801A (en) Instance segmentation method for extracting mask features in scale division mode
CN115100545A (en) Object Detection Method for Failed Satellite Widgets in Low Illumination
CN112750125A (en) Glass insulator piece positioning method based on end-to-end key point detection
CN112418229A (en) A real-time segmentation method of unmanned ship marine scene images based on deep learning
CN115527253A (en) Attention mechanism-based lightweight facial expression recognition method and system
CN118917389B (en) A diffusion model architecture search method and system based on attention mechanism
CN119784692B (en) Wind turbine generator blade crack defect detection method based on characteristic recombination network
CN111815708A (en) Service robot grasping detection method based on two-pass convolutional neural network
CN115797367A (en) Tea tree tea bud segmentation method based on deep learning
CN113920421B (en) Full convolution neural network model capable of achieving rapid classification
CN119027676A (en) Segmentation algorithm and quantitative calculation of various defects in bridge images based on perceptual analysis
CN118262256A (en) Multi-scale feature fusion small target detection algorithm for unmanned aerial vehicle aerial image
CN117274111A (en) An image distortion correction method and system based on multi-scale feature fusion
CN111008629A (en) Cortex-M3-based method for identifying number of tip
CN117036990A (en) A drone perspective object detection method, device and storage medium
CN119399254B (en) A remote sensing image registration method based on convolutional attention
CN114782983A (en) Road scene pedestrian detection method based on improved feature pyramid and boundary loss
CN119399223A (en) A medical image segmentation method based on KAN network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200414

RJ01 Rejection of invention patent application after publication