CN119863875B

CN119863875B - Self-service cashing method, system and storage medium based on multi-mode interaction

Info

Publication number: CN119863875B
Application number: CN202510348316.8A
Authority: CN
Inventors: 王俊杰
Original assignee: Xiamen Hema Technology Co ltd
Current assignee: Xiamen Hema Technology Co ltd
Priority date: 2025-03-24
Filing date: 2025-03-24
Publication date: 2025-06-20
Anticipated expiration: 2045-03-24
Also published as: CN119863875A

Abstract

The invention discloses a self-service cashing method, system and storage medium based on multi-modal interaction, belonging to the technical field of data processing. The method comprises: shooting an initial image of a preset area, splitting the initial image into a plurality of commodity images, and identifying each commodity image to acquire the commodity information it contains and a first confidence; generating a first commodity list based on the commodity information whose first confidence is greater than a first threshold; performing joint reasoning based on all the commodity images and weight information to obtain the commodity information of unidentified commodities and a second confidence; if the second confidence is smaller than the first threshold, verifying the commodity information based on pick-and-place information of the vending shelf; if the verification succeeds, adding the commodity information of the unidentified commodities to the first commodity list to obtain a second commodity list; and generating amount information based on the second commodity list and displaying a collection interface. By combining several methods to acquire the information of the commodities to be settled, the invention greatly improves the accuracy of commodity identification.

Description

Self-service cashing method, system and storage medium based on multi-mode interaction

Technical Field

The invention belongs to the technical field of data processing, and particularly relates to a self-service cashing method, a self-service cashing system and a storage medium based on multi-modal interaction.

Background

Self-service cashing systems have become a core facility for improving operating efficiency in supermarkets, convenience stores and similar settings. At present, self-service cashing relies mainly on bar code recognition. However, this requires the user to locate the bar code on the commodity packaging; when the packaging is complicated or the person settling is unskilled, settlement takes longer and settlement efficiency suffers.

To solve this problem, methods have been proposed in the prior art. For example, Chinese patent document CN109214806A discloses a self-service settlement method, device and storage medium. The method uses several image recognition techniques to recognize a monitoring image and identify the category and quantity of the commodities to be settled, obtains the actual weight of those commodities from a weight sensor, calculates their standard weight from the recognition result, and compares the actual weight with the standard weight to verify the accuracy of the recognition result. The method completes settlement recognition without scanning a bar code, which greatly improves settlement efficiency.

However, recognizing only the commodity images in the settlement area, as above, may give low recognition accuracy. Although that method verifies the recognition result against the commodity weight, it can only determine whether the result is correct; when the result is incorrect, it must fall back on manual confirmation and cannot improve recognition accuracy at the source.

Disclosure of Invention

In order to solve the problems, the invention provides a self-service cashing method, a self-service cashing device and a storage medium based on multi-mode interaction, so as to solve the problems in the prior art.

In order to achieve the above-mentioned purpose, the present invention provides a self-service cashing method based on multi-modal interaction, comprising:

after receiving a start instruction, shooting an initial image of a preset area, splitting the initial image into a plurality of commodity images, and identifying each commodity image based on a depth identification model to acquire commodity information and first confidence coefficient contained in the commodity image;

generating a first commodity list based on the commodity information with the first confidence coefficient larger than a first threshold value, and counting a first quantity and a second quantity of identified commodities and unidentified commodities in the initial image;

If the first quantity and the second quantity meet preset conditions, acquiring weight information of the preset area, and performing joint reasoning based on all the commodity images and the weight information to acquire commodity information and second confidence of unidentified commodities;

If the second confidence coefficient is smaller than the first threshold value, positioning a corresponding selling rack based on the commodity information of the unidentified commodity, and verifying the commodity information based on the picking and placing information of the selling rack;

If verification is successful, adding the commodity information of the unidentified commodity into the first commodity list to obtain a second commodity list, and if verification is failed, generating a manual input prompt;

and after receiving the completion information, generating amount information based on the second commodity list, and displaying a collection interface until receiving the completion settlement information or canceling payment information.

Further, the identifying of the commodity image includes the steps of:

If a bar code area is detected in the commodity image, commodity information is generated based on the bar code and the corresponding first confidence is set to 100%; if no bar code area is detected, the commodity image is identified based on a depth identification model to obtain the commodity information and the first confidence; the commodity type is determined based on the commodity information; if the commodity type belongs to a preset type and the first confidence is greater than the first threshold, the commodity quality is further identified; and if a damaged commodity is identified, a commodity quality prompt is generated.

Further, identifying the commodity quality includes the steps of:

The commodity image is subjected to gridding segmentation to obtain a plurality of grid areas, an original average pixel value of each grid area is obtained, a plurality of interval ranges are set, each interval range corresponds to one mapping pixel, and the original average pixel value is mapped into the corresponding mapping pixel based on the interval range;

Dividing the grid region into a plurality of types of subareas, acquiring statistical features of the subareas, wherein the statistical features comprise mapping pixels and occupation ratios existing in the subareas, establishing a standard feature library, wherein the standard feature library comprises standard features of each subarea of each commodity type under a damaged state, comparing the statistical features of the subareas in the commodity image with the standard features, and determining the commodity quality based on a comparison result.

Further, verifying the commodity information based on the picking and placing information of the vending shelf comprises the following steps:

Acquiring a shelf image of the vending shelf; when a commodity is moved out of the shelf in the shelf image, determining the commodity information of the moved-out commodity based on shelf data; performing target tracking on the user who moved the commodity out to acquire user moving images; intercepting a face image and a body image from the user moving images for pre-storage, and marking them with the commodity information of the vending shelf as an image label; when a verification requirement arises, searching based on the commodity information of the unidentified commodity and the image labels; if a face image and a body image labelled with the commodity information of the unidentified commodity are found, comparing them with a current user image; and if the comparison passes, determining that the currently identified commodity information is correct.

Further, performing joint reasoning based on all the commodity images and the weight information includes the steps of:

Establishing a weight database, wherein the weight database comprises standard weight and floating range of each commodity, setting error distribution of a weight sensor, acquiring first weight of identified commodities based on the weight database, calculating second weight of unidentified commodities based on total weight of the commodities in the preset area and the first weight, and correcting the second weight based on the floating range and the error distribution to acquire weight distribution range;

Acquiring candidate information, wherein the candidate information is the commodity information with the first confidence coefficient larger than a second threshold value in a single commodity image recognition result, acquiring the standard weight and the floating range of the candidate information based on the weight database, generating candidate combinations meeting the weight distribution range based on the standard weight and the floating range, and calculating a combination score of each candidate combination based on the weight distribution range of the candidate combinations;

Calculating a candidate score of each candidate combination based on the first confidence and the combination score, and selecting the candidate combination with the largest candidate score as a target combination, wherein the candidate information included in the target combination is taken as the inference result for the unidentified commodities, and the corresponding candidate score is taken as the second confidence.
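The joint reasoning steps above can be illustrated with a minimal Python sketch. The patent text does not give formulas, so everything numeric here is an assumption made for illustration: the entries of `WEIGHT_DB`, the symmetric bound `SENSOR_ERROR_G` standing in for the sensor error distribution, the linear combination score, and the product of mean first confidence and combination score used as the candidate score. The candidates passed in are assumed to have already exceeded the second threshold.

```python
from itertools import product

# Hypothetical weight database: name -> (standard weight in g, floating range in g).
WEIGHT_DB = {
    "cola_330ml": (350.0, 5.0),
    "chips_70g": (85.0, 4.0),
    "biscuit_100g": (110.0, 4.0),
    "candy_80g": (88.0, 3.0),
}
SENSOR_ERROR_G = 2.0  # assumed symmetric bound standing in for the error distribution


def weight_range(total_weight, identified):
    """Return (low, high): the weight distribution range of the unidentified commodities."""
    first_weight = sum(WEIGHT_DB[name][0] for name in identified)
    first_float = sum(WEIGHT_DB[name][1] for name in identified)
    second_weight = total_weight - first_weight
    margin = first_float + SENSOR_ERROR_G
    return second_weight - margin, second_weight + margin


def joint_inference(total_weight, identified, candidates_per_image):
    """candidates_per_image: one list per unidentified image of (name, first_confidence).

    Returns (names, candidate_score) of the best combination, or None if none fits.
    """
    if not candidates_per_image:
        return None
    low, high = weight_range(total_weight, identified)
    centre, half_width = (low + high) / 2, (high - low) / 2
    best = None
    for combo in product(*candidates_per_image):
        std = sum(WEIGHT_DB[name][0] for name, _ in combo)
        flt = sum(WEIGHT_DB[name][1] for name, _ in combo)
        # Keep only combinations whose own weight interval overlaps the range.
        if std + flt < low or std - flt > high:
            continue
        # Assumed combination score: 1 at the centre of the range, falling linearly.
        combo_score = max(0.0, 1 - abs(std - centre) / (half_width + flt))
        mean_conf = sum(conf for _, conf in combo) / len(combo)
        score = mean_conf * combo_score  # assumed way of combining the two values
        if best is None or score > best[1]:
            best = ([name for name, _ in combo], score)
    return best


result = joint_inference(
    437.0,
    ["cola_330ml"],
    [[("chips_70g", 0.80), ("biscuit_100g", 0.75), ("candy_80g", 0.60)]],
)
print(result)
```

In this example the residual weight is 87 g with a range of 80 g to 94 g, so the biscuit candidate is rejected on weight alone and the chips candidate wins over the candy candidate because of its higher first confidence.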

Further, calculating the candidate score for the candidate combination comprises the steps of:

Obtaining the purchase probability of each piece of candidate information based on a historical purchase record; performing a weighted summation of the first confidence, the combination score and the purchase probability based on preset weights to obtain a weighted score of the candidate combination; normalizing the weighted score to obtain a normalized score; generating a penalty value based on the number of candidate combinations currently existing; and correcting all the normalized scores based on the penalty value to obtain the candidate scores.
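A minimal Python sketch of this scoring step follows. The text does not fix the preset weights, the normalization or the form of the penalty, so `PRESET_WEIGHTS`, normalization by the sum over all combinations, and the multiplicative penalty controlled by `PENALTY_RATE` are all assumptions chosen only to make the sequence of operations concrete.

```python
# Assumed preset weights for: first confidence, combination score, purchase probability.
PRESET_WEIGHTS = (0.5, 0.3, 0.2)
# Assumed penalty strength per additional candidate combination.
PENALTY_RATE = 0.05


def candidate_scores(combinations):
    """combinations: list of (first_confidence, combination_score, purchase_probability).

    Returns one candidate score per combination, in the same order.
    """
    if not combinations:
        return []
    w_conf, w_combo, w_buy = PRESET_WEIGHTS
    # Weighted summation of the three quantities.
    raw = [w_conf * c + w_combo * s + w_buy * p for c, s, p in combinations]
    # Assumed normalization: share of the total across all combinations.
    total = sum(raw)
    normalized = [r / total if total > 0 else 0.0 for r in raw]
    # Assumed penalty: more competing combinations means less certainty overall.
    penalty = 1.0 / (1.0 + PENALTY_RATE * (len(combinations) - 1))
    return [n * penalty for n in normalized]


scores = candidate_scores([(0.8, 0.9, 0.5), (0.6, 0.9, 0.1)])
print(scores)
```

With a single candidate combination the score is 1.0 under these assumptions; each additional combination lowers every score, which matches the intent that ambiguity should reduce the second confidence.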

Further, the preset condition includes that both the first quantity and the second quantity are smaller than a third threshold.

Further, if it is detected that the user moves the commodity back to the shelf in the shelf image or the user leaves the market area, the face image and the body image of the user are deleted.

The invention also provides a self-service cashing system based on multi-modal interaction, which is used for implementing the above self-service cashing method based on multi-modal interaction and comprises:

The identification module is used for shooting an initial image of a preset area after receiving a start instruction, splitting the initial image into a plurality of commodity images, identifying each commodity image based on a depth identification model to obtain commodity information and first confidence coefficient contained in the commodity image, and generating a first commodity list based on the commodity information with the first confidence coefficient larger than a first threshold value.

The first verification module is used for counting first quantity and second quantity of the identified commodities and unidentified commodities in the initial image, acquiring weight information of the preset area if the first quantity and the second quantity meet preset conditions, and performing joint reasoning based on all the commodity images and the weight information to acquire commodity information and second confidence of the unidentified commodities.

The second verification module is used for locating the corresponding vending shelf based on the commodity information of the unidentified commodity if the second confidence is smaller than the first threshold, verifying the commodity information based on the pick-and-place information of the vending shelf, adding the commodity information of the unidentified commodity to the first commodity list to obtain a second commodity list if the verification succeeds, and generating a manual input prompt if the verification fails.

The settlement module is used for generating the amount information based on the second commodity list after receiving the completion information, and displaying a collection interface until settlement completion information or payment cancellation information is received.

The invention also discloses a computer storage medium which stores program instructions, wherein the program instructions control equipment where the computer storage medium is located to execute the method when running.

The beneficial effects are that:

After receiving the start instruction, the invention shoots an initial image of the preset area, splits it into a plurality of commodity images based on an edge detection algorithm, and identifies the commodity images based on a depth identification model, so that the commodity information in each commodity image is acquired quickly and accurately. This avoids the inefficiency of manually scanning commodity bar codes one by one and improves cashing efficiency.

Counting the identified and unidentified commodities in the initial image lets the system sort out the list of commodities to be settled after the initial identification. If the numbers of identified and unidentified commodities meet the preset condition, the weight information of the preset area is acquired and joint reasoning is performed with it to obtain the commodity information of the unidentified commodities; adding weight information to the reasoning improves the accuracy and completeness of commodity identification. The invention further locates the vending shelf based on the commodity information of the unidentified commodity and verifies that information against the shelf's pick-and-place information. This verification ensures the correctness of the inferred commodity information and handles the cases where the commodity information is still inaccurate after joint reasoning or where joint reasoning cannot be performed. In summary, the invention acquires the information of the commodities to be settled by combining several methods, which greatly improves the accuracy of commodity identification.

Drawings

FIG. 1 is a flow chart of steps of a self-service cashing method based on multi-modal interaction of the present invention;

FIG. 2 is a schematic structural diagram of the self-service cashier device of the present invention;

FIG. 3 is a schematic diagram of the present invention for dividing merchandise images;

FIG. 4 is a schematic structural diagram of a self-service cashing system based on multi-modal interaction.

In the figures: 1, placement area; 2, display area; 3, wireless transmission module; 4, code scanning area.

Detailed Description

The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

It should be noted that, in the present invention, all the acquisition and processing of information related to images, data, etc. are performed on the premise of conforming to relevant laws and regulations and policies.

As shown in fig. 1, the self-service cashing method based on multi-modal interaction of the invention comprises the following steps:

S1, after receiving a start instruction, shooting an initial image of a preset area, splitting the initial image into a plurality of commodity images, and identifying each commodity image based on a depth identification model to acquire the commodity information contained in the commodity image and a first confidence.

The self-service cashier device used to implement the method comprises a placement area 1 and a display area 2. The placement area 1 is used for placing commodities and is provided with a weight sensor; the display area 2 is used for displaying the commodity list and the amount information. The device further comprises a wireless transmission module 3 for sending and receiving data, and a code scanning area 4 with which a commodity can be identified by its bar code when the method cannot identify it automatically. The start instruction is input by tapping the screen. After receiving it, the device calls the camera to shoot the initial image of the preset area, namely the placement area 1; a fixed camera may be arranged above the cashier device for this purpose. In particular, the device gives a voice reminder when the user places articles: do not stack articles, and place fresh fruit separately for identification.

After shooting, the initial image is obtained. Object contours in the initial image are detected with the Canny edge detection algorithm, and the initial image is cut along these contours into a plurality of commodity images, each containing only one object contour. Each commodity image is identified with a CNN-based depth identification model, which outputs the commodity information contained in the image and the corresponding first confidence. For example, the model may output the commodity information "A brand potato chips" with a first confidence of 98%; the higher the first confidence, the higher the probability, in the model's estimate, that the commodity image contains A brand potato chips. In particular, if the image contains a whole bag of fruit, each contour identified in it is split into one commodity image along the contour edge, and if all of these commodity images are identified as the same fruit, the commodity price is calculated from the unit price and weight of the fruit.

S2, generating a first commodity list based on the commodity information whose first confidence is greater than a first threshold, and counting the first quantity of identified commodities and the second quantity of unidentified commodities in the initial image.

S3, if the first quantity and the second quantity meet a preset condition, acquiring weight information of the preset area, and performing joint reasoning based on all the commodity images and the weight information to acquire the commodity information of the unidentified commodities and a second confidence.

In this embodiment, the preset condition includes that both the first quantity and the second quantity are smaller than the third threshold.

The first threshold is set to 95%. If the first confidence of a piece of commodity information is greater than the first threshold, the recognition result is taken as correct and added to the first commodity list, and the corresponding selling price is obtained from the price database; for fruit, the selling price is the unit price multiplied by the weight. The first commodity list is the list of commodities to be settled. The first quantity and the second quantity in the initial image are then counted. For example, if several commodities are placed in the preset area at the same time, 5 of which are identified and added to the first commodity list and 3 of which are not identified, the first quantity is 5 and the second quantity is 3. The third threshold is set to 3. When the number of identified commodities and the number of unidentified commodities both exceed 3, the error of weight-based joint reasoning increases greatly, and in that case the commodity information is determined using the subsequent steps.
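The thresholding and counting in S2 and the preset condition in S3 can be sketched in Python as follows. The threshold values are those of this embodiment; the function names and the shape of the recognition results are hypothetical.

```python
FIRST_THRESHOLD = 0.95  # first threshold of this embodiment
THIRD_THRESHOLD = 3     # third threshold of this embodiment


def build_first_list(results):
    """results: list of (commodity_info, first_confidence), one per commodity image.

    Returns (first_commodity_list, first_quantity, second_quantity).
    """
    first_list = [info for info, conf in results if conf > FIRST_THRESHOLD]
    first_quantity = len(first_list)                  # identified commodities
    second_quantity = len(results) - first_quantity   # unidentified commodities
    return first_list, first_quantity, second_quantity


def can_joint_reason(first_quantity, second_quantity):
    """Preset condition: both quantities are smaller than the third threshold."""
    return first_quantity < THIRD_THRESHOLD and second_quantity < THIRD_THRESHOLD


results = [("A brand potato chips", 0.98), ("B brand cola", 0.97), ("C brand candy", 0.60)]
first_list, first_quantity, second_quantity = build_first_list(results)
print(first_list, first_quantity, second_quantity,
      can_joint_reason(first_quantity, second_quantity))
```

With 5 identified and 3 unidentified commodities, as in the example above, `can_joint_reason(5, 3)` is false, so the flow skips joint reasoning and goes to the shelf-based verification.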

When it is determined that the joint reasoning can be performed, the joint reasoning is performed according to the weight information of the recognized commodity and the unidentified commodity, and a specific reasoning method is described later. After the reasoning is completed, commodity information of the unidentified commodity and the second confidence degree can be obtained.

S4, if the second confidence is smaller than the first threshold, locating the corresponding vending shelf based on the commodity information of the unidentified commodity, and verifying the commodity information based on the pick-and-place information of the vending shelf.

S5, if the verification succeeds, adding the commodity information of the unidentified commodity to the first commodity list to obtain a second commodity list, and if the verification fails, generating a manual input prompt.

S6, after receiving the completion information, generating amount information based on the second commodity list, and displaying a collection interface until settlement completion information or payment cancellation information is received.

If the second confidence is still smaller than the first threshold, or if joint reasoning cannot be performed, the commodity information cannot be determined accurately. In that case, the monitoring image of the shelf that sells the currently inferred commodity is obtained, and it is determined from the monitoring image whether the user who is settling took the inferred commodity; that is, the commodity information is verified. If the verification passes, the inferred commodity information is added to the first commodity list, which is thereby updated into the second commodity list. If the verification fails, the user is advised to enter the remaining commodities by manually scanning their bar codes.

After the user taps the display interface to finish input, the system receives the completion information, generates the amount information from the second commodity list, and, once the user confirms it, displays the collection interface until it detects that the payment has been completed or cancelled.

Specifically, the method for identifying the commodity image comprises the following steps:

If a bar code area is detected in the commodity image, commodity information is generated based on the bar code and the corresponding first confidence is set to 100%; if no bar code area is detected, the commodity image is identified based on the depth identification model to obtain the commodity information and the first confidence; the commodity type is determined based on the commodity information; if the commodity type belongs to a preset type and the first confidence is greater than the first threshold, the commodity quality is further identified; and if a damaged commodity is identified, a commodity quality prompt is generated.

First, the bar code area is detected in the commodity image. A bar code area usually shows large gradient changes in the horizontal direction (the bar edges) and small gradient changes in the vertical direction, so a detection method based on gradient direction can be used. Specifically, the X-direction and Y-direction gradients of the image are computed with the Sobel operator and the gradient magnitude is calculated from them; a threshold is set to filter out low-gradient areas and keep high-gradient areas; the gradient directions are then counted and areas with highly uniform gradient direction are selected. If the proportion of horizontal or vertical gradient directions in an area exceeds a preset threshold (for example 80%), the area is judged to be a valid bar code area.
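The gradient-direction test can be sketched in pure Python as below. This is an illustration under assumptions, not the patented implementation: the image is a small grayscale region given as a list of rows, the magnitude threshold of 100 is arbitrary, and the function name `looks_like_barcode` is hypothetical. A production system would more likely use an image library for the Sobel step.

```python
import math

# Standard 3x3 Sobel kernels for the X and Y directions.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def looks_like_barcode(gray, magnitude_threshold=100.0, direction_ratio=0.8):
    """gray: 2D list of grayscale values for one candidate region.

    Returns True if, among high-gradient pixels, the share whose gradient is
    predominantly horizontal (or predominantly vertical) exceeds direction_ratio.
    """
    height, width = len(gray), len(gray[0])
    horizontal = vertical = strong = 0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            # Filter out low-gradient pixels.
            if math.hypot(gx, gy) <= magnitude_threshold:
                continue
            strong += 1
            if abs(gx) > abs(gy):
                horizontal += 1
            elif abs(gy) > abs(gx):
                vertical += 1
    if strong == 0:
        return False
    return max(horizontal, vertical) / strong > direction_ratio


# Synthetic vertical bars, two pixels wide, versus a flat grey patch.
stripes = [[255 if (x // 2) % 2 else 0 for x in range(16)] for _ in range(8)]
flat = [[128] * 16 for _ in range(8)]
print(looks_like_barcode(stripes), looks_like_barcode(flat))
```

Note that this check alone would also accept a region with a single strong straight edge, so in practice it would presumably be combined with a minimum density of high-gradient pixels or with an actual decoding attempt.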

After detecting that the barcode region exists in the image, the commodity information is identified based on the barcode, and the first confidence is set to 100%, indicating that the commodity information acquired by the barcode is necessarily correct. If the bar code area is not identified, the commodity image is identified by using the depth identification model, so that commodity information and corresponding first confidence coefficient are obtained.

The preset type in this embodiment is fruit or vegetables. Introducing a quality judging function into the self-service cashier device allows a prompt to be generated automatically when a quality problem is detected in fruit or vegetables, which improves user satisfaction with the purchase. For example, when the depth identification model determines that the commodity image contains red Fuji apples, quality detection is then carried out, and a reminder is generated if the apples are found to be damaged. In particular, fruit is typically settled in bags, where the fruit at the bottom is often hidden by the fruit on top; in that case a reminder can still be generated when damage to the fruit on top is identified.

The method for evaluating the commodity quality comprises the following steps:

Carrying out gridding segmentation on the commodity image to obtain a plurality of grid areas, obtaining an original average pixel value of each grid area, setting a plurality of interval ranges, each of which corresponds to one mapping pixel, and mapping the original average pixel value to the corresponding mapping pixel based on the interval range in which it falls.

For ease of description, the commodity image is segmented into 6*6 grid areas, each containing 4*4 pixels. The finer the grid division, the more accurate the final judgment but the slower the judgment; the coarser the grid division, the less accurate the final judgment but the faster the judgment. The original average pixel value is obtained by averaging the pixel values of the same channel over the grid area. 10 interval ranges are set for each of the three RGB channels; for example, 0-25 is one interval range for each of the R, G and B channels, and its corresponding mapping pixel is 13. When the original average pixel value of a grid area in the R channel is 19, it falls in the interval range 0-25 and is mapped to 13. The other channels are processed in the same way. This mapping reduces computational complexity and increases calculation speed.
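A minimal Python sketch of the grid averaging and interval mapping for one colour channel is given below. The text only states the first interval (0-25, mapping pixel 13) and that there are 10 intervals, so the remaining boundaries are an assumption: each interval is taken to be 25 wide with mapping pixels 13, 38, 63 and so on, and the last interval is stretched to cover values up to 255. The function names are hypothetical.

```python
import math

GRID = 6            # 6*6 grid areas, as in this embodiment
INTERVAL = 25       # assumed interval width
NUM_INTERVALS = 10  # ten interval ranges per channel; the last is stretched to 255


def map_pixel(average):
    """Map an original average pixel value (0-255) to its mapping pixel."""
    index = min(max(math.ceil(average) - 1, 0) // INTERVAL, NUM_INTERVALS - 1)
    return index * INTERVAL + 13


def grid_mapping(channel):
    """channel: 2D list of one colour channel, at least GRID pixels per side.

    Returns a GRID x GRID list of mapping pixels. Pixels left over when the
    image size is not a multiple of GRID are ignored in this sketch.
    """
    height, width = len(channel), len(channel[0])
    cell_h, cell_w = height // GRID, width // GRID
    mapped = []
    for gy in range(GRID):
        row = []
        for gx in range(GRID):
            cell = [channel[y][x]
                    for y in range(gy * cell_h, (gy + 1) * cell_h)
                    for x in range(gx * cell_w, (gx + 1) * cell_w)]
            row.append(map_pixel(sum(cell) / len(cell)))
        mapped.append(row)
    return mapped


# A 24*24 R channel with constant value 19 maps every grid area to 13.
r_channel = [[19] * 24 for _ in range(24)]
print(grid_mapping(r_channel)[0])
```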

Dividing a grid area into a plurality of types of subareas, acquiring statistical characteristics of the subareas, wherein the statistical characteristics comprise mapping pixels and occupation ratios existing in the subareas, establishing a standard characteristic library, wherein the standard characteristic library comprises standard characteristics of each subarea of each commodity type under a damaged state, comparing the statistical characteristics of the subareas in a commodity image with the standard characteristics, and determining commodity quality based on comparison results.

As shown in FIG. 3, taking the centre of the commodity image as the centre, the grid areas forming each surrounding ring are grouped into one area, so that the commodity image is divided into 3 layers from outside to inside: the outermost layer is defined as sub-area A, the middle layer as sub-area B, and the innermost layer as sub-area C. In other embodiments, the commodity image may instead be divided equally into 3*3 rectangular areas. The statistical features of each sub-area are then calculated. For example, if sub-area B contains three mapping pixel values 13, 38 and 63 in the R channel in the ratio 1:1:2, the occupation ratio of mapping pixel 38 is 25%; the mapping pixels and occupation ratios of the G channel and the B channel are acquired in the same way.

The standard feature library includes the standard features of each commodity in a damaged state. For example, when the surface of a red Fuji apple is rotten, sub-area B should contain mapping pixel 38 in the R channel with an occupation ratio of 20% or more, mapping pixel 88 in the G channel with an occupation ratio of 20% or more, and mapping pixel 138 in the B channel with an occupation ratio of 20% or more. When the mapping pixels and occupation ratios in a sub-area meet these conditions, the sub-area is considered to meet the standard features. In the R channel of sub-area B described above, mapping pixel 38 occupies 25%, which is greater than the ratio in the standard feature, so the standard feature of the R channel is met. When the statistical features of all 3 channels meet the standard features, commodity damage is considered to have occurred in that sub-area. When damage features appear in at least one sub-area of the commodity image, the commodity quality is determined to be problematic. In other embodiments, to increase calculation speed, the commodity image may first be converted to a grayscale image and then processed.
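The sub-area statistics and the comparison with the standard features can be sketched in Python as follows. The sketch assumes a fixed 6*6 grid of mapping pixels per channel and uses the red Fuji apple figures from the example as the only library entry; the dictionary layout and function names are hypothetical.

```python
from collections import Counter

SUBAREAS = "ABC"  # A: outermost ring, B: middle ring, C: innermost block of a 6*6 grid

# Hypothetical standard feature library entry built from the example figures:
# commodity type -> sub-area -> channel -> (mapping pixel, minimum occupation ratio)
STANDARD_FEATURES = {
    "red_fuji_apple": {"B": {"R": (38, 0.20), "G": (88, 0.20), "B": (138, 0.20)}},
}


def subarea_of(row, col, size=6):
    """Sub-area of a grid cell, determined by its ring counted from the outer edge."""
    return SUBAREAS[min(row, col, size - 1 - row, size - 1 - col)]


def occupation_ratios(mapped):
    """mapped: 6*6 grid of mapping pixels for one channel -> {sub-area: {pixel: ratio}}."""
    counts = {name: Counter() for name in SUBAREAS}
    for r, line in enumerate(mapped):
        for c, pixel in enumerate(line):
            counts[subarea_of(r, c, len(mapped))][pixel] += 1
    return {name: {p: n / sum(ctr.values()) for p, n in ctr.items()}
            for name, ctr in counts.items()}


def is_damaged(commodity_type, mapped_by_channel):
    """mapped_by_channel: {"R": grid, "G": grid, "B": grid} of mapping pixels."""
    ratios = {ch: occupation_ratios(grid) for ch, grid in mapped_by_channel.items()}
    for subarea, channels in STANDARD_FEATURES.get(commodity_type, {}).items():
        # A sub-area is damaged only when all three channels meet the standard features.
        if all(ratios[ch][subarea].get(pixel, 0.0) >= minimum
               for ch, (pixel, minimum) in channels.items()):
            return True  # damage features in at least one sub-area
    return False
```

Sub-area B of a 6*6 grid has 12 grid areas, so the 1:1:2 ratio in the example corresponds to 3, 3 and 6 grid areas with mapping pixels 13, 38 and 63.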

Because recognition with a deep neural network requires a large training set, the method provided by the invention can be deployed quickly in cases where sufficient training data cannot be obtained and transfer learning does not give a good recognition result.

In this embodiment, verifying merchandise information based on pick-and-place information of a vending rack includes the following steps:

Acquiring a shelf image of the vending shelf; when a commodity is moved out of the shelf in the shelf image, determining the commodity information of the moved-out commodity based on shelf data; performing target tracking on the user who moved the commodity out to acquire user moving images; intercepting a face image and a body image from the user moving images for pre-storage, and marking them with the commodity information of the vending shelf as an image label; when a verification requirement arises, searching based on the commodity information of the unidentified commodity and the image labels; if a face image and a body image labelled with the commodity information of the unidentified commodity are found, comparing them with the current user image; and if the comparison passes, determining that the currently identified commodity information is correct.

First, a monitoring camera is arranged in each shelf area. The monitoring camera captures images of the shelf area, and the images are recognized based on a deep recognition model to determine whether pick-and-place behavior occurs at the shelves in that area. The shelf data includes the commodity information sold by each shelf in the monitoring area of each camera. When a commodity is detected being moved out of a vending shelf, the position of the shelf is first determined from the monitoring image, and the corresponding shelf data is then located from that position to obtain the commodity information of the vending shelf. If the commodity sold on the vending shelf is commodity A, an image of the person who removed it is captured, and the person is tracked based on a target tracking algorithm until a facial image and a body image of the person are acquired, where the features of the body image to be recorded include the color of the clothing. Finally, the facial image and the body image are pre-stored and labeled with the commodity information, commodity A.

When the self-service cash register determines that the commodity information of the unidentified commodity is commodity A, but the first confidence coefficient or the second confidence coefficient is smaller than the first threshold, the pre-stored image data is searched for facial data and body data whose image label is commodity A. If none is found, verification of the commodity information fails. If a facial image and a body image labeled commodity A are found, the facial image and body image of the user currently settling are acquired and compared with the retrieved facial image and body image; if the comparison shows that both the facial images and the body images belong to the same person, the commodity information passes verification.

In this embodiment, if it is detected that the user moves the merchandise back to the shelf in the shelf image or the user leaves the mall area, the face image and the body image of the user are deleted.

When the user moves the merchandise back to the shelf, this indicates that the user has abandoned the purchase, so the stored images are deleted to avoid subsequent verification errors. After the user leaves the mall, the corresponding facial image and body image are removed, thereby protecting user privacy.
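The pre-storing, label search, comparison and deletion described above can be sketched as a small in-memory store. This is an illustrative sketch only: `PickRecordStore`, `user_id` and the `same_person` matcher are hypothetical names, images are treated as opaque values, and deleting just the record tagged with the returned commodity is an assumed reading of the deletion rule.

```python
from dataclasses import dataclass

@dataclass
class PickRecord:
    user_id: str        # hypothetical id assigned by the target tracker
    face_image: object  # pre-stored facial image (opaque here)
    body_image: object  # pre-stored body image, e.g. including clothing color
    label: str          # commodity information used as the image label

class PickRecordStore:
    def __init__(self, same_person):
        # same_person(stored, current) -> bool is a hypothetical image matcher.
        self._same_person = same_person
        self._records = []

    def record_pick(self, user_id, face_image, body_image, commodity):
        """Pre-store images when a commodity is moved out of a shelf."""
        self._records.append(PickRecord(user_id, face_image, body_image, commodity))

    def record_return(self, user_id, commodity):
        """The user put the commodity back: drop the matching record."""
        self._records = [r for r in self._records
                         if not (r.user_id == user_id and r.label == commodity)]

    def record_exit(self, user_id):
        """The user left the mall area: drop everything stored for that user."""
        self._records = [r for r in self._records if r.user_id != user_id]

    def verify(self, commodity, current_face, current_body):
        """True if a record labeled `commodity` matches the current user
        on both the facial image and the body image."""
        return any(
            r.label == commodity
            and self._same_person(r.face_image, current_face)
            and self._same_person(r.body_image, current_body)
            for r in self._records
        )

# Usage with a trivial equality matcher standing in for real image comparison.
store = PickRecordStore(same_person=lambda stored, current: stored == current)
store.record_pick("u1", "face-u1", "body-u1", "commodity A")
verified = store.verify("commodity A", "face-u1", "body-u1")
```

Verification fails both when no record carries the label and when a labeled record exists but belongs to a different person, which matches the two failure cases in the text.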

In this embodiment, the joint reasoning based on all commodity images and weight information includes the following steps:

A weight database is established, which includes the standard weight and floating range of each commodity, and an error distribution of the weight sensor is set. The first weight of each identified commodity is obtained from the weight database, the second weight of the unidentified commodities is calculated from the total weight of the commodities in the predetermined area and the first weight, and the second weight is corrected based on the floating range and the error distribution to obtain its weight distribution range.

The weight database includes the standard weight of each commodity, for example 200 g for chips A, and the floating range of each commodity, for example 5 g for chips A. For the weight sensor, the error distribution is set to 5 g. The reasoning process is described with the following example. Assume there are two identified commodities and two unidentified commodities, and both identified commodities are chips A, so the first weight is 200 g each. The total weight currently in the predetermined area is 1200 g, so the second weight is 1200 - 2 × 200 = 800 g. Because two chips A are present and the sensor error is added, the corrected weight distribution range of the second weight is [800 - 3 × 5 g, 800 + 3 × 5 g], namely [785 g, 815 g].
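The arithmetic of this example can be reproduced with a short Python sketch. The function name and argument layout are assumptions; the tolerance is taken as the sum of the floating ranges of the identified commodities plus the sensor error, which is how the 3 × 5 g term in the example reads.

```python
def weight_distribution_range(total_weight, identified, sensor_error):
    """Return (second_weight, (low, high)).

    identified: list of (standard_weight, floating_range) pairs, one per
    identified commodity, as looked up in the weight database.
    """
    first_weight_total = sum(weight for weight, _ in identified)
    second_weight = total_weight - first_weight_total
    # Each identified item may deviate by its floating range; the sensor adds its own error.
    tolerance = sum(floating for _, floating in identified) + sensor_error
    return second_weight, (second_weight - tolerance, second_weight + tolerance)

# Two chips A (200 g, floating range 5 g), 1200 g on the scale, sensor error 5 g.
second_weight, weight_range = weight_distribution_range(1200, [(200, 5), (200, 5)], 5)
# second_weight is 800 and weight_range is (785, 815)
```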

Candidate information is acquired, the candidate information being the commodity information whose first confidence coefficient is greater than a second threshold in the recognition result of a single commodity image. The standard weight and floating range of the candidate information are obtained from the weight database, candidate combinations that satisfy the weight distribution range are generated based on the standard weights and floating ranges, and a combination score is calculated for each candidate combination based on the weight distribution range of the candidate combination.

The images of the unidentified commodities are recognized. Suppose the first confidence coefficient of commodity A is 0.6, that of commodity B is 0.55 and that of commodity C is 0.4, and the second threshold is set to 0.5; commodity A and commodity B are then taken as the two pieces of candidate information. In particular, if at least one of commodity A and commodity B is fresh fruit, the reasoning is abandoned and the user is reminded to settle that item separately, because its weight cannot be accurately obtained during settlement due to the interference of other commodities. The standard weight of commodity A is 400 g with a floating range of ±20 g, and the standard weight of commodity B is 395 g with a floating range of ±25 g. Candidate combinations are generated accordingly: the weight of candidate combination A lies within [760 g, 840 g] and the weight of candidate combination B lies within [765 g, 815 g]; other candidate combinations are not listed one by one.
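Candidate selection and combination generation can be sketched as follows. This is a sketch under stated assumptions: the function names are hypothetical, a combination is taken to "satisfy" the weight distribution range when its own weight interval overlaps that range, and the fresh-fruit exclusion is omitted. With these assumptions, a pair of commodity A spans [760 g, 840 g], as in the text.

```python
from itertools import combinations_with_replacement

def select_candidates(confidences, second_threshold):
    """Keep commodity information whose first confidence exceeds the second threshold."""
    return [name for name, conf in confidences.items() if conf > second_threshold]

def combination_range(combo, weights):
    """Weight interval of a combination: sum of (standard - float, standard + float)."""
    low = sum(weights[name][0] - weights[name][1] for name in combo)
    high = sum(weights[name][0] + weights[name][1] for name in combo)
    return low, high

def candidate_combinations(candidates, weights, n_items, target_range):
    """Enumerate combinations of n_items candidates whose interval overlaps target_range."""
    target_low, target_high = target_range
    kept = []
    for combo in combinations_with_replacement(candidates, n_items):
        low, high = combination_range(combo, weights)
        if low <= target_high and high >= target_low:  # intervals overlap (assumed test)
            kept.append(combo)
    return kept

weights = {"A": (400, 20), "B": (395, 25)}  # (standard weight g, floating range g)
candidates = select_candidates({"A": 0.6, "B": 0.55, "C": 0.4}, 0.5)
combos = candidate_combinations(candidates, weights, 2, (785, 815))
```

Enumerating with replacement allows the same commodity to appear twice, which is needed for a combination such as two commodities A.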

This embodiment first assumes that the weights of commodity A and commodity B obey normal distributions; for example, the weight of commodity A obeys a normal distribution with mean μ = 400 and standard deviation σ = 20. The probability density value of commodity A at its standard weight of 400 g and that of commodity B at its standard weight of 395 g are each calculated from the probability density function of the normal distribution, which is common knowledge and is not described here. The calculation gives P(400) = 0.0199 for commodity A at the standard weight of 400 g and P(395) = 0.0158 for commodity B at the standard weight of 395 g.
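For reference, the density values can be checked with the standard normal probability density function. The sketch below assumes that the standard deviation of commodity B equals its floating range of 25 g, by analogy with commodity A; the text does not state this. Under that assumption the density for commodity B comes out at about 0.0160, slightly different from the 0.0158 quoted above.

```python
import math

def normal_pdf(x, mean, std):
    """Probability density of a normal distribution N(mean, std^2) at x."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))

p_a = normal_pdf(400, mean=400, std=20)  # about 0.0199
p_b = normal_pdf(395, mean=395, std=25)  # about 0.0160 (std = 25 g is an assumption)
```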

The combination score of each candidate combination is calculated based on a first formula, whose terms are: the combination score R; the number of pieces of candidate information included in the candidate combination; the probability density value of the i-th piece of candidate information in the candidate combination at its standard weight; the natural constant e; a penalty factor, whose value is 0.1; the total weight of the candidate combination, obtained by summing the standard weights of the candidate information included in the candidate combination (for example, the total weight of candidate combination A is 400 + 400 = 800 g); and the second weight. Substituting the above values into the first formula gives a combination score of 0.000396 for candidate combination A. Similarly, a combination score of 0.000226 can be calculated for candidate combination B. In the first formula, the greater the probability density values, or the closer the total weight of the candidate combination is to the second weight, the greater the calculated combination score.
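The first formula itself is not reproduced in this text. One form that is consistent with the listed terms, with the stated monotonic behavior and with the value 0.000396 for candidate combination A is the product of the probability density values multiplied by an exponential penalty on the weight gap. The sketch below implements that assumed form; it is an illustration, not the confirmed formula, and it is not claimed to reproduce the value quoted for candidate combination B.

```python
import math

def combination_score(densities, total_weight, second_weight, penalty_factor=0.1):
    """Assumed form: R = (product of P_i) * e^(-penalty_factor * |W - W2|).

    densities:     probability density value of each candidate at its standard weight
    total_weight:  sum of the standard weights in the candidate combination (W)
    second_weight: measured weight attributed to the unidentified commodities (W2)
    """
    product = math.prod(densities)
    return product * math.exp(-penalty_factor * abs(total_weight - second_weight))

# Candidate combination A: two commodities A, total weight 800 g, second weight 800 g.
score_a = combination_score([0.0199, 0.0199], total_weight=800, second_weight=800)
```

Because the exponent is zero when the total weight equals the second weight, the score for candidate combination A reduces to the product of its two density values.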

A candidate score is calculated for each candidate combination based on the first confidence coefficient and the combination score, and the candidate combination with the maximum candidate score is selected as the target combination; the candidate information it includes is taken as the inference result for the unidentified commodities, and its candidate score is taken as the second confidence coefficient.

The calculation of the candidate scores of the candidate combinations according to the present embodiment includes the steps of:

The purchase probability of each piece of candidate information is obtained based on historical purchase records. The first confidence coefficient, the combination score and the purchase probability are weighted and summed based on preset weights to obtain the joint score of the candidate combination, and the joint score is normalized to obtain a normalized score. A penalty value is generated based on the number of candidate combinations currently existing, and all normalized scores are corrected based on the penalty value to obtain the candidate scores.

In this embodiment, the user is required to log in to an account before settlement. This sacrifices some convenience, but the user's purchase records can be stored in the account, so that the user can check them at any time and handle subsequent after-sales matters. After the user logs in, the purchase probability of each piece of candidate information is determined from the user's purchase records; for example, the ratio of the number of historical purchases of commodity A to the total number of historical purchases is taken as the purchase probability of commodity A.
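The ratio described above is simple to compute from a purchase history. The sketch below assumes the history is available as a flat list of purchased commodity names, which is a hypothetical representation of the account records.

```python
from collections import Counter

def purchase_probabilities(history):
    """Map each commodity to (times purchased) / (total purchases) for one account."""
    total = len(history)
    if total == 0:
        return {}  # no history: no purchase probability can be derived
    return {item: count / total for item, count in Counter(history).items()}

# Hypothetical history: commodity A bought 4 times out of 25 purchases.
probabilities = purchase_probabilities(["A"] * 4 + ["other"] * 21)
```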

The purchase probability, the first confidence coefficient and the combination score are weighted and summed according to preset weights built into the system to obtain the joint score; the preset weights of the purchase probability, the first confidence coefficient and the combination score are 0.1, 0.6 and 0.3 respectively. The weighted summation is based on a second formula, in which S is the joint score of the candidate combination, R is the combination score of the candidate combination, and the remaining terms are the average purchase probability of the candidate information (commodities) included in the candidate combination, the average first confidence coefficient of the commodities included in the candidate combination, and the preset weights of the purchase probability, the first confidence coefficient and the combination score. For candidate combination A, the average purchase probability of the candidate information is 0.16, the average first confidence coefficient is 0.85 and the combination score is 0.000396; substituting these into the second formula gives a joint score of -2.02 for candidate combination A. Similarly, the joint score of candidate combination B is calculated to be -2.39.
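The second formula is likewise not reproduced in this text. A plain linear weighted sum of 0.16, 0.85 and 0.000396 with weights 0.1, 0.6 and 0.3 gives about 0.526, not -2.02. One form that does reproduce -2.02 applies the natural logarithm to the purchase probability and to the combination score before weighting. The sketch below implements that assumed form for illustration only; it is inferred from the single worked value and is not confirmed by the source.

```python
import math

def joint_score(purchase_prob_avg, confidence_avg, combination_score,
                weights=(0.1, 0.6, 0.3)):
    """Assumed form: S = w1 * ln(P_avg) + w2 * C_avg + w3 * ln(R).

    weights: preset weights of the purchase probability, the first
    confidence coefficient and the combination score, in that order.
    """
    w_purchase, w_confidence, w_combination = weights
    return (w_purchase * math.log(purchase_prob_avg)
            + w_confidence * confidence_avg
            + w_combination * math.log(combination_score))

# Candidate combination A from the example.
s_a = joint_score(0.16, 0.85, 0.000396)  # rounds to -2.02 under the assumed form
```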

In this embodiment, normalization is performed using the Softmax function; the normalized score of candidate combination A is 0.592 and that of candidate combination B is 0.408. The more candidate combinations are generated, the higher the penalty, because too many combinations meeting the weight distribution range greatly reduces the accuracy of the calculation. For example, with a penalty value of 0.95, the corrected candidate score of candidate combination A is 0.562 and that of candidate combination B is 0.3876.
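The normalization and correction steps can be sketched as below. The sketch assumes that the correction multiplies each normalized score by the penalty value, which agrees with the numbers in the example (0.592 × 0.95 ≈ 0.562); how the penalty value is derived from the number of candidate combinations is not specified, so it is passed in as a parameter. Starting from the rounded joint scores -2.02 and -2.39, the results agree with the quoted values to within about 0.001.

```python
import math

def softmax(scores):
    """Numerically stable Softmax over a list of joint scores."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

def candidate_scores(joint_scores, penalty_value):
    """Normalize the joint scores, then scale every normalized score by the penalty value."""
    return [share * penalty_value for share in softmax(joint_scores)]

normalized = softmax([-2.02, -2.39])                 # roughly 0.59 and 0.41
corrected = candidate_scores([-2.02, -2.39], 0.95)   # roughly 0.56 and 0.39
target_index = corrected.index(max(corrected))       # 0, i.e. candidate combination A
```

Because every normalized score is scaled by the same penalty value, the ranking of the combinations is unchanged; only the resulting second confidence coefficient is lowered.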

Through the above calculation, the candidate score of candidate combination A is determined to be the largest, so candidate combination A is the target combination; the two commodities A it includes are the inference result for the unidentified commodities, and the candidate score of candidate combination A is the second confidence coefficient.

As shown in fig. 4, the present invention further provides a self-service cashing system based on multi-mode interaction, where the system is configured to implement the self-service cashing method based on multi-mode interaction, and the system includes:

The identification module is used for capturing an initial image of a predetermined area after receiving a start instruction, splitting the initial image into a plurality of commodity images, recognizing each commodity image based on the deep recognition model to acquire the commodity information and first confidence coefficient it contains, and generating a first commodity list based on the commodity information whose first confidence coefficient is greater than a first threshold.

The first verification module counts the first quantity and the second quantity of the identified commodities and the unidentified commodities in the initial image, obtains weight information of a preset area if the first quantity and the second quantity meet preset conditions, and performs joint reasoning based on all the commodity images and the weight information to obtain commodity information and second confidence of the unidentified commodities.

The second verification module is used for, if the second confidence coefficient is smaller than the first threshold, locating the corresponding vending shelf based on the commodity information of the unidentified commodity and verifying the commodity information based on the pick-and-place information of the vending shelf; if the verification succeeds, adding the commodity information of the unidentified commodity to the first commodity list to obtain a second commodity list; and if the verification fails, generating a manual entry prompt.
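The hand-off between the modules can be summarized as a small decision function. This is a sketch under stated assumptions: the function name, the `verify` callback and the threshold value 0.8 are hypothetical, and accepting the inference directly when the second confidence coefficient reaches the first threshold is an assumed reading, since the text only spells out the case in which it is smaller.

```python
def build_second_list(first_list, inferred, second_confidence, first_threshold, verify):
    """Return (commodity_list, manual_entry_needed).

    inferred: commodity information inferred for the unidentified commodities.
    verify:   callable(commodity) -> bool standing in for the pick-and-place check.
    """
    if second_confidence >= first_threshold:
        # Assumption: a sufficiently confident inference is accepted directly.
        return first_list + inferred, False
    if all(verify(item) for item in inferred):
        # Shelf pick-and-place verification succeeded for every inferred commodity.
        return first_list + inferred, False
    # Verification failed: keep the first list and prompt for manual entry.
    return list(first_list), True

second_list, manual = build_second_list(
    ["chips A", "chips A"], ["commodity A", "commodity A"],
    second_confidence=0.562, first_threshold=0.8,  # 0.8 is a hypothetical threshold
    verify=lambda commodity: True,
)
```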

It will be understood that the terms "first," "second," and the like, as used herein, may be used to describe various elements, but these elements are not limited by these terms unless otherwise specified. These terms are only used to distinguish one element from another element. For example, a first xx script may be referred to as a second xx script, and similarly, a second xx script may be referred to as a first xx script, without departing from the scope of this disclosure.

The technical features of the foregoing embodiments may be arbitrarily combined, and for brevity, all of the possible combinations of the technical features of the foregoing embodiments are not described, however, they should be considered as the scope of the disclosure as long as there is no contradiction between the combinations of the technical features.

The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.

The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims

1. A self-service checkout method based on multimodal interaction, characterized by comprising:

After receiving the start instruction, an initial image of a predetermined area is captured, the initial image is split into a plurality of product images, and each of the product images is recognized based on a deep recognition model to obtain product information and a first confidence level included therein;

generating a first commodity list based on the commodity information whose first confidence level is greater than a first threshold, and counting a first number and a second number of recognized commodities and unrecognized commodities in the initial image;

If the first quantity and the second quantity meet a preset condition, obtaining weight information of the predetermined area, performing joint reasoning based on all the commodity images and the weight information, and obtaining the commodity information and a second confidence level of the unrecognized commodity;

If the second confidence level is less than the first threshold, locating a corresponding sales shelf based on the product information of the unidentified product, and verifying the product information based on the pick-and-place information of the sales shelf;

If the verification is successful, the product information of the unidentified product is added to the first product list to obtain the second product list; if the verification fails, a manual entry reminder is generated;

After receiving the completion information, generating amount information based on the second commodity list and displaying the payment interface until receiving the completion information or the payment cancellation information;

Performing joint reasoning based on all the product images and the weight information includes the following steps:

Establishing a weight database, the weight database including a standard weight and a floating range of each commodity, setting an error distribution of a weight sensor, obtaining a first weight of an identified commodity based on the weight database, calculating a second weight of an unidentified commodity based on a total weight of commodities in the predetermined area and the first weight, and correcting the second weight based on the floating range and the error distribution to obtain a weight distribution range thereof;

Acquire candidate information, where the candidate information is the commodity information whose first confidence is greater than a second threshold in a single commodity image recognition result, acquire the standard weight and the floating range of the candidate information based on the weight database, generate a candidate combination that satisfies the weight distribution range based on the standard weight and the floating range, and calculate a combination score for each candidate combination based on the weight distribution range of the candidate combination;

The candidate score of each candidate combination is calculated based on the first confidence and the combination score, and the candidate combination corresponding to the maximum candidate score is screened as the target combination, wherein the candidate information included therein is used as the inference result of the unidentified commodity, and the corresponding candidate score is used as the second confidence.

2. The method according to claim 1, characterized in that identifying the commodity image comprises the following steps:

If a barcode area is detected in the product image, the product information is generated based on the barcode, and the corresponding first confidence is set to 100%; if the barcode area is not identified, the product image is identified based on the deep recognition model to obtain the product information and the first confidence; the product type is determined based on the product information; if the product type belongs to a preset type and the first confidence is greater than the first threshold, the product quality continues to be identified; if damaged products are identified, a product quality reminder is generated.

3. The method according to claim 2, characterized in that identifying the quality of the product comprises the following steps:

The product image is divided into grids to obtain a plurality of grid areas, an original average pixel value of each of the grid areas is obtained, a plurality of interval ranges are set, each of the interval ranges corresponds to a mapping pixel, and based on the interval range where the original average pixel value is located, it is mapped to the corresponding mapping pixel;

The grid area is divided into multiple types of sub-areas, and statistical features of the sub-areas are obtained, wherein the statistical features include the mapped pixels existing in the sub-areas and their occupation ratios; a standard feature library is established, wherein the standard feature library includes standard features of each sub-area of each product type in a damaged state; the statistical features of the sub-areas in the product image are compared with the standard features, and the quality of the product is determined based on the comparison results.

4. The method according to claim 1, characterized in that verifying the commodity information based on the pick-and-place information of the selling shelf comprises the following steps:

Acquire a shelf image of the selling shelf, and when a product is removed from the shelf in the shelf image, determine the product information of the removed product based on the shelf data, track the user who removed the product to obtain a user movement image, capture a facial image and a body image in the user movement image for pre-storage, and mark the product information of the selling shelf in the facial image and the body image as an image tag; when a verification requirement arises, perform a search based on the product information of the unrecognized product and the image tag; if the facial image and the body image containing the product information of the unrecognized product are retrieved, compare them with the current user image; if the comparison is successful, determine that the currently recognized product information is correct.

5. The method according to claim 1, wherein calculating the candidate score of the candidate combination comprises the following steps:

The purchase probability of each type of candidate information is obtained based on historical purchase records, and the first confidence, the combination score and the purchase probability are weighted and summed based on preset weights to obtain a joint score of the candidate combination, and the joint score is normalized to obtain a normalized score, a penalty value is generated based on the number of candidate combinations currently existing, and all the normalized scores are corrected based on the penalty value to obtain the candidate score.

6. The method according to claim 5, wherein the preset condition includes that both the first number and the second number are smaller than a third threshold.

7. The method according to claim 4, characterized in that if it is detected in the shelf image that the user moves the product back to the shelf, or the user leaves the shopping mall area, the facial image and the body image of the user are deleted.

8. A self-service cashier system based on multimodal interaction, used to implement the method according to any one of claims 1 to 7, characterized in that it includes:

The recognition module, after receiving the start instruction, captures an initial image of a predetermined area, splits the initial image into a plurality of product images, recognizes each of the product images based on a deep recognition model to obtain product information and a first confidence level included therein, and generates a first product list based on the product information whose first confidence level is greater than a first threshold.

a first verification module, which counts the first number and the second number of identified commodities and unidentified commodities in the initial image, obtains the weight information of the predetermined area if the first number and the second number meet the preset conditions, performs joint reasoning based on all the commodity images and the weight information, and obtains the commodity information and the second confidence of the unidentified commodities, wherein when performing joint reasoning based on all the commodity images and the weight information, a weight database is established, the weight database includes the standard weight and floating range of each commodity, the error distribution of the weight sensor is set, the first weight of the identified commodity is obtained based on the weight database, the second weight of the unidentified commodity is calculated based on the total weight of the commodities in the predetermined area and the first weight, and the second weight is corrected based on the floating range and the error distribution to obtain its weight distribution range;

Calculating a candidate score for each candidate combination based on the first confidence and the combination score, screening the candidate combination corresponding to the maximum candidate score as the target combination, wherein the candidate information included therein is used as the inference result of the unidentified commodity, and the corresponding candidate score is used as the second confidence;

a second verification module, if the second confidence level is less than the first threshold, locating a corresponding sales shelf based on the product information of the unidentified product, verifying the product information based on the pick-and-place information of the sales shelf, and if the verification is successful, adding the product information of the unidentified product to the first product list to obtain a second product list, and generating a manual entry reminder if the verification fails;

The settlement module, after receiving the completion information, generates amount information based on the second commodity list and displays a payment collection interface until receiving the completion information or the payment cancellation information.

9. A computer storage medium, characterized in that the computer storage medium stores program instructions, wherein when the program instructions are executed, the device where the computer storage medium is located is controlled to execute the method according to any one of claims 1 to 7.