
WO2019201141A1 - Methods and systems for image processing - Google Patents

Methods and systems for image processing

Info

Publication number
WO2019201141A1
WO2019201141A1 (PCT/CN2019/082188)
Authority
WO
WIPO (PCT)
Prior art keywords
image
line segments
representation
peripheral lines
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/082188
Other languages
French (fr)
Inventor
Xiaogang Du
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN201810349840.7A external-priority patent/CN108805124B/en
Priority claimed from CN201810374653.4A external-priority patent/CN108805800A/en
Application filed by Beijing Didi Infinity Technology and Development Co Ltd filed Critical Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN201980000845.1A
Publication of WO2019201141A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/242 Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/80 Geometric correction
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30176 Document

Definitions

  • the present disclosure generally relates to a method and system for image processing, and particularly relates to a method and system for processing an image captured by a mobile terminal.
  • for certain purposes, such as identity/qualification confirmation or optical character recognition (OCR) of an image (e.g., a photo), the quality of the image has to be above a certain standard.
  • conventionally, such an image is generated via specific imaging equipment (e.g., a scanner), and the OCR is performed by OCR software packaged with the imaging equipment.
  • Mobile terminals are becoming an indispensable part of people’s daily life; if images taken by mobile terminals could serve these purposes, the efficiency can be greatly improved and the cost can be reduced.
  • however, an image taken by a mobile terminal usually suffers from inferior quality.
  • the image taken by a mobile terminal is often tilted or distorted. Consequently, images taken by a mobile terminal may fail to satisfy the standard for, e.g., identity/qualification confirmation and OCR.
  • a method for processing an image may include obtaining a first image including an object in a first representation.
  • the method may also include determining at least one first position of the object in the first image.
  • the method may further include generating a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object.
  • the second image may include the object in the second representation.
  • the second representation may be related to at least one of a reference size of the object, a reference image occupation ratio of the object, or a reference direction with respect to the object.
  • the first representation may be related to at least one of an original size, an original image occupation ratio, or an original direction of the object.
  • the determining the at least one first position may include: detecting a plurality of peripheral lines of the object in the first image, and determining the at least one first position based on the plurality of peripheral lines.
  • the adjusting the first image based on the at least one first position and the second representation with respect to the object may include: determining, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position; obtaining a correction matrix based on the at least one first position and the at least one second position; and applying the correction matrix on the first image.
  • the detecting the plurality of peripheral lines of the object may include: detecting a plurality of line segments associated with the object in the first image by applying a line segment detector to the first image; filtering the plurality of line segments to obtain a plurality of filtered line segments, wherein the filtering may be based at least in part on directions of the plurality of line segments; and determining the plurality of peripheral lines based on the plurality of filtered line segments.
  • the filtering may be further based on confidence scores of the plurality of line segments.
  • the method may further include: identifying, from the plurality of line segments, line segments along a same straight line; and updating the plurality of line segments by combining the line segments identified as being along a same straight line.
  • the plurality of filtered line segments may include a plurality of line segment sets corresponding to the plurality of peripheral lines.
  • the first image may include a plurality of predetermined parts corresponding to the plurality of peripheral lines.
  • the filtering the plurality of line segments to obtain the plurality of filtered line segments may include, for each of the plurality of predetermined parts, selecting, from the plurality of line segments, a set of line segments in the predetermined part as one of the plurality of line segment sets, wherein the direction of each of the set of line segments may be within a preset range associated with the predetermined part.
  • the determining the plurality of peripheral lines based on the plurality of filtered line segments may include, for each of the plurality of line segment sets, identifying a longest line segment of the line segment set as the corresponding peripheral line of the object.
  • the at least one first position may include positions of one or more vertices of the object in the first image, and the determining the at least one first position based on the plurality of peripheral lines may include determining an intersection of each adjacent pair of the plurality of peripheral lines as the one or more vertices.
  • the determining, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position may include: determining a first size of the object in the first image based on the plurality of peripheral lines; determining a reference size of the object based on the first size and the second representation; and determining the at least one second position based on the reference size.
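A minimal sketch of the step just described, assuming a rectangular object whose second representation prescribes a target image occupation ratio; the function and parameter names (e.g., `target_occupation`) are illustrative, not taken from the disclosure:

```python
# Hypothetical sketch: derive the "second positions" (target corner
# coordinates) from the object's measured first size and a reference
# occupation ratio. The 4-corner rectangle assumption is illustrative.
import math

def second_positions(first_corners, out_w, out_h, target_occupation=0.9):
    """first_corners: four (x, y) vertices of the object in the first image,
    ordered top-left, top-right, bottom-right, bottom-left."""
    tl, tr, br, bl = first_corners
    # First size of the object, estimated from opposite peripheral lines.
    width = (math.dist(tl, tr) + math.dist(bl, br)) / 2
    height = (math.dist(tl, bl) + math.dist(tr, br)) / 2
    # Reference size: scale so the object covers target_occupation of the
    # output image area while keeping the object's measured aspect ratio
    # (clamping to the image borders is omitted in this sketch).
    scale = math.sqrt(target_occupation * out_w * out_h / (width * height))
    ref_w, ref_h = width * scale, height * scale
    # Center the reference rectangle in the output image.
    x0, y0 = (out_w - ref_w) / 2, (out_h - ref_h) / 2
    return [(x0, y0), (x0 + ref_w, y0),
            (x0 + ref_w, y0 + ref_h), (x0, y0 + ref_h)]
```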
  • the at least one first position may correspond to at least one part of the object, and the determining the at least one first position may include recognizing the at least one part of the object in the first image using an object recognition technique.
  • the adjusting the first image based on the at least one first position and the second representation with respect to the object may include: determining a rotation mode based on the at least one first position and the second representation; and rotating the first image according to the rotation mode.
  • the determining the rotation mode based on the at least one first position and the second representation may include: determining at least one second position corresponding to the at least one first position based on the second representation with respect to the object; and determining the rotation mode based on a mapping relationship between the at least one first position and the at least one second position.
  • the object recognition technique is based on a convolutional neural network (CNN) model.
  • the object may be a document, and the at least one part of the object may include at least a first part including a title of the document, a second part including an image of an owner, and a third part including a stamp, a signature, a district identifier, another image of the owner, or a bar code.
  • the method may further include cropping the second image or the first image so that the second image or the first image only includes the object.
  • a system for image processing may include at least one storage medium including a set of instructions, and at least one processor in communication with the at least one storage medium.
  • the at least one processor may be directed to obtain a first image including an object in a first representation.
  • the at least one processor may also be directed to determine at least one first position of the object in the first image.
  • the at least one processor may further be directed to generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object.
  • the second image may include the object in the second representation.
  • a system for image processing may include a first image module, a first position module, and an adjustment module.
  • the first image module may be configured to obtain a first image including an object in a first representation.
  • the first position module may be configured to determine at least one first position of the object in the first image.
  • the adjustment module may be configured to generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object.
  • the second image may include the object in the second representation.
  • a non-transitory computer-readable medium may include instructions for image processing. When executed by a processor of an electronic device, the instructions may direct the electronic device to execute an image processing process.
  • the image processing process may include obtaining a first image including an object in a first representation.
  • the image processing process may also include determining at least one first position of the object in the first image.
  • the image processing process may further include generating a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object.
  • the second image may include the object in the second representation.
  • FIG. 1 is a schematic diagram illustrating an exemplary online-to-offline service system according to some embodiments of the present disclosure
  • FIG. 2 is a schematic diagram illustrating an exemplary computing device
  • FIG. 3 is a schematic diagram illustrating an exemplary image processing device according to some embodiments of the present disclosure
  • FIG. 4 is a flowchart illustrating an exemplary process for image processing according to some embodiments of the present disclosure
  • FIG. 5 is a flowchart illustrating an exemplary process for image processing according to some embodiments of the present disclosure
  • FIG. 6 is a flowchart illustrating an exemplary process for image processing according to some embodiments of the present disclosure
  • FIG. 7 is a flowchart illustrating an exemplary process for image processing according to some embodiments of the present disclosure.
  • FIG. 8 is a schematic diagram illustrating a source image including an ID card according to some embodiments of the present disclosure.
  • FIG. 9 is a schematic diagram illustrating line segments detected by operating an LSD algorithm on the source image of FIG. 8 according to some embodiments of the present disclosure.
  • FIG. 10 is a schematic diagram illustrating peripheral lines determined based on line segments of FIG. 9 according to some embodiments of the present disclosure.
  • FIG. 11 is a schematic diagram illustrating a determination of the vertices of the object to be recognized based on intersections of the peripheral lines of FIG. 10 according to some embodiments of the present disclosure
  • FIG. 12 is a schematic diagram illustrating a corrected image obtained based on the vertices of FIG. 11 according to some embodiments of the present disclosure
  • FIG. 13 is a schematic diagram illustrating an exemplary template according to some embodiments of the present disclosure.
  • FIG. 14 is a schematic diagram illustrating an image to be processed in Situation 1 according to some embodiments of the present disclosure.
  • FIG. 15 is a schematic diagram illustrating an image to be processed in Situation 2 according to some embodiments of the present disclosure.
  • FIG. 16 is a schematic diagram illustrating an image to be processed in Situation 3 according to some embodiments of the present disclosure.
  • FIG. 17 is a schematic diagram illustrating an image to be processed in Situation 4 according to some embodiments of the present disclosure.
  • FIG. 18 is a flowchart illustrating an exemplary process for processing an image according to some embodiments of the present disclosure
  • FIG. 19 is a schematic diagram illustrating an enlarged view of the image to be processed in FIG. 14;
  • FIG. 20 is a schematic diagram illustrating an exemplary pattern for obtaining positions of targets according to some embodiments of the present disclosure
  • FIG. 21 is a schematic diagram illustrating positions of targets in Situation 1 of FIG. 14 according to some embodiments of the present disclosure.
  • FIG. 22 is a schematic diagram illustrating positions of targets in Situation 2 of FIG. 15 according to some embodiments of the present disclosure
  • FIG. 23 is a schematic diagram illustrating positions of targets in Situation 3 of FIG. 16 according to some embodiments of the present disclosure.
  • FIG. 24 is a schematic diagram illustrating positions of targets in Situation 4 of FIG. 17 according to some embodiments of the present disclosure.
  • FIG. 25 is a schematic diagram illustrating positions of targets in Situation 1 of FIG. 14 according to some embodiments of the present disclosure.
  • FIG. 26 is a schematic diagram illustrating positions of targets in Situation 2 of FIG. 15 according to some embodiments of the present disclosure
  • FIG. 27 is a schematic diagram illustrating positions of targets in Situation 3 of FIG. 16 according to some embodiments of the present disclosure.
  • FIG. 28 is a schematic diagram illustrating positions of targets in Situation 4 of FIG. 17 according to some embodiments of the present disclosure.
  • FIG. 29 is a schematic diagram illustrating an image to be processed in Situation 5 according to some embodiments of the present disclosure.
  • FIG. 30 is a schematic diagram illustrating a pattern for processing the image to be processed in Situation 5 of FIG. 29 according to some embodiments of the present disclosure.
  • the present disclosure generally relates to a method and system for processing an image so that the content of the image may be displayed in a standard form.
  • the image may include a document such as an ID card, a license, a bank card, a certificate, a passport, a paper document including text, etc.
  • the direction and/or the occupation of the document with respect to the image may be adjusted according to a standard template, a desired format, or a reference image.
  • the flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments of the present disclosure. It is to be expressly understood that the operations of the flowcharts need not be implemented in order; the operations may be implemented in inverted order or simultaneously. Moreover, one or more other operations may be added to the flowcharts, and one or more operations may be removed from the flowcharts.
  • although the systems and methods disclosed in the present disclosure are described primarily regarding online-to-offline transportation services, it should be understood that this is only one exemplary embodiment.
  • the system or method of the present disclosure may be applied to any other kind of online-to-offline service.
  • the system or method of the present disclosure may be applied to different transportation systems including land, ocean, aerospace, or the like, or any combination thereof.
  • the vehicle of the transportation systems may include a taxi, a private car, a hitch, a bus, a vessel, an aircraft, a driverless vehicle, a bicycle, a tricycle, a motorcycle, or the like, or any combination thereof.
  • the transportation system may also include any transportation system that applies management and/or distribution, for example, a system for transmitting and/or receiving an express, or a system for a take-out service.
  • the application scenarios of the system or method of the present disclosure may include a web page, a plug-in of a browser, a client terminal, a custom system, an internal analysis system, an artificial intelligence robot, or the like, or any combination thereof.
  • the terms “passenger,” “requester,” “service requester,” and “customer” in the present disclosure are used interchangeably to refer to an individual, an entity, or a tool that may request or order a service.
  • the terms “driver,” “provider,” “service provider,” and “supplier” in the present disclosure are used interchangeably to refer to an individual, an entity, or a tool that may provide a service or facilitate the providing of the service.
  • the term “user” in the present disclosure may refer to an individual, an entity, or a tool that may request a service, order a service, provide a service, or facilitate the providing of the service.
  • the user may be a passenger, a driver, an operator, or the like, or any combination thereof.
  • the terms “passenger” and “passenger terminal” may be used interchangeably, and the terms “driver” and “driver terminal” may be used interchangeably.
  • the term “service request” in the present disclosure refers to a request initiated by a passenger, a requester, a service requester, a customer, a driver, a provider, a service provider, a supplier, or the like, or any combination thereof.
  • the service request may be accepted by any one of a passenger, a requester, a service requester, a customer, a driver, a provider, a service provider, or a supplier.
  • the service request may be chargeable, or free.
  • the positioning technology used in the present disclosure may include a global positioning system (GPS) , a global navigation satellite system (GLONASS) , a compass navigation system (COMPASS) , a Galileo positioning system, a quasi-zenith satellite system (QZSS) , a wireless fidelity (WiFi) positioning technology, or the like, or any combination thereof.
  • An aspect of the present disclosure relates to systems and methods for displaying information relating to an online-to-offline service (e.g., a taxi service) .
  • an online-to-offline service platform may generate an image showing a type of the vehicle of the driver, a color of the vehicle of the driver, a plate number of the vehicle of the driver, and/or a mark on a surface of the vehicle of the driver, and showing the vehicle of the driver from a perspective of the passenger.
  • the online-to-offline service platform may generate a map showing a real-time position of the vehicle of the driver and real-time positions of other vehicles surrounding the vehicle of the driver.
  • the online-to-offline service platform may determine information corresponding to the process of the online-to-offline service, and send the information corresponding to the process of the online-to-offline service along with a display instruction to the passenger’s smart phone.
  • the display instruction may prompt the passenger’s smart phone to display the information corresponding to the process of the online-to-offline service on a lock screen interface of the passenger’s smart phone when the passenger’s smart phone is locked.
  • online-to-offline services include transportation services such as online taxi service.
  • online taxi service is a new form of service rooted in the post-Internet era; it provides technical solutions to users and service providers that could be realized only in the post-Internet era.
  • in the pre-Internet era, when a user hails a taxi on the street, the taxi request and acceptance occur only between the passenger and one taxi driver who sees the passenger. If the passenger hails a taxi through a telephone call, the service request and acceptance may occur only between the passenger and one service provider (e.g., one taxi company or agent) .
  • the online taxi service allows a user to distribute, in real time and automatically, a service request to a vast number of individual service providers (e.g., taxis) a distance away from the user, and allows a plurality of service providers to respond to the service request simultaneously and in real time. Therefore, through the Internet, the online-to-offline transportation systems may provide a much more efficient transaction platform for users and service providers who might never have met in a traditional pre-Internet transportation service system.
  • the terms “system,” “unit,” “module,” and/or “block” used herein are one way to distinguish different components, elements, parts, sections, or assemblies at different levels in ascending order. However, the terms may be replaced by other expressions that achieve the same purpose.
  • a module, a unit, or a block described herein may be implemented as software and/or hardware and may be stored in any type of non-transitory computer-readable medium or another storage device.
  • a software module/unit/block may be compiled and linked into an executable program. It will be appreciated that software modules can be callable from other modules/units/blocks or from themselves, and/or may be invoked in response to detected events or interrupts.
  • Software modules/units/blocks configured for execution on computing devices may be provided on a computer-readable medium, such as a compact disc, a digital video disc, a flash drive, a magnetic disc, or any other tangible medium, or as a digital download (and can be originally stored in a compressed or installable format that needs installation, decompression, or decryption prior to execution) .
  • Such software code may be stored, partially or fully, on a storage device of the executing computing device, for execution by the computing device.
  • Software instructions may be embedded in a firmware, such as an EPROM.
  • modules/units/blocks may be included in connected logic components, such as gates and flip-flops, and/or can be implemented with programmable units, such as programmable gate arrays or processors.
  • the modules/units/blocks or computing device functionality described herein may be implemented as software modules/units/blocks, but may be represented in hardware or firmware.
  • the modules/units/blocks described herein refer to logical modules/units/blocks that may be combined with other modules/units/blocks or divided into sub-modules/sub-units/sub-blocks despite their physical organization or storage. The description may be applicable to a system, an engine, or a portion thereof.
  • FIG. 1 is a schematic diagram illustrating an exemplary online-to-offline service system 100 according to some embodiments of the present disclosure.
  • the online-to-offline service system 100 may be an online-to-offline transportation service system for transportation services such as taxi hailing, chauffeur services, delivery service, carpool, bus service, take-out service, driver hiring and shuttle services.
  • transportation services such as taxi hailing, chauffeur services, delivery service, carpool, bus service, take-out service, driver hiring and shuttle services.
  • the methods and/or systems described in the present disclosure may take a taxi service as an example. It should be noted that the taxi service is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure.
  • the methods and/or systems described in the present disclosure may be applied to other similar situations, such as a take-out service, a delivery service, etc.
  • the online-to-offline service system 100 may include a server 110, a network 120, a requester terminal 130, a provider terminal 140, a storage device 150, and a positioning system 160.
  • the server 110, the requester terminal 130, or the provider terminal 140 may be configured to implement methods for image processing described in the present disclosure.
  • the server 110, the requester terminal 130, or the provider terminal 140 may be implemented on a computing device 200 having one or more components illustrated in FIG. 2 in the present disclosure.
  • the server 110 may be a single server or a server group.
  • the server group may be centralized, or distributed (e.g., the server 110 may be a distributed system) .
  • the server 110 may be local or remote.
  • the server 110 may access information and/or data stored in the requester terminal 130, the provider terminal 140, and/or the storage device 150 via the network 120.
  • the server 110 may be directly connected to the requester terminal 130, the provider terminal 140, and/or the storage device 150 to access stored information and/or data.
  • the server 110 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
  • the server 110 may be implemented on a computing device 200 having one or more components illustrated in FIG. 2 in the present disclosure.
  • the server 110 may include a processing engine 112.
  • the processing engine 112 may process information and/or data related to the online-to-offline service.
  • the network 120 may facilitate the exchange of information and/or data.
  • one or more components in the online-to-offline service system 100 (e.g., the server 110, the requester terminal 130, the provider terminal 140, the storage device 150, and the positioning system 160) may send information and/or data to other component (s) in the online-to-offline service system 100 via the network 120.
  • the server 110 may obtain/acquire a service request from the requester terminal 130 via the network 120.
  • the network 120 may be any type of wired or wireless network, or a combination thereof.
  • the network 120 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, the Internet, a local area network (LAN) , a wide area network (WAN) , a wireless local area network (WLAN) , a metropolitan area network (MAN) , a public telephone switched network (PSTN) , a Bluetooth TM network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof.
  • the network 120 may include one or more network access points.
  • the network 120 may include wired or wireless network access points such as base stations and/or internet exchange points 120-1, 120-2, ..., through which one or more components of the online-to-offline service system 100 may be connected to the network 120 to exchange data and/or information.
  • a requester may be a user of the requester terminal 130.
  • the user of the requester terminal 130 may be someone other than the requester.
  • a user A of the requester terminal 130 may use the requester terminal 130 to send a service request for a user B, or receive service and/or information or instructions from the server 110.
  • a provider may be a user of the provider terminal 140.
  • the user of the provider terminal 140 may be someone other than the provider.
  • a user C of the provider terminal 140 may use the provider terminal 140 to receive a service request for a user D, and/or information or instructions from the server 110.
  • the requester terminal 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a built-in device in a motor vehicle 130-4, or the like, or any combination thereof.
  • the mobile device 130-1 may include a smart home device, a wearable device, a mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof.
  • the smart home device may include a control device of an intelligent electrical apparatus, a smart monitoring device, a smart television, a smart video camera, an interphone, or the like, or any combination thereof.
  • the wearable device may include a bracelet, footgear, glasses, a helmet, a watch, clothing, a backpack, a smart accessory, or the like, or any combination thereof.
  • the mobile device may include a mobile phone, a personal digital assistance (PDA) , a gaming device, a navigation device, a point of sale (POS) device, a laptop, a desktop, or the like, or any combination thereof.
  • the virtual reality device and/or the augmented reality device may include a virtual reality helmet, a virtual reality glass, a virtual reality patch, an augmented reality helmet, augmented reality glasses, an augmented reality patch, or the like, or any combination thereof.
  • the virtual reality device and/or the augmented reality device may include a Google Glass TM , a RiftCon TM , a Fragments TM , a Gear VR TM , etc.
  • the built-in device in the motor vehicle 130-4 may include an onboard computer, an onboard television, etc.
  • the requester terminal 130 may be a device with positioning technology for locating the position of a user of the requester terminal 130 (e.g., a service requester) and/or the requester terminal 130.
  • the provider terminal 140 may be a device that is similar to, or the same as the requester terminal 130.
  • the provider terminal 140 may also be or include a mobile device 140-1, a tablet computer 140-2, a laptop computer 140-3, a built-in device in a motor vehicle 140-4, which are the same as or similar to the mobile device 130-1, the tablet computer 130-2, the laptop computer 130-3, the built-in device in a motor vehicle 130-4.
  • the provider terminal 140 may be a device utilizing positioning technology for locating the position of a user of the provider terminal 140 (e.g., a service provider) and/or the provider terminal 140.
  • the requester terminal 130 and/or the provider terminal 140 may communicate with one or more other positioning devices to determine the position of the requester, the requester terminal 130, the provider, and/or the provider terminal 140. In some embodiments, the requester terminal 130 and/or the provider terminal 140 may send positioning information to the server 110.
  • the storage device 150 may store data and/or instructions. In some embodiments, the storage device 150 may store data obtained from the requester terminal 130 and/or the provider terminal 140. In some embodiments, the storage device 150 may store data and/or instructions that the server 110 may execute or use to perform exemplary methods described in the present disclosure. In some embodiments, the storage device 150 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM) , or the like, or any combination thereof. Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc.
  • Exemplary volatile read-and-write memory may include a random access memory (RAM) .
  • RAM may include a dynamic RAM (DRAM) , a double data rate synchronous dynamic RAM (DDR SDRAM) , a static RAM (SRAM) , a thyristor RAM (T-RAM) , and a zero-capacitor RAM (Z-RAM) , etc.
  • Exemplary ROM may include a mask ROM (MROM) , a programmable ROM (PROM) , an erasable programmable ROM (EPROM) , an electrically-erasable programmable ROM (EEPROM) , a compact disk ROM (CD-ROM) , and a digital versatile disk ROM, etc.
  • the storage device 150 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
  • the storage device 150 may be connected to the network 120 to communicate with one or more components in the online-to-offline service system 100 (e.g., the server 110, the requester terminal 130, the provider terminal 140, etc. ) .
  • One or more components in the online-to-offline service system 100 may access the data or instructions stored in the storage device 150 via the network 120.
  • the storage device 150 may be directly connected to or communicate with one or more components in the online-to-offline service system 100 (e.g., the server 110, the requester terminal 130, the provider terminal 140, etc. ) .
  • one or more components in the online-to-offline service system 140 may have permission to access the storage device 150.
  • the storage device 150 may be part of the server 110.
  • the positioning system 160 may determine information associated with an object, for example, the requester terminal 130, the provider terminal 140, etc. For example, the positioning system 160 may determine a current location of the requester terminal 130.
  • the positioning system 160 may be a global positioning system (GPS) , a global navigation satellite system (GLONASS) , a compass navigation system (COMPASS) , a BeiDou navigation satellite system, a Galileo positioning system, a quasi-zenith satellite system (QZSS) , etc.
  • the information may include a location, an elevation, a velocity, or an acceleration of the object, or a current time.
  • the location may be in the form of coordinates, such as, latitude coordinate and longitude coordinate, etc.
  • the positioning system 160 may include one or more satellites, for example, a satellite 160-1, a satellite 160-2, and a satellite 160-3.
  • the satellites 160-1 through 160-3 may determine the information mentioned above independently or jointly.
  • the satellite positioning system 160 may send the information mentioned above to the network 120, the requester terminal 130, or the provider terminal 140 via wireless connections.
  • information exchange among one or more components in the online-to-offline service system 100 may be achieved by way of requesting a service.
  • the object of the service request may be any product.
  • the product may be a tangible product or an immaterial product.
  • the tangible product may include food, medicine, commodity, chemical product, electrical appliance, clothing, car, housing, luxury, or the like, or any combination thereof.
  • the immaterial product may include a servicing product, a financial product, a knowledge product, an internet product, or the like, or any combination thereof.
  • the internet product may include an individual host product, a web product, a mobile internet product, a commercial host product, an embedded product, or the like, or any combination thereof.
  • the mobile internet product may be used in a software of a mobile terminal (e.g., a mobile application) , a program, a system, or the like, or any combination thereof.
  • the product may be any software and/or application used in the computer or mobile phone.
  • the software and/or application may relate to socializing, shopping, transporting, entertainment, learning, investment, or the like, or any combination thereof.
  • the software and/or application relating to transporting may include a traveling software and/or application, a vehicle scheduling software and/or application, a mapping software and/or application, etc.
  • the vehicle may include a horse, a carriage, a rickshaw (e.g., a wheelbarrow, a bike, a tricycle, etc. ) , a car (e.g., a taxi, a bus, a private car, etc. ) , a vessel, an aircraft (e.g., an airplane, a helicopter, an unmanned aerial vehicle (UAV) ) , or the like, or any combination thereof.
  • the server 110 may require a user of the requester terminal 130 or the provider terminal 140 to verify his/her identity and/or qualification so that the user may be allowed to receive or provide a service via the online-to-offline system 100.
  • the user may upload an image of a document (or images of documents) to verify his/her identity and/or qualification.
  • the document may be or include his/her ID card, license (e.g., a driving license) , certificate, bank card, passport, credential, etc.
  • the image taken by the user may not satisfy one or more requirements of the online-to-offline system 100.
  • the document may be distorted, tilted or rotated with respect to the image, etc.
  • the processing engine 112 may process the image uploaded by the user so that the processed image may have a standard format or form required by the online-to-offline system 100.
  • the requester terminal 130 or the provider terminal 140 may process the image before uploading the image to the server 110. The process for the image processing may be described in the following part of the present disclosure.
  • the online-to-offline service system 100 is only provided for demonstration purposes, and not intended to be limiting.
  • the image processing process described in the present disclosure may also have other application scenarios.
  • the image processing process may also be performed for preprocessing an image of a paper document for optical character recognition (OCR) .
  • FIG. 2 is a schematic diagram illustrating an exemplary computing device 200.
  • Computing device 200 may be configured to implement an apparatus for image processing (e.g., the server 110, the requester terminal 130, the provider terminal 140, the processing engine 112) , and perform one or more operations disclosed in the present disclosure.
  • the computing device 200 may be configured to implement various modules, units, and their functionalities described in the present disclosure.
  • the computing device 200 may include a bus 270, a processor 210 (or a plurality of processors 210) , a read only memory (ROM) 230, a random access memory (RAM) 240, a storage device 220 (e.g., massive storage device such as a hard disk, an optical disk, a solid-state disk, a memory card, etc. ) , an input/output (I/O) port 250, and a communication interface 260.
  • the bus 270 may couple various components of computing device 200 and facilitate transferring of data and/or information between them.
  • the bus 270 may have any bus structure in the art.
  • the bus 270 may be or may include a memory bus and/or a peripheral bus.
  • the I/O port 250 may allow a transferring of data and/or information between the bus 270 and one or more other devices (e.g., a touch screen, a keyboard, a mouse, a microphone, a display, a speaker) .
  • the communication interface 260 may allow a transferring of data and/or information between the network 120 and the bus 270.
  • the communication interface 260 may be or may include a network interface card (NIC) , a Bluetooth TM module, an NFC module, etc.
  • the computing device 200 may receive source images and/or output generated images.
  • the ROM 230, the RAM 240, and/or the storage device 220 may be configured to store instructions that may be executed by the processor 210.
  • the RAM 240, and/or the storage device 220 may also store data and/or information generated by the processor 210 during the execution of the instruction.
  • at least one of the ROM 230, the RAM 240, or the storage device 220 may implement the storage device 150 illustrated in FIG. 1.
  • the processor 210 may be or include any processor in the art configured to execute instructions stored in the ROM 230, the RAM 240, and/or the storage device 220, so as to perform one or more operations or implement one or more modules/units disclosed in the present disclosure.
  • the processor 210 may include one or more hardware processors, such as a central processing unit (CPU) , an application-specific integrated circuit (ASIC) , an application-specific instruction-set processor (ASIP) , a graphics processing unit (GPU) , a physics processing unit (PPU) , a digital signal processor (DSP) , a field-programmable gate array (FPGA) , a programmable logic device (PLD) , a controller, a microcontroller unit, a reduced instruction-set computer (RISC) , a microprocessor, or the like, or any combination thereof.
  • the computing device 200 may include a plurality of processors 210.
  • the plurality of processors 210 may operate in parallel for performing one or more operations disclosed in the present disclosure.
  • one or more of the components of the computing device 200 may be implemented on a single chip.
  • the processor 210, the ROM 230, and the RAM 240 may be integrated into a single chip.
  • the computing device 200 may be a single device or include a plurality of computing devices having a same or similar architecture as illustrated in FIG. 2.
  • the computing device 200 may implement a personal computer (PC) or any other type of work station or terminal device.
  • the computing device 200 may also act as a server if appropriately programmed.
  • FIG. 3 is a schematic diagram illustrating an exemplary image processing device 300 according to some embodiments of the present disclosure.
  • the image processing device 300 may be an example of the processing engine 112, the requester terminal 130, or the provider terminal 140 (illustrated in FIG. 1) .
  • the image processing device 300 may be implemented by the computing device 200 illustrated in FIG. 2.
  • the image processing device 300 may include a first image module 310, a first position module 320, and an adjustment module 330.
  • the first image module 310 may be configured to obtain a first image including an object (or be referred to as an object to be recognized) in a first representation.
  • the first representation may represent the original form in which the object is displayed.
  • the first position module 320 may be configured to determine at least one first position of the object in the first image.
  • the at least one first position may include positions of one or more vertices of the object in the first image.
  • the at least one first position may include positions of one or more parts of the object in the first image.
  • the adjustment module 330 may be configured to generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object.
  • the second image (or be referred to as a processed image, a corrected image, or a standard image) may include the object in the second representation.
  • the second representation may represent the desired form or the standard form in which the object is displayed.
  • the second representation may be related to at least one of a reference size of the object, a reference size ratio of the object to the image including the object, and/or a reference direction with respect to the object.
  • the first representation in the first image may also be related to at least one of the original size of the object, the original size ratio of the object to the image including the object, and/or an original direction of the object.
  • the image processing device 300 may perform any one of the processes illustrated in FIGs. 4 to 7 and 19 to process an image. Detailed description of the image processing device 300 and the modules thereof may be provided elsewhere in the present disclosure (e.g., in connection with FIGs. 4 to 7 and 19) .
  • the above descriptions about the image processing device 300 are only for illustration purposes, and not intended to limit the present disclosure. It is understood that, after learning the major concept and the mechanism of the present disclosure, a person of ordinary skill in the art may alter the image processing device 300 in an uncreative manner. The alteration may include combining and/or splitting modules, adding or removing optional modules, etc. All such modifications are within the protection scope of the present disclosure.
  • FIG. 4 is a flowchart illustrating an exemplary process 400 for image processing according to some embodiments of the present disclosure.
  • Process 400 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140) .
  • the process 400 illustrated in FIG. 4 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
  • the first image module 310 may obtain a first image including an object in a first representation.
  • the object may be a polygon including a plurality of vertices (e.g., 3, 4, 5, 6, 8) .
  • the object to be recognized may be rectangular or square and may include 4 vertices.
  • the object to be recognized may be an ID card, a license (e.g., a driving license) , a bank card, a certificate, a paper document, or the like, or a combination thereof, or a part thereof.
  • the first image may be a source image captured by a camera of a terminal (e.g., the requester terminal 130, the provider terminal 140) .
  • the first image module 310 may obtain the first image from the requester terminal 130, the provider terminal 140, or a storage device (e.g., the storage device 150, the storage device 220) , etc.
  • the first position module 320 may determine at least one first position of the object in the first image.
  • the at least one first position may include positions of one or more vertices of the object in the first image.
  • the object may be rectangular, and the at least one first position may be or include the positions of the four vertices of the object in the first image (e.g., as illustrated in FIG. 11) .
  • the positions of the one or more vertices may be in the form of coordinates of a coordinate system with respect to the first image.
  • the coordinate system may be any proper coordinate system such as a Cartesian coordinate system, a spherical coordinate system, a polar coordinate system, etc.
  • the first position module 320 may detect a plurality of peripheral lines of the object in the first image, and determine the at least one first position based on the plurality of peripheral lines.
  • a peripheral line may be at least a part of the corresponding edge of the object in the first image.
  • the first position module 320 may determine an intersection of each adjacent pair of the plurality of peripheral lines as the one or more vertices.
  • the “intersection” may be the crossing point obtained by extending the adjacent pair of the plurality of peripheral lines.
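As a sketch, the crossing point of two extended peripheral lines can be computed with the standard homogeneous-coordinates construction (a general geometric recipe, not specific to the disclosure):

```python
# Minimal sketch: intersection of two peripheral lines, each given by
# two endpoints. The cross product intersects the *infinite* lines, so
# extending the segments is implicit.
import numpy as np

def line_intersection(seg_a, seg_b):
    """seg_a, seg_b: ((x1, y1), (x2, y2)) endpoints of two peripheral lines.
    Returns the (x, y) crossing point, or None for (near-)parallel lines."""
    a = np.cross([*seg_a[0], 1.0], [*seg_a[1], 1.0])  # line through seg_a
    b = np.cross([*seg_b[0], 1.0], [*seg_b[1], 1.0])  # line through seg_b
    p = np.cross(a, b)                                # homogeneous intersection
    if abs(p[2]) < 1e-9:                              # parallel lines
        return None
    return (p[0] / p[2], p[1] / p[2])
```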
  • the first position module 320 may detect the plurality of peripheral lines by adopting a line segment detector (LSD) algorithm, or a variant thereof.
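A hedged sketch of this detection step with OpenCV's LSD implementation; note that `createLineSegmentDetector` is absent from some OpenCV builds (it was removed for licensing reasons and restored in 4.5.1+), with `cv2.ximgproc.createFastLineDetector` as a common substitute:

```python
# Sketch of the line-segment detection step using OpenCV's LSD.
import cv2

def detect_segments(image_bgr):
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    lsd = cv2.createLineSegmentDetector()  # availability varies by build
    lines = lsd.detect(gray)[0]            # N x 1 x 4 array of segments, or None
    # Each entry is (x1, y1, x2, y2) in image coordinates.
    return [] if lines is None else [tuple(l[0]) for l in lines]
```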
  • the first position module 320 may detect a plurality of line segments associated with the object in the first image.
  • the first position module 320 may then filter the plurality of line segments to obtain filtered line segments, and determine the plurality of peripheral lines based on the filtered line segments.
  • the filtering may be based at least in part on directions (e.g., in the form of a vector, a slope, or an angle of inclination) of the plurality of line segments.
  • the filtering may be based further on confidence scores of the plurality of line segments.
  • the first position module 320 may update the plurality of line segments by combining line segments identified as being along a same straight line. As a peripheral line of the object is often detected as multiple disconnected line segments (e.g., due to over-exposure or under-exposure of the first image) , such an updating may increase the accuracy of the determination of the peripheral lines.
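An illustrative sketch of this updating step: group segments whose supporting lines agree in angle and offset, then span each group with one segment. The grouping thresholds are assumptions:

```python
# Hypothetical merge of collinear segments. Segments whose direction is
# near the 0/pi angle wrap-around are not handled by this simple grouping.
import math

def merge_collinear(segments, angle_tol=0.05, offset_tol=5.0):
    """segments: list of (x1, y1, x2, y2). Returns merged segments."""
    groups = []
    for x1, y1, x2, y2 in segments:
        theta = math.atan2(y2 - y1, x2 - x1) % math.pi  # line direction
        # Signed offset of the supporting line from the origin (normal form).
        rho = x1 * math.sin(theta) - y1 * math.cos(theta)
        for g in groups:
            if abs(g["theta"] - theta) < angle_tol and abs(g["rho"] - rho) < offset_tol:
                g["pts"] += [(x1, y1), (x2, y2)]
                break
        else:
            groups.append({"theta": theta, "rho": rho, "pts": [(x1, y1), (x2, y2)]})
    merged = []
    for g in groups:
        # Span each group by its two extreme endpoints along the direction.
        d = (math.cos(g["theta"]), math.sin(g["theta"]))
        pts = sorted(g["pts"], key=lambda p: p[0] * d[0] + p[1] * d[1])
        merged.append((*pts[0], *pts[-1]))
    return merged
```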
  • the filtered line segments may include a plurality of line segment sets corresponding to the plurality of peripheral lines
  • the first image may include a plurality of predetermined parts corresponding to the plurality of peripheral lines.
  • the first position module 320 may select, from the plurality of line segments, a set of line segments in the predetermined part, wherein the direction of each of the set of line segments is within a preset range associated with the predetermined part. For example, when the object is rectangular, the plurality of predetermined parts of the first image may include a top part, a left part, a bottom part, and a right part of the first image.
  • Each set of line segments may be referred to as a peripheral line set, based on which a corresponding peripheral line (e.g., a top/left/bottom/right peripheral line of the object) may be determined.
  • the first position module 320 may identify, for each of the plurality of line segment sets, the longest line segment of the line segment set as the corresponding peripheral line of the object.
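A sketch of the filtering-and-selection rule just described for a rectangular object; the band width and angle tolerance below are illustrative assumptions, not values from the disclosure:

```python
# Keep segments that lie in a predetermined part of the image (top/left/
# bottom/right band) with a direction in the range expected for that part,
# then take the longest survivor of each set as the peripheral line.
import math

def peripheral_lines(segments, img_w, img_h, band=0.4, angle_tol=0.35):
    """segments: list of (x1, y1, x2, y2). Returns {part: segment or None}."""
    def angle(s):
        return math.atan2(s[3] - s[1], s[2] - s[0]) % math.pi

    def length(s):
        return math.hypot(s[2] - s[0], s[3] - s[1])

    def mid(s):
        return ((s[0] + s[2]) / 2, (s[1] + s[3]) / 2)

    def horizontal(s):
        return min(angle(s), math.pi - angle(s)) < angle_tol

    def vertical(s):
        return abs(angle(s) - math.pi / 2) < angle_tol

    parts = {
        "top":    lambda s: mid(s)[1] < band * img_h and horizontal(s),
        "bottom": lambda s: mid(s)[1] > (1 - band) * img_h and horizontal(s),
        "left":   lambda s: mid(s)[0] < band * img_w and vertical(s),
        "right":  lambda s: mid(s)[0] > (1 - band) * img_w and vertical(s),
    }
    return {name: max((s for s in segments if keep(s)), key=length, default=None)
            for name, keep in parts.items()}
```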
  • the first position module 320 may determine the at least one first position based on the plurality of peripheral lines.
  • the at least one first position may include positions of one or more parts of the object in the first image.
  • the one or more parts may include at least a first part including a title of the license, a second part including an image of the owner (e.g., the face of the owner) .
  • the one or more parts may further include a third part.
  • the third part may include a stamp, a signature (e.g., of the owner) , a district identifier (e.g., a national flag, a national name) , another image of the owner, or a bar code (e.g., a QR code) .
  • the positions of the one or more parts may correspond to the coordinates of their center points (or any other proper points such as the top left vertices) .
  • the first position module 320 may recognize the at least one part of the object in the first image and obtain its position in the first image using an object recognition technique.
  • the object recognition technique may be based on a support vector machine (SVM) algorithm, a neural network algorithm, a face identification algorithm, or the like, or a combination thereof, or a variant thereof.
  • the object recognition technique may be based on a convolutional neural network (CNN) algorithm, or a variant thereof.
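A purely illustrative sketch of locating document parts with a pre-trained detector via OpenCV's dnn module; the model file names, the SSD-style output layout, and the class ids are assumptions, since the disclosure only requires some recognition technique (SVM-, CNN-, or face-identification-based):

```python
# Hypothetical part localization with an assumed pre-trained detector.
import cv2

def locate_parts(image_bgr, proto="parts.prototxt", weights="parts.caffemodel"):
    net = cv2.dnn.readNetFromCaffe(proto, weights)  # assumed model files
    h, w = image_bgr.shape[:2]
    blob = cv2.dnn.blobFromImage(image_bgr, size=(300, 300))
    net.setInput(blob)
    detections = net.forward()  # assumed SSD layout: (1, 1, N, 7)
    parts = {}
    for det in detections[0, 0]:
        class_id, confidence = int(det[1]), det[2]
        if confidence > 0.5:
            x1, y1, x2, y2 = det[3:7] * [w, h, w, h]
            # Record the center point of each recognized part (e.g.,
            # class 1 = title, class 2 = owner portrait; ids are assumed).
            parts[class_id] = ((x1 + x2) / 2, (y1 + y2) / 2)
    return parts
```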
  • the adjustment module 330 may generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object.
  • the second image may include the object in the second representation.
  • the second representation may be related to at least one of a reference size of the object, a reference image occupation ratio of the object, and/or a reference direction with respect to the object.
  • the first representation in the first image may also be related to at least one of the original size of the object, the original image occupation ratio of the object, and/or an original direction of the object.
  • the object when the object is displayed in the first representation, the object may be tilted with respect to the image (the first image) , distorted, and/or have a low image occupation ratio (e.g., less than 50%) .
  • the object may be displayed in the second representation.
  • the tilt and/or distortion of the object may be reduced, and/or the image occupation ratio of the object may be increased (e.g., above 90%) .
  • the adjustment module 330 may determine, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position.
  • the at least one first position may include positions (e.g., in the form of coordinates) of the vertices of the object in the first image.
  • the at least one second position may include desired positions (e.g., in the form of coordinates) of the vertices of the object in the processed image (second image) .
  • the adjustment module 330 may obtain a correction matrix for correcting (or processing) the first image.
  • the adjustment module 330 may apply the correction matrix on the first image to obtain the second image.
  • the correction matrix may be obtained (or computed) based at least in part on a mapping relationship between the at least one first position and the at least one second position.
  • the correction matrix may be used to translate, rotate, and/or scale the object, and/or reduce the distortion of the object.
  • the adjustment module 330 may then process the first image or the image part including the object using the correction matrix via, e.g., a convolution operation.
  • the resulting image may be the second image.
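  • As an illustrative sketch only (not necessarily the disclosed implementation, which mentions a convolution operation), such a position-mapping correction is commonly realized as a perspective warp in OpenCV, with a 3x3 homography playing the role of the correction matrix; the function and argument names below are assumptions.

```python
import cv2
import numpy as np

def correct_image(first_image, first_positions, second_positions):
    """Warp `first_image` so that the four vertices at `first_positions`
    map onto `second_positions` (both given as four (x, y) pairs)."""
    src = np.asarray(first_positions, dtype=np.float32)
    dst = np.asarray(second_positions, dtype=np.float32)
    # 3x3 homography playing the role of the correction matrix.
    matrix = cv2.getPerspectiveTransform(src, dst)
    out_w = int(dst[:, 0].max())
    out_h = int(dst[:, 1].max())
    return cv2.warpPerspective(first_image, matrix, (out_w, out_h))
```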
  • the correction matrix may be obtained based further on other factors with respect to the first image.
  • the correction matrix may be obtained based further on color/luminance/contrast information of the first image.
  • the obtained second image may also have desired color/luminance/contrast, which may form a part of the second representation.
  • the adjustment module 330 may select, based on the at least one first position, an operation from a plurality of predetermined operations, and process the first image using the selected operation. For example, the adjustment module 330 may select, based on the at least one first position, a rotation mode from a plurality of predetermined rotation modes, and rotate the first image or the image part including the object using the selected rotation mode to obtain the second image, in which the object may have a desired direction with respect to the second image.
  • descriptions of the rotation modes may be found elsewhere in the present disclosure (e.g., in connection with FIGs. 12 to 30) .
  • the adjustment module 330 may further crop the second image, so that the obtained image may only include the object (or at least a part thereof) or be mostly covered by the object (e.g., ≥ 95%) .
  • process 400 is only for demonstration purposes, and not intended to be limiting. It is understandable that, after learning the major concept of the present disclosure, a person of ordinary skill in the art may alter process 400 without creative effort. For example, the operations above may be implemented in an order different from that illustrated in FIG. 4. One or more optional operations may be added to the flowcharts. One or more operations may be split or be combined. All such modifications are within the scope of the present disclosure.
  • FIG. 5 is a flowchart illustrating an exemplary process 500 for image processing according to some embodiments of the present disclosure.
  • Process 500 may be an example of process 400 illustrated in FIG. 4.
  • Process 500 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140) .
  • the process 500 illustrated in FIG. 5 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
  • in operation 510, actual coordinates of each vertex of the object to be recognized with respect to a first coordinate system may be identified based on a plurality of peripheral lines of the object to be recognized in a source image.
  • the source image may be the first image obtained via operation 410 of process 400 illustrated in FIG. 4. Operation 510 may be performed by the first position module 320.
  • a standard size of the object to be recognized may be determined based on the plurality of peripheral lines of the object to be recognized, and standard coordinates of the vertices of the object to be recognized with respect to the first coordinate system may be obtained based on the standard size. Operation 520 may be performed by the first position module 320.
  • Operations 510 and 520 may correspond to operation 420 of process 400 illustrated in FIG. 4.
  • a distortion correction matrix may be obtained based on the actual coordinates and the standard coordinates of the vertices, and a corrected image (second image) may be obtained by performing an image correction on the source image using the distortion correction matrix. Operation 530 may be performed by the adjustment module 330.
  • the image processing apparatus may be implemented via a computer program, such as software or an application.
  • the image processing apparatus may be a storage medium storing a related computer program, such as a Universal Serial Bus (USB) flash drive.
  • the image processing apparatus may be a physical device integrated or installed with a related computer program, such as a chip, a smart phone, a computer.
  • process 400 may be automatically initiated in response to receiving a source image.
  • the source image may be captured by a camera or input by a user.
  • the peripheral lines of the object to be recognized in the source image may be obtained at first. Then the position (at least one position) of each vertex of the object to be recognized may be determined based on the peripheral lines, wherein the position is the actual coordinates of the each vertex in the source image with respect to the first coordinate system.
  • the vertices of the object to be recognized may be determined based on the peripheral lines of the object to be recognized with improved accuracy.
  • operation 510 may include: determining an intersection of each adjacent pair of the peripheral lines of the object to be recognized as a corresponding vertex of the object to be recognized; and obtaining actual coordinates of the vertices of the object to be recognized in the source image with respect to the first coordinate system.
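  • A minimal sketch of the vertex computation: intersecting the infinite lines through two detected segments, applied to each adjacent pair (top-left, top-right, bottom-left, bottom-right); the function name is an assumption.

```python
def line_intersection(seg_a, seg_b):
    """Intersection of the infinite lines through two segments (x1, y1, x2, y2);
    returns None for (near-)parallel lines."""
    ax1, ay1, ax2, ay2 = seg_a
    bx1, by1, bx2, by2 = seg_b
    d = (ax1 - ax2) * (by1 - by2) - (ay1 - ay2) * (bx1 - bx2)
    if abs(d) < 1e-9:
        return None  # the peripheral lines are parallel
    ca = ax1 * ay2 - ay1 * ax2
    cb = bx1 * by2 - by1 * bx2
    x = (ca * (bx1 - bx2) - (ax1 - ax2) * cb) / d
    y = (ca * (by1 - by2) - (ay1 - ay2) * cb) / d
    return (x, y)
```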
  • FIG. 11 is a schematic diagram illustrating a determination of the vertices of the object to be recognized based on intersections of the peripheral lines of FIG. 10 according to some embodiments of the present disclosure.
  • the small circles (e.g., circles 1111 to 1114) in FIG. 11 are the determined vertices, and the coordinates of the vertices with respect to the coordinate system illustrated in FIG. 11 are the actual coordinates of the vertices.
  • the vertices of the object to be recognized may be quickly and accurately determined.
  • the standard size of the object to be recognized may be determined, thereby obtaining the standard position (e.g., in the form of coordinates, also referred to as standard coordinates) of each vertex.
  • the standard size and the standard position may be set based on the image effect (second representation) desired for the finally obtained standard image (second image) . For example, if it is desired that, in the finally obtained standard image, the object to be recognized occupies the entire image, the standard size may be (or be based on) the size of the object to be recognized in the source image.
  • as another example, if it is desired that, in the finally obtained standard image, the object to be recognized is not tilted, the standard coordinates of each vertex may be determined, based on the standard size, along the horizontal and vertical directions, so that in the standard image the top and bottom edges of the object to be recognized are parallel with the horizontal direction and the left and right edges are parallel with the vertical direction; that is, the object to be recognized in the standard image is not tilted.
  • the obtaining, based on the standard size, standard coordinates of the vertices of the object to be recognized with respect to the first coordinate system in operation 520 may include: setting the standard coordinates of any one vertex of the object to be recognized as the origin of the first coordinate system; and obtaining the standard coordinates of the other vertices of the object to be recognized based on the standard coordinates of the any one vertex and the standard size.
  • the first coordinate system may be prebuilt.
  • the first coordinate system may be built with the top left vertex of the source image as the origin, with the X-axis pointing rightward and the Y-axis pointing downward.
  • the standard coordinates of any one vertex of the object to be recognized may be set as the origin of the first coordinate system.
  • the standard coordinates of the top left vertex of the object to be recognized may be set as the origin (0, 0) .
  • the standard coordinates of the other vertices may be determined based on the standard size of the object to be recognized.
  • the standard coordinates of the top right vertex are (W_norm, 0) ;
  • the standard coordinates of the bottom left vertex are (0, H_norm) ;
  • the standard coordinates of the bottom right vertex are (W_norm, H_norm) .
  • a precise cropping of the object to be recognized in the source image may be achieved.
  • a transformation matrix (also referred to as a distortion correction matrix) for correcting the source image may be obtained based on the actual coordinates and standard coordinates of each vertex.
  • the corrected image (standard image) may be obtained.
  • the peripheral lines of the object to be recognized may define the outline of the object to be recognized.
  • process 500 may be adopted to optimize the image quality of an image of a document.
  • the document may be, e.g., an ID card, a business card, a bank card, or a license.
  • the peripheral lines of the object to be recognized may include a top peripheral line, a bottom peripheral line, a left peripheral line, and a right peripheral line.
  • the determining of a standard size of the object to be recognized based on the plurality of peripheral lines in operation 520 may include operation 650: determining a standard width and a standard height of the object to be recognized based on the plurality of peripheral lines of the object to be recognized (as illustrated in FIG. 6) .
  • the object to be recognized is often a document (e.g., an ID card, a certificate, a paper document) which is usually rectangular. Therefore, the present embodiment may be described by way of examples in connection with a rectangular object to be recognized.
  • the peripheral lines thereof may be the four sides of the rectangle. According to their relative positions, the peripheral lines of the object to be recognized may include a top peripheral line, a bottom peripheral line, a left peripheral line, and a right peripheral line.
  • the size of the rectangle may depend on the width (lateral length) and the height (longitudinal length) thereof.
  • the determining of the standard size of the object to be recognized may be treated as determining the standard width and the standard height of the object to be recognized.
  • operation 650 may include: designating the greater of the lengths of the top peripheral line and the bottom peripheral line as the standard width, and designating the greater of the lengths of the left peripheral line and the right peripheral line as the standard height.
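  • A minimal sketch of this step is shown below, assuming each peripheral line is given as a segment (x1, y1, x2, y2); the W_norm/H_norm naming follows the notation above, while the function name is illustrative.

```python
import math

def standard_corners(top, bottom, left, right):
    """Standard width/height from the four peripheral lines, plus the
    standard coordinates of the four vertices (top left pinned to origin)."""
    def length(seg):
        return math.hypot(seg[2] - seg[0], seg[3] - seg[1])
    w_norm = max(length(top), length(bottom))   # standard width
    h_norm = max(length(left), length(right))   # standard height
    corners = [(0.0, 0.0), (w_norm, 0.0), (0.0, h_norm), (w_norm, h_norm)]
    return w_norm, h_norm, corners
```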
  • Process 500 may be adopted in actual application scenarios, in which a rectangular object commonly needs to be recognized and corrected. Process 500 may improve the result of the image processing and is well suited to such common application scenarios.
  • process 500 is only for demonstration purposes, and not intended to be limiting. It is understandable that, after learning the major concept of the present disclosure, a person of ordinary skill in the art may alter process 500 without creative effort. For example, the operations above may be implemented in an order different from that illustrated in FIG. 5. One or more optional operations may be added to the flowcharts. One or more operations may be split or be combined. All such modifications are within the scope of the present disclosure.
  • the obtaining the peripheral lines of the object to be recognized may be performed first, which may be achieved using various approaches. Taking a rectangular object to be recognized as an example, an exemplary image processing process is illustrated in FIG. 6.
  • FIG. 6 is a flowchart illustrating an exemplary process 600 for image processing according to some embodiments of the present disclosure.
  • Process 600 is on the basis of process 500 in FIG. 5 for processing an image including a rectangular object to be recognized.
  • Process 600 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140) .
  • the process 600 illustrated in FIG. 6 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
  • a plurality of line segments may be identified using a line detection algorithm. Operation 610 may be performed by the first position module 320.
  • a top peripheral line set, a bottom peripheral line set, a left peripheral line set, and a right peripheral line set may be obtained by filtering the plurality of line segments using the corresponding filtering conditions.
  • the filtering conditions may represent one or more features of the respective peripheral lines of the object to be recognized. Operation 620 may be performed by the first position module 320.
  • the longest line segment may be selected from each of the top peripheral line set, the bottom peripheral line set, the left peripheral line set, and the right peripheral line set, respectively, as the top peripheral line, the bottom peripheral line, the left peripheral line, and the right peripheral line of the object to be recognized. Operation 630 may be performed by the first position module 320.
  • Operations 610 to 630 may be performed to obtain the plurality of peripheral lines in operation 510 of process 500 illustrated in FIG. 5.
  • FIG. 8 is a schematic diagram illustrating a source image 800 including an ID card 801 according to some embodiments of the present disclosure.
  • the source image 800 may have a width W and a height H. It can be seen that the ID card 801 in the source image 800 is tilted and slightly distorted.
  • a line segment detector (LSD) algorithm may be adopted to identify a plurality of line segments in the source image.
  • FIG. 9 is a schematic diagram illustrating line segments detected by operating an LSD algorithm on the source image 800 of FIG. 8 according to some embodiments of the present disclosure. The line segments shown in FIG. 9 (e.g., line segments 901 to 909) are those detected via the LSD algorithm.
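  • As an illustrative sketch only (not the disclosed implementation), such a detection might be run with OpenCV; the function name is an assumption, and the availability of the LSD API varies across OpenCV builds.

```python
import cv2

def detect_segments(source_path):
    """Detect line segments in a grayscale source image via OpenCV's LSD.
    Availability of createLineSegmentDetector varies across OpenCV builds;
    ximgproc.createFastLineDetector is a common substitute."""
    gray = cv2.imread(source_path, cv2.IMREAD_GRAYSCALE)
    lsd = cv2.createLineSegmentDetector(cv2.LSD_REFINE_ADV)
    lines, widths, precisions, nfas = lsd.detect(gray)
    # Each detection is [x1, y1, x2, y2]; the NFA values (when computed)
    # can serve as confidence scores for the later filtering step.
    return [tuple(map(float, l[0])) for l in lines] if lines is not None else []
```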
  • a peripheral line of the object to be recognized may be fully detected (e.g., line segment 903) or not fully detected (e.g., line segment 901) , multiple parts of the same peripheral line may be detected as multiple line segments (e.g., line segments 904 and 905) , and one or more noise line segments (e.g., line segments 906 to 909) may also be detected in the source image.
  • Line segments that do not belong to the peripheral lines of the object to be recognized may be removed by filtering the plurality of line segments of the object to be recognized.
  • the line segments belonging to the peripheral lines of the object to be recognized may be grouped according to their relative positions, e.g., at the top part, the bottom part, the left part, or the right part of the source image or the object to be recognized.
  • the filtering conditions of the peripheral line groups may be set based on the features of the corresponding peripheral lines. A line segment satisfying one of the filtering conditions may be assigned to a corresponding peripheral line set.
  • the filtering may be based at least in part on directions of the plurality of line segments. An exemplary process is described in connection with FIG. 7.
  • the filtering may be based at least in part on confidence scores of the plurality of line segments.
  • the confidence score of a line segment may be determined and associated with the line segment when the line segment is detected by the first position module 320 using the LSD algorithm.
  • FIG. 10 is a schematic diagram illustrating peripheral lines determined based on line segments of FIG. 9 according to some embodiments of the present disclosure.
  • the filtering conditions may be set according to the features of peripheral lines of the object to be recognized.
  • the detected line segments that may possibly be parts of the peripheral lines may be assigned to a plurality of peripheral line groups, from which the peripheral lines of the object to be recognized may then be determined.
  • the whole process is computationally simple. Without occupying excessive computing resources, the peripheral lines of the object to be recognized may be accurately detected.
  • the first position module 320 may optionally combine line segments detected via the LSD algorithm that are along the same straight line.
  • line segments along the same straight line may be parts of a peripheral line of the object to be recognized (e.g., line segments 904 and 905) , which may be detected as broken because of the over-exposure or under-exposure of the source image.
  • the combining of the line segments along the same straight line may increase the length of the detected part of a peripheral line, thereby improving the accuracy of the determination of the peripheral lines.
  • the first position module 320 may identify, from the plurality of line segments detected via the LSD algorithm, line segments along the same straight line. For example, the first position module 320 may identify the line segments along the same straight line based at least in part on the directions (e.g., measured by slopes or angles of inclination) and the positions of the plurality of line segments. In some embodiments, line segments having the same direction or similar directions (e.g., within ±5%) , in the same part (e.g., the top/bottom/left/right part) of the source image, and/or close to each other (e.g., within a distance of 20 pixels) may be determined as being along the same straight line.
  • the first position module 320 may then update the plurality of line segments by combining the line segments identified as being along the same straight line.
  • the first position module 320 may connect the closest ends of the identified line segments, create a new line segment based on coordinates of the ends of the identified line segments (e.g., by fitting) , or simply treat or label the identified line segments as a single line segment (although actually the identified line segments may be spatially separated) .
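  • A possible sketch of such a merging step is given below, under the tolerances assumed above (a few degrees of angular difference, a gap of about 20 pixels); the greedy strategy and all names are illustrative assumptions.

```python
import math

def _angle(seg):
    return math.degrees(math.atan2(seg[3] - seg[1], seg[2] - seg[0])) % 180.0

def _angle_diff(a, b):
    d = abs(_angle(a) - _angle(b))
    return min(d, 180.0 - d)

def _min_gap(a, b):
    ends_a = [(a[0], a[1]), (a[2], a[3])]
    ends_b = [(b[0], b[1]), (b[2], b[3])]
    return min(math.hypot(p[0] - q[0], p[1] - q[1]) for p in ends_a for q in ends_b)

def merge_collinear(segments, angle_tol=3.0, gap_tol=20.0):
    """Greedily merge segments whose directions agree within `angle_tol`
    degrees and whose closest endpoints lie within `gap_tol` pixels,
    replacing each pair by its farthest-apart endpoint pair."""
    merged = [tuple(s) for s in segments]
    i = 0
    while i < len(merged):
        j = i + 1
        while j < len(merged):
            a, b = merged[i], merged[j]
            if _angle_diff(a, b) <= angle_tol and _min_gap(a, b) <= gap_tol:
                pts = [(a[0], a[1]), (a[2], a[3]), (b[0], b[1]), (b[2], b[3])]
                p, q = max(((p, q) for p in pts for q in pts),
                           key=lambda pq: math.hypot(pq[0][0] - pq[1][0],
                                                     pq[0][1] - pq[1][1]))
                merged[i] = (p[0], p[1], q[0], q[1])
                del merged[j]
                j = i + 1  # re-scan against the grown segment
            else:
                j += 1
        i += 1
    return merged
```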
  • in operation 640, actual coordinates of each vertex of the object to be recognized with respect to the first coordinate system may be identified based on the plurality of peripheral lines of the object to be recognized in the source image (first image) .
  • Operation 640 may be performed by the first position module 320.
  • Operation 640 may be the same as or similar to operation 510, the descriptions of which are not repeated herein.
  • a standard width and a standard height of the object to be recognized may be determined based on the plurality of peripheral lines of the object to be recognized, and standard coordinates of the vertices of the object to be recognized with respect to the first coordinate system may be obtained based on the standard width and the standard height. Operation 650 has been described elsewhere (e.g., in connection with FIG. 5) , the descriptions of which are not repeated herein.
  • a distortion correction matrix may be obtained based on the actual coordinates and the standard coordinates of the vertices, and a corrected image (second image) may be obtained by performing an image correction on the first image using the distortion correction matrix.
  • Operation 660 may be performed by the adjustment module 330. Operation 660 may be the same as or similar to operation 530, the descriptions of which are not repeated herein.
  • process 600 is only for demonstration purposes, and not intended to be limiting. It is understandable that, after learning the major concept of the present disclosure, a person of ordinary skill in the art may alter process 600 without creative effort. For example, the operations above may be implemented in an order different from that illustrated in FIG. 6. One or more optional operations may be added to the flowcharts. One or more operations may be split or be combined. All such modifications are within the scope of the present disclosure.
  • FIG. 7 is a flowchart illustrating an exemplary process 700 for image processing according to some embodiments of the present disclosure.
  • Process 700 is on the basis of process 600 in FIG. 6 for processing an image including a rectangular object to be recognized.
  • Process 700 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140) .
  • the process 700 illustrated in FIG. 7 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
  • a plurality of line segments may be identified using a line detection algorithm. Operation 710 may be performed by the first position module 320. Operation 710 may be the same as or similar to operation 610.
  • in operation 720, line segments whose angle of inclination with respect to, e.g., the vertical direction or the Y axis is within a preset range and whose confidence score is higher than a predetermined threshold may be selected into the left peripheral line set and the right peripheral line set, respectively, according to their positions. Operation 720 may be performed by the first position module 320.
  • in operation 730, line segments whose angle of inclination with respect to, e.g., the horizontal direction or the X axis is within the preset range and whose confidence score is higher than a predetermined threshold may be selected into the top peripheral line set and the bottom peripheral line set, respectively, according to their positions. Operation 730 may be performed by the first position module 320.
  • Operations 720 and 730 may correspond to operation 620 of process 600 illustrated in FIG. 6.
  • the parts of the source image in operation 720 and 730 for the filtering may be predetermined based on the actual needs.
  • the confidence scores of all the line segments need to be higher than corresponding thresholds, wherein the thresholds associated with different peripheral lines (e.g., top, bottom, left, and right) may be the same or different.
  • the confidence score may be obtained based on the aforementioned LSD algorithm, and may represent the probability that the detected object is a line segment. Therefore, it is possible to remove the detected objects that are obviously not line segments by setting one or more thresholds with respect to the confidence scores. Further, in some embodiments, a grouping of the detected line segments may be performed based on the positions of the line segments.
  • the left peripheral line may be located in the left part of the source image
  • the right peripheral line may be located in the right part of the source image
  • the angle of inclination (or slope) of the object to be recognized may be limited to a certain range. For example, a line segment whose angle of inclination is obviously out of the range is less likely to be a peripheral line (or a part thereof) of the object to be recognized.
  • the filtering conditions may be set according to one or more of the aforementioned factors. It is understood that there may also be embodiments in which the filtering conditions are set in combination with other features.
  • the process 700 as shown in FIG. 7 may further include: obtaining a width W and a height H of the source image and the angles of inclination L[θ]_i of the line segments with respect to the horizontal direction; taking the top left vertex of the source image as the origin; and building a second coordinate system with an X axis pointing rightward along the horizontal direction and a Y axis pointing downward along the vertical direction.
  • operation 720 may include: designating line segments satisfying L[x]_i ≤ W/2 and |L[θ]_i| ≥ 90° − θ_max as the left peripheral line set, and designating line segments satisfying L[x]_i > W/2 and |L[θ]_i| ≥ 90° − θ_max as the right peripheral line set;
  • operation 730 may include: designating line segments satisfying L[y]_i ≤ H/2 and |L[θ]_i| ≤ θ_max as the top peripheral line set, and designating line segments satisfying L[y]_i > H/2 and |L[θ]_i| ≤ θ_max as the bottom peripheral line set;
  • wherein L[x]_i denotes the horizontal coordinates (X) of the ends of the line segments with respect to the second coordinate system;
  • L[y]_i denotes the vertical coordinates (Y) of the ends of the line segments with respect to the second coordinate system; and
  • θ_max is the maximum allowable angle of inclination of the object to be recognized (e.g., a document) with respect to the horizontal direction.
  • the second coordinate system may be the coordinate system as shown in FIG. 10.
  • the second coordinate system may be the same as or different from the first coordinate system.
  • the filtering conditions may be in the form of a plurality of formulas based on various parameters of the source image.
  • the object to be recognized in a source image is generally at or near the center of the entire source image; therefore, W/2 may be used as a threshold for distinguishing the left and right peripheral lines in the horizontal direction, and H/2 may be used as a threshold for distinguishing the top and bottom peripheral lines in the vertical direction.
  • the plurality of line segments may be classified based on the image parts in which the ends of each line segment are located.
  • the angle of inclination of the object to be recognized may be limited to a preset range.
  • the angle of inclination of the object to be recognized with respect to the horizontal direction may be set as no more than 20 degrees. That is, the angles of inclination of the top and bottom peripheral lines (which should be roughly parallel to the horizontal direction) do not exceed 20 degrees, and the angles of inclination of the left and right peripheral lines (which should be roughly perpendicular to the horizontal direction) are not less than 90 − 20 = 70 degrees.
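  • A minimal sketch of these filtering conditions, assuming segments as (x1, y1, x2, y2) tuples with per-segment confidence scores; the function name and the default thresholds are assumptions.

```python
import math

def filter_into_sets(segments, confidences, W, H, theta_max=20.0, conf_min=0.5):
    """Apply the positional (W/2, H/2) and angular (theta_max) filtering
    conditions to assign candidate segments to the four peripheral line sets."""
    sets = {"top": [], "bottom": [], "left": [], "right": []}
    for seg, conf in zip(segments, confidences):
        if conf <= conf_min:
            continue  # drop detections that are unlikely to be line segments
        x1, y1, x2, y2 = seg
        theta = abs(math.degrees(math.atan2(y2 - y1, x2 - x1))) % 180.0
        tilt = min(theta, 180.0 - theta)  # inclination w.r.t. the horizontal
        if tilt <= theta_max:             # near-horizontal: top or bottom
            if max(y1, y2) < H / 2:
                sets["top"].append(seg)
            elif min(y1, y2) > H / 2:
                sets["bottom"].append(seg)
        elif tilt >= 90.0 - theta_max:    # near-vertical: left or right
            if max(x1, x2) < W / 2:
                sets["left"].append(seg)
            elif min(x1, x2) > W / 2:
                sets["right"].append(seg)
    return sets
```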
  • the peripheral lines of a rectangular object to be recognized may be filtered based on features of the rectangular object without using a complex algorithm, thereby reducing computational complexity and improving the efficiency of the image recognition and the correction.
  • FIG. 12 is a schematic diagram illustrating a corrected image obtained based on the vertices of FIG. 11 according to some embodiments of the present disclosure.
  • the longest line segment may be selected from each of the top peripheral line set, the bottom peripheral line set, the left peripheral line set, and the right peripheral line set, respectively, as the top peripheral line, the bottom peripheral line, the left peripheral line, and the right peripheral line of the object to be recognized.
  • Operation 740 may be performed by the first position module 320.
  • Operation 740 may be the same as or similar to operation 630, the descriptions of which are not repeated herein.
  • in operation 750, actual coordinates of each vertex of the object to be recognized with respect to the first coordinate system may be identified based on the plurality of peripheral lines of the object to be recognized in the source image (first image) .
  • Operation 750 may be performed by the first position module 320.
  • Operation 750 may be the same as or similar to operation 510 or operation 640, the descriptions of which are not repeated herein.
  • a standard width and a standard height of the object to be recognized may be determined based on the plurality of peripheral lines of the object to be recognized, and standard coordinates of the vertices of the object to be recognized with respect to the first coordinate system may be obtained based on the standard width and the standard height. Operation 760 may be performed by the first position module 320. Operation 760 may be the same as or similar to operation 650, the descriptions of which are not repeated herein.
  • a distortion correction matrix may be obtained based on the actual coordinates and the standard coordinates of the vertices, and a corrected image (second image) may be obtained by performing an image correction on the first image using the distortion correction matrix.
  • Operation 770 may be performed by the adjustment module 330.
  • Operation 770 may be the same as or similar to operation 530 or operation 660, the descriptions of which are not repeated herein.
  • process 700 is only for demonstration purposes, and not intended to be limiting. It is understandable that, after learning the major concept of the present disclosure, a person of ordinary skill in the art may alter process 700 without creative effort. For example, the operations above may be implemented in an order different from that illustrated in FIG. 7. One or more optional operations may be added to the flowcharts. One or more operations may be split or be combined. All such modifications are within the scope of the present disclosure.
  • an image processing method may include: determining a plurality of peripheral lines of an object to be recognized from a source image based on one or more features of the object to be recognized; determining actual coordinates of vertices of the object to be recognized in the source image based on the plurality of peripheral lines; determining a standard size of the object to be recognized based on the plurality of peripheral lines; determining standard coordinates of the vertices, wherein the standard size and the standard coordinates are determined based on the image that is finally desired; obtaining a correction matrix based on the actual coordinates and the standard coordinates; and correcting the source image using the correction matrix to obtain a corrected image.
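  • An end-to-end sketch of the method summarized above, reusing the illustrative helpers from the earlier sketches (detect_segments, filter_into_sets, line_intersection, standard_corners, and correct_image, all assumed names), might read as follows.

```python
import math
import cv2

def process_source_image(path):
    """End-to-end sketch of the summarized method; assumes all four
    peripheral line sets end up non-empty."""
    image = cv2.imread(path)
    H, W = image.shape[:2]
    segments = detect_segments(path)
    sets = filter_into_sets(segments, [1.0] * len(segments), W, H)
    def longest(group):
        return max(group, key=lambda s: math.hypot(s[2] - s[0], s[3] - s[1]))
    top, bottom = longest(sets["top"]), longest(sets["bottom"])
    left, right = longest(sets["left"]), longest(sets["right"])
    # Actual coordinates: intersections of adjacent peripheral lines.
    actual = [line_intersection(top, left), line_intersection(top, right),
              line_intersection(bottom, left), line_intersection(bottom, right)]
    # Standard coordinates from the standard width/height.
    _, _, standard = standard_corners(top, bottom, left, right)
    return correct_image(image, actual, standard)
```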
  • the above process may automatically correct the object to be recognized in the source image.
  • the high-quality image obtained via the correction may facilitate the subsequent image processing and recognition, and be more compatible with various application scenarios, such as OCR, identity and/or qualification verification via a network (e.g., the network 120) , etc.
  • the source image may be used for identity and/or qualification verification via a network.
  • the source image may be required for receiving or providing a service via the online-to-offline system 100.
  • the object to be recognized in the source image may be a document such as an ID card, a license (e.g., a driving license) , a passport, a bank card, etc.
  • a document may generally have a unified format, which may serve as a template (second representation) for object recognition and/or image processing (or correction) .
  • FIG. 13 is a schematic diagram illustrating an exemplary template according to some embodiments of the present disclosure.
  • the platform implemented by the online-to-offline system 100 may require a user (e.g., a transportation service provider) to upload an image (source image) of his/her driving license, which may have a unified template.
  • a user e.g., a transportation service provider
  • the driving licenses may have a unified template as illustrated in FIG. 13.
  • the platform may require that the image uploaded by the user is in accordance with such a template.
  • the uploaded image may only include the image part of the driving license, and the driving license is in the correct direction with respect to the uploaded image, i.e., the direction in which the user views the image.
  • the document in the image may have a certain rotation angle with respect to the image.
  • the image may also include a complex background.
  • FIGs. 14 to 17 are schematic diagrams illustrating images to be processed (or be referred to as an uploaded image, a source image, a first image) in different situations according to some embodiments of the present disclosure.
  • taking the driving license as an example, when the user uploads an image of a driving license, possible images actually uploaded by the user in different situations are illustrated in FIGs. 14 to 17, respectively.
  • FIG. 14 is a schematic diagram illustrating an image to be processed in Situation 1 according to some embodiments of the present disclosure.
  • FIG. 15 is a schematic diagram illustrating an image to be processed in Situation 2 according to some embodiments of the present disclosure.
  • FIG. 16 is a schematic diagram illustrating an image to be processed in Situation 3 according to some embodiments of the present disclosure.
  • the image to be processed may include the driving license in the correct direction, but a considerable proportion of background content may surround the driving license.
  • the background content may be or include one or more attached pages of the driving license or the shooting background.
  • the driving license in the image to be processed may be rotated 90 degrees clockwise with respect to the correct direction.
  • the driving license in the image to be processed may be rotated 180 degrees clockwise with respect to the correct direction.
  • the driving license in the image to be processed may be rotated 90 degrees anticlockwise with respect to the correct direction.
  • the direction of the driving license with respect to the uploaded image is often not correct, and the above direction differences may occur.
  • the uploaded image may need to be adjusted when the document (e.g., a driving license) in the uploaded image has a different rotation angle, which lowers the image processing efficiency. Therefore, how to process an image uploaded by a user so as to adjust the direction of the document therein to the correct direction is a technical problem to be solved.
  • FIG. 18 is a flowchart illustrating an exemplary process 1800 for processing an image according to some embodiments of the present disclosure.
  • Process 1800 may be an example of process 400 illustrated in FIG. 4.
  • Process 1800 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140) .
  • the process 1800 illustrated in FIG. 18 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
  • positions (first position) of at least three targets in an image to be processed may be obtained.
  • the image to be processed may include a first image part representing an object to be recognized.
  • the at least three targets may be parts of the first document, and may have fixed positions and the same or similar features in different first documents.
  • the image to be processed may be the first image obtained via operation 410 of process 400 in FIG. 4.
  • operation 1810 may be performed by the first position module 320.
  • embodiments of the present disclosure may be executed by an electronic device having a data processing function, such as a server (e.g., the server 110) , a terminal (e.g., the requester terminal 130, the provider terminal 140) , or a workstation.
  • a server may obtain the image to be processed, and obtain positions of the at least three targets in the image to be processed.
  • the positions of the at least three targets may be positions (e.g., in the form of coordinates) of the at least three targets with respect to the image to be processed.
  • the positions of the at least three targets may be relative positions of the at least three specified targets with respect to one specific target of the at least three targets.
  • the image to be processed may be an image including a document.
  • the image to be processed may include a first image part representing a first document.
  • the first document may be any legal document such as an ID card, a qualification certificate, a kinship certificate, or another official document.
  • the at least three targets may have fixed positions and the same or similar features in different first documents.
  • for example, when the first document is a driving license as shown in FIG. 13, since the image of the owner (e.g., the driver) in each driving license has a fixed position and similar features (e.g., features related to a human face) , the image of the owner may be used as one of the at least three targets.
  • the feature described herein may be adjusted based on one or more features extracted from the image to be processed according to different image processing technologies.
  • the feature may include a size, a color, and a content of the image.
  • FIG. 19 is a schematic diagram illustrating an enlarged view of the image to be processed in FIG. 14.
  • the image to be processed 1900 may include a first part 1901 and a second part 1902.
  • the first part 1901 may be the image part representing a driving license and the second part 1902 may be the background part other than the image part of the driving license.
  • the at least three targets of the driving license may include at least: the title 1911 of the driving license, the stamp 1912 (e.g., a red one) , and the image of the owner 1913.
  • positions of the three targets (the title 1911, the stamp 1912, and the image of the owner 1913) may be obtained.
  • the positions of the targets may be designated by labeling corresponding pixels of the image to be processed 1900.
  • the image to be processed 1900 may be in the form of a pixel matrix having a size of 600*400.
  • the position of the title 1911 may be represented by coordinates of pixels at the four vertices of the title 1911, such as ⁇ (100, 100) , (500, 100) , (100, 500) , (500, 500) ⁇ .
  • the positions of the stamp 1912 and the image of the owner 1913 may be represented similarly.
  • the position of a target may also be represented by any other labeled pixel (e.g., a pixel at the center of the target) of the image to be processed 1900 similarly. It is noted that, the above descriptions of the representation of the position of a target are only for demonstration purposes, and the positions (or relative positions) of the at least three targets may also be represented in other ways.
  • the positions of the at least three targets in the image to be processed may be obtained using a Convolutional Neural Network (CNN) model.
  • the CNN model may be pre-trained using a plurality of samples.
  • the plurality of samples may be images to be processed with manually labeled targets.
  • the at least three targets in each of the sample images may have known positions.
  • the trained CNN model may take an image to be processed as at least part of its input, and determine the positions of the at least three targets in the image to be processed as its output.
  • for the performance test of the CNN model, 25k samples are manually labeled. In each sample, positions of at least three targets are labeled. Of the 25k samples, 20k samples are used as training samples and 5k samples are used as the test dataset.
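  • The patent does not disclose the network architecture; purely as an illustrative sketch, a toy PyTorch model regressing one bounding box per target might look like the following (layer sizes, input resolution, and the (x, y, w, h) box parameterization are all assumptions).

```python
import torch
import torch.nn as nn

class TargetLocator(nn.Module):
    """Toy CNN that regresses bounding boxes (x, y, w, h) for three targets
    (title, stamp, owner image) from a fixed-size input image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 3 * 4)  # three (x, y, w, h) boxes

    def forward(self, x):
        return self.head(self.features(x).flatten(1)).view(-1, 3, 4)

# Usage: boxes = TargetLocator()(torch.rand(1, 3, 400, 600))  -> (1, 3, 4)
```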
  • the image processing process provided may obtain positions of at least three targets from the image to be processed.
  • the positions of the at least three targets in the first document of the image to be processed are relatively fixed. Therefore, in 1810, the positions of the at least three targets in the image to be processed may be obtained from a certain region of the first part of the image to be processed.
  • the targets may include: the title 1911, the stamp 1912, and the image of the owner 1913. Since the positions of the above three targets for each driving license are fixed, after obtaining the image to be processed, the positions of the targets may be obtained from the first part of the image to be processed.
  • FIG. 20 is a schematic diagram illustrating an exemplary pattern for obtaining positions of targets according to some embodiments of the present disclosure.
  • FIG. 20 illustrates an outline of a first part 2000 of the image to be processed. After the first part of the image to be processed is extracted via an image processing technology, the position of the title 1911 may be determined in region 2010 or 2020, and the positions of the stamp 1912 and the image of the owner 1913 may be determined in regions 2030 and 2040, respectively.
  • the positions of the targets may be obtained by processing local image regions instead of the whole image, thereby reducing the computation burden and improving the computation efficiency.
  • a rotation mode of the image to be processed may be determined based on the positions of the at least three targets.
  • the direction of the first part in the image to be processed may be determined according to the positions of the at least three targets obtained in operation 1810. Once the direction of the first part is determined, the direction of the image to be processed may be determined as well.
  • the rotation mode for rotating the image to be processed may then be determined based on the direction of the image to be processed. For example, in Situation 2 as shown in FIG. 15, the driving license in the image to be processed is rotated 90 degrees clockwise with respect to the correct direction; the rotation mode of the image to be processed may then be represented as a to-rotate-90-degree-anticlockwise mode, or a rotated-90-degree-clockwise mode.
  • the operation 1820 may be performed by the adjustment module 330.
  • operation 1820 may be achieved by: determining, based on a mapping relationship between positions (first positions) of the at least three targets in the image to be processed and positions (second positions) of the at least three targets in a reference image (or template, second representation) , a rotation mode of the image to be processed. For example, by taking the image shown in FIG. 13 as a reference image, to process the image to be processed illustrated in FIG. 15, the positions of the at least three targets in FIG. 15 may be compared with the positions of the at least three targets in FIG. 13 to determine the rotation mode of the image to be processed.
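  • As a heuristic sketch only (not the disclosed decision rule), the rotation mode might be inferred from the relative centers of the three targets as follows; the mode labels anticipate the naming introduced below, and all names are illustrative.

```python
def infer_rotation_mode(title_c, stamp_c, face_c):
    """Classify the rotation mode from the (x, y) centers of the title,
    stamp, and owner-image targets, mirroring the four situations below."""
    if title_c[1] < face_c[1] and stamp_c[0] < face_c[0]:
        return "rotate_0"        # normal: title on top, stamp left of face
    if title_c[0] > face_c[0] and stamp_c[1] < face_c[1]:
        return "rotate_-90"      # rotated 90 degrees clockwise
    if title_c[0] < face_c[0] and stamp_c[1] > face_c[1]:
        return "rotate_90"       # rotated 90 degrees anticlockwise
    return "rotate_-180"         # rotated 180 degrees
```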
  • FIG. 21 is a schematic diagram illustrating positions of targets in Situation 1 of FIG. 14 according to some embodiments of the present disclosure.
  • FIG. 22 is a schematic diagram illustrating positions of targets in Situation 2 of FIG. 15 according to some embodiments of the present disclosure.
  • FIG. 23 is a schematic diagram illustrating positions of targets in Situation 3 of FIG. 16 according to some embodiments of the present disclosure.
  • FIG. 24 is a schematic diagram illustrating positions of targets in Situation 4 of FIG. 17 according to some embodiments of the present disclosure.
  • FIGs. 25 to 28 are schematic diagrams illustrating positions of the three targets according to some embodiments of the present disclosure.
  • FIGs. 25 to 28 may correspond to FIGs. 21 to 24, respectively.
  • FIG. 25 is a schematic diagram illustrating positions of targets in Situation 1 of FIG. 14 according to some embodiments of the present disclosure.
  • FIG. 26 is a schematic diagram illustrating positions of targets in Situation 2 of FIG. 15 according to some embodiments of the present disclosure.
  • FIG. 27 is a schematic diagram illustrating positions of targets in Situation 3 of FIG. 16 according to some embodiments of the present disclosure.
  • FIG. 28 is a schematic diagram illustrating positions of targets in Situation 4 of FIG. 17 according to some embodiments of the present disclosure.
  • the corresponding targets may include: the title, the stamp, and the image of the owner (e.g., a face image) .
  • the positions of the rectangular frames of the three targets determined according to operation 1820 may be face_rect, stamp_rect, and title_rect.
  • the four rotation modes may include:
  • a normal mode: as shown in FIG. 14, the title of the document is in the top part of the image to be processed, the image of the owner is in the right part of the image to be processed, and the stamp is in the left part of the image to be processed.
  • under the normal mode, the document may have the correct direction, and the image to be processed does not need to be rotated.
  • the normal mode may also be labeled as rotate_0 ;
  • a rotated-90-degree-clockwise mode: as shown in FIG. 15, the title of the document is in the right part of the image to be processed, the image of the owner is in the bottom part of the image to be processed, and the stamp is in the top part of the image to be processed.
  • the document under the normal mode may be obtained by rotating the image to be processed 90 degrees anticlockwise.
  • the rotated-90-degree-clockwise mode may also be labeled as rotate_-90 ;
  • a rotated-90-degree-anticlockwise mode: as shown in FIG. 16, the title of the document is in the left part of the image to be processed, the image of the owner is in the top part of the image to be processed, and the stamp is in the bottom part of the image to be processed.
  • the document under the normal mode may be obtained by rotating the image to be processed 90 degrees clockwise.
  • the rotated-90-degree-anticlockwise mode may also be labeled as rotate_90 ;
  • a rotated-180-degree-clockwise mode: as shown in FIG. 17, the title of the document is in the bottom part of the image to be processed, the image of the owner is in the left part of the image to be processed, and the stamp is in the right part of the image to be processed.
  • the document under the normal mode may be obtained by rotating the image to be processed 180 degrees anticlockwise.
  • the rotated-180-degree-clockwise mode may also be labeled as rotate_-180 ;
  • an image to be processed having an unknown rotation mode may be labeled as rotate_mode, and the image part (also referred to as a license part) of the driving license to be determined may be labeled as driver_rect.
  • the process for locating the license part driver rect may include one of the following operations:
  • when the rotation mode is the normal mode (rotate_0) :
  • the left side of the stamp stamp_rect may be designated as the left side of the license part driver_rect ;
  • the right side of the image of the owner face_rect may be designated as the right side of the license part driver_rect ;
  • the top side of the title title_rect may be designated as the top side of the license part driver_rect ; and
  • the bottom side of the image of the owner face_rect may be designated as the bottom side of the license part driver_rect .
  • the corresponding formulas may be expressed as follows:
  • driver_rect (left) = stamp_rect (left) , Formula (5)
  • driver_rect (right) = face_rect (right) , Formula (6)
  • driver_rect (top) = title_rect (top) , Formula (7)
  • driver_rect (bottom) = face_rect (bottom) . Formula (8)
  • when the rotation mode is the rotated-90-degree-clockwise mode (rotate_-90) :
  • the left side of the image of the owner face_rect may be designated as the left side of the license part driver_rect ;
  • the right side of the title title_rect may be designated as the right side of the license part driver_rect ;
  • the top side of the stamp stamp_rect may be designated as the top side of the license part driver_rect ; and
  • the bottom side of the image of the owner face_rect may be designated as the bottom side of the license part driver_rect .
  • driver_rect (left) = face_rect (left) , Formula (9)
  • driver_rect (right) = title_rect (right) , Formula (10)
  • driver_rect (top) = stamp_rect (top) , Formula (11)
  • driver_rect (bottom) = face_rect (bottom) . Formula (12)
  • when the rotation mode is the rotated-90-degree-anticlockwise mode (rotate_90) :
  • the left side of the title title_rect may be designated as the left side of the license part driver_rect ;
  • the right side of the image of the owner face_rect may be designated as the right side of the license part driver_rect ;
  • the top side of the image of the owner face_rect may be designated as the top side of the license part driver_rect ; and
  • the bottom side of the stamp stamp_rect may be designated as the bottom side of the license part driver_rect .
  • driver_rect (left) = title_rect (left) , Formula (13)
  • driver_rect (right) = face_rect (right) , Formula (14)
  • driver_rect (top) = face_rect (top) , Formula (15)
  • driver_rect (bottom) = stamp_rect (bottom) . Formula (16)
  • when the rotation mode is the rotated-180-degree-clockwise mode (rotate_-180) :
  • the left side of the image of the owner face_rect may be designated as the left side of the license part driver_rect ;
  • the right side of the stamp stamp_rect may be designated as the right side of the license part driver_rect ;
  • the top side of the image of the owner face_rect may be designated as the top side of the license part driver_rect ; and
  • the bottom side of the title title_rect may be designated as the bottom side of the license part driver_rect .
  • driver_rect (left) = face_rect (left) , Formula (17)
  • driver_rect (right) = stamp_rect (right) , Formula (18)
  • driver_rect (top) = face_rect (top) , Formula (19)
  • driver_rect (bottom) = title_rect (bottom) . Formula (20)
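  • As an illustrative sketch, Formulas (5) to (20) might be coded as follows, with each rectangle given as a dict holding its 'left', 'right', 'top', and 'bottom' coordinates; the function and key names are assumptions.

```python
def locate_license_part(mode, title, stamp, face):
    """Compose driver_rect from the targets' rectangles per Formulas (5)-(20);
    each rectangle is a dict with 'left', 'right', 'top', 'bottom'."""
    if mode == "rotate_0":       # Formulas (5)-(8)
        return {"left": stamp["left"], "right": face["right"],
                "top": title["top"], "bottom": face["bottom"]}
    if mode == "rotate_-90":     # Formulas (9)-(12)
        return {"left": face["left"], "right": title["right"],
                "top": stamp["top"], "bottom": face["bottom"]}
    if mode == "rotate_90":      # Formulas (13)-(16)
        return {"left": title["left"], "right": face["right"],
                "top": face["top"], "bottom": stamp["bottom"]}
    return {"left": face["left"], "right": stamp["right"],   # Formulas (17)-(20)
            "top": face["top"], "bottom": title["bottom"]}
```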
  • the image to be processed may be rotated according to the rotation mode.
  • the image to be processed may be rotated by the adjustment module 330 according to the rotation mode determined in operation 1820.
  • a license part may be obtained in operation 1820.
  • the license part may have four rotation modes, and a rotation correction may be performed on the image to be processed to obtain a license part having the normal mode, so as to facilitate subsequent text detection and/or recognition.
  • the rotation correction may include one of the following operations:
  • for the rotated-90-degree-clockwise mode, the license part may be rotated 90 degrees anticlockwise;
  • for the rotated-90-degree-anticlockwise mode, the license part may be rotated 90 degrees clockwise; and
  • for the rotated-180-degree-clockwise mode, the license part may be rotated 180 degrees anticlockwise.
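  • A minimal sketch of this rotation correction using OpenCV's fixed-angle rotation codes; the mapping from mode labels to rotations follows the list above, and the function name is an assumption.

```python
import cv2

ROTATION_FIX = {
    "rotate_-90": cv2.ROTATE_90_COUNTERCLOCKWISE,  # undo a 90-degree clockwise tilt
    "rotate_90": cv2.ROTATE_90_CLOCKWISE,          # undo a 90-degree anticlockwise tilt
    "rotate_-180": cv2.ROTATE_180,                 # undo a 180-degree tilt
}

def correct_rotation(license_part, mode):
    """Return the license part in the normal mode."""
    if mode == "rotate_0":
        return license_part
    return cv2.rotate(license_part, ROTATION_FIX[mode])
```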
  • process 1800 may further include: cropping (e.g., by the adjustment module 330) the image to be processed or the processed image, so that only the first part is included in the image to be processed or the processed image.
  • the second part 1902 may be cropped out, leaving only the first part 1901 in the image to be processed 1900.
  • the image to be processed 1900 may be processed to have the form of the reference image (or template) as shown in FIG. 13.
  • the process for image processing may include: obtaining positions of at least three targets in an image to be processed, wherein the image to be processed may include a first image part representing a first document, and the at least three targets may be parts of the first document having fixed positions in the first document and sharing the same or similar features; determining a rotation mode of the image to be processed based on the positions of the at least three targets; and rotating the image to be processed according to the rotation mode.
  • Such a process may be used to automatically process images uploaded by users to correct directions of the images to a correct direction, so as to improve the image processing efficiency.
  • FIG. 29 is a schematic diagram illustrating an image to be processed in Situation 5 according to some embodiments of the present disclosure.
  • FIG. 29 illustrates another possible angle of inclination of the image uploaded by a user.
  • the rectangular part of the first part of the image to be processed including the document may not be parallel or perpendicular to either side of the image to be processed.
  • a rotation angle for rotating the image to be processed may be determined, and the image to be processed may be rotated according to the determined rotation angle.
  • FIG. 30 is a schematic diagram illustrating a pattern for processing the image to be processed in Situation 5 of FIG. 29 according to some embodiments of the present disclosure.
  • a first line 3011 may be obtained by connecting the centers of the rectangular frames of the stamp and the image of the owner in the image to be processed.
  • the first line 3011 may be compared to a direction A of a second line 3012 formed by connecting the centers of the rectangular frames of the stamp and the image of the owner in the reference image, so as to obtain an angle ⁇ between the directions of the image to be processed and the reference image.
  • the image to be processed may then be rotated according to the angle ⁇ .
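  • A minimal sketch of this angle-based correction, assuming the centers of the stamp and of the image of the owner are known in both the image to be processed and the reference image; the sign convention assumes image coordinates with the Y axis pointing down, and all names are illustrative.

```python
import math
import cv2

def rotate_to_reference(image, stamp_center, face_center, ref_angle_deg):
    """Rotate `image` so that the line from the stamp center to the owner
    image center matches the reference direction (direction A in FIG. 30)."""
    dx = face_center[0] - stamp_center[0]
    dy = face_center[1] - stamp_center[1]
    current_deg = math.degrees(math.atan2(dy, dx))
    alpha = current_deg - ref_angle_deg  # angle between the two lines
    h, w = image.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), alpha, 1.0)
    return cv2.warpAffine(image, m, (w, h))
```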
  • the processing of an image to be processed in Situation 5 may also be performed using a correction matrix.
  • the adjustment module 330 may obtain a correction matrix based on a mapping relationship between the positions of the three targets in the driving license and the positions of the three targets in the reference image. Then the adjustment module 330 may apply the correction matrix to the image to be processed or the first part to obtain a corrected image.
  • aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc. ) , or in an implementation combining software and hardware that may all generally be referred to herein as a “unit, ” “module, ” or “system. ” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
  • a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof.
  • a computer readable signal medium may be any computer readable medium that may be not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
  • Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB. NET, Python or the like, conventional procedural programming languages, such as the “C” programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby, and Groovy, or other programming languages.
  • the program code may execute entirely on the user’s computer, partly on the user’s computer, as a stand-alone software package, partly on the user’s computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user’s computer through any type of network, including a local area network (LAN) or a wide area network (WAN) , or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a Software as a Service (SaaS) .
  • the numbers expressing quantities or properties used to describe and claim certain embodiments of the application are to be understood as being modified in some instances by the term “about, ” “approximate, ” or “substantially. ”
  • “about,” “approximate,” or “substantially” may indicate a ±20% variation of the value it describes, unless otherwise stated.
  • the numerical parameters set forth in the written description and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by a particular embodiment.
  • the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the application are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable.

Abstract

A method and system for image processing are provided in the present disclosure. The method may include obtaining a first image including an object in a first representation. The method may also include determining at least one first position of the object in the first image. The method may further include generating a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object. The second image may include the object in the second representation.

Description

METHODS AND SYSTEMS FOR IMAGE PROCESSING
CROSS REFERENCE
This application claims priority of Chinese Application No. 201810349840.7 filed on April 18, 2018, and priority of Chinese Application No. 201810374653.4 filed on April 24, 2018, the entire contents of each of which are hereby incorporated by reference.
TECHNICAL FIELD
The present disclosure generally relates to a method and system for image processing, and particularly relates to a method and system for processing an image captured by a mobile terminal.
BACKGROUND
With recent developments, mobile network technology has been playing an increasingly important role in people’s daily lives. All kinds of daily activities, such as identity verification, financial transactions, and online payments, can be performed via the mobile network. Usually, to enable the above activities, a user needs to upload some documents or images to confirm his/her identity and/or qualification. For example, in order to pay a bill or provide a service via the mobile Internet, a user needs to upload an image of his/her ID card for authentication. The uploaded image(s) must satisfy a series of conditions so as to provide clear and accurate information. As another example, to extract text from a document such as a paper document, an image (e.g., a photo) of the document may first be obtained, and optical character recognition (OCR) may be performed on the image to extract the required text. To successfully perform the OCR, the quality of the image has to be above a certain standard. Usually, such an image is obtained by specific imaging equipment (e.g., a scanner), and the OCR is performed by OCR software packaged with the imaging equipment. Such an approach, however, is neither convenient nor cost-efficient.
Mobile terminals are becoming an indispensable part of people’s daily life. By using images captured by a mobile terminal for identity/qualification confirmation and/or OCR, the efficiency can be greatly improved and the cost can be reduced. However, due to luminance intensity, complexity of the background, human factors, etc., an image taken by a mobile terminal often suffers from inferior quality. For example, the image taken by a mobile terminal is often tilted or distorted. Consequently, images taken by a mobile terminal may fail to satisfy the standard for, e.g., identity/qualification confirmation and OCR.
Therefore, it is desirable to provide methods and systems for more efficient and accurate processing of images, especially images captured with mobile terminals.
SUMMARY
According to an aspect of the present disclosure, a method for processing an image is provided. The method may include obtaining a first image including an object in a first representation. The method may also include determining at least one first position of the object in the first image. The method may further include generating a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object. The second image may include the object in the second representation.
In some embodiments, the second representation may be related to at least one of a reference size of the object, a reference image occupation ratio of the object, or a reference direction with respect to the object. The first representation may be related to at least one of an original size, an original image occupation ratio, or an original direction of the object.
In some embodiments, the determining the at least one first position may include: detecting a plurality of peripheral lines of the object in the first image, and determining the at least one first position based on the plurality of peripheral lines.
In some embodiments, the adjusting the first image based on the at least one first position and the second representation with respect to the object may include: determining, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position; obtaining a correction matrix based on the at least one first position and the at least one second position; and applying the correction matrix to the first image.
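As one possible illustration of this step (not the only implementation contemplated by the disclosure), for a rectangular object with four detected vertices the correction matrix may be computed as a perspective transform between the at least one first position and the at least one second position. A minimal Python/OpenCV sketch, with assumed names:

```python
import cv2
import numpy as np

def apply_correction_matrix(first_image, first_positions, second_positions,
                            out_width, out_height):
    # Both point lists must be ordered consistently, e.g., top-left,
    # top-right, bottom-right, bottom-left.
    matrix = cv2.getPerspectiveTransform(np.float32(first_positions),
                                         np.float32(second_positions))
    return cv2.warpPerspective(first_image, matrix, (out_width, out_height))
```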
In some embodiments, the detecting the plurality of peripheral lines of the object may include: detecting a plurality of line segments associated with the object in the first image by applying a line segment detector to the first image; filtering the plurality of line segments to obtain a plurality of filtered line segments, wherein the filtering may be based at least in part on directions of the plurality of line segments; and determining the plurality of peripheral lines based on the plurality of filtered line segments.
In some embodiments, the filtering may be further based on confidence scores of the plurality of line segments.
In some embodiments, the method may further include: identifying, from the plurality of line segments, line segments along a same straight line; and updating the plurality of line segments by combining the line segments identified as being along a same straight line.
In some embodiments, the plurality of filtered line segments may include a  plurality of line segment sets corresponding to the plurality of peripheral lines. The first image may include a plurality of predetermined parts corresponding to the plurality of peripheral lines. The filtering the plurality of line segments to obtain the plurality of filtered line segments may include, for each of the plurality of predetermined parts, selecting, from the plurality of line segments, a set of line segments in the predetermined part as one of the plurality of line segment sets, wherein the direction of each of the set of line segments may be within a preset range associated with the predetermined part. The determining the plurality of peripheral lines based on the plurality of filtered line segments may include, for each of the plurality of line segment sets, identifying a longest line segment of the line segment set as the corresponding peripheral line of the object.
In some embodiments, the at least one first position may include positions of one or more vertices of the object in the first image, and the determining the at least one first position based on the plurality of peripheral lines may include determining an intersection of each adjacent pair of the plurality of peripheral lines as the one or more vertices.
In some embodiments, the determining, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position may include: determining a first size of the object in the first image based on the plurality of peripheral lines; determining a reference size of the object based on the first size and the second representation; and determining the at least one second position based on the reference size.
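One illustrative reading of this step for a rectangular object: take the first size from the lengths of the detected peripheral lines, scale it to a reference long edge, and place the second positions at the corners of the resulting rectangle. The reference edge length below is an assumed value, not one stated in the disclosure.

```python
import numpy as np

def second_positions_from_size(peripheral_lines, ref_long_edge=856):
    # peripheral_lines: dict with 'top', 'bottom', 'left', 'right',
    # each a segment (x1, y1, x2, y2).
    def length(s):
        return np.hypot(s[2] - s[0], s[3] - s[1])

    first_w = (length(peripheral_lines['top'])
               + length(peripheral_lines['bottom'])) / 2
    first_h = (length(peripheral_lines['left'])
               + length(peripheral_lines['right'])) / 2
    ref_w = ref_long_edge
    ref_h = int(round(ref_long_edge * first_h / first_w))
    # Second positions ordered top-left, top-right, bottom-right, bottom-left.
    corners = [(0, 0), (ref_w - 1, 0), (ref_w - 1, ref_h - 1), (0, ref_h - 1)]
    return corners, (ref_w, ref_h)
```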
In some embodiments, the at least one first position may correspond to at least one part of the object, and the determining the at least one first position may include recognizing the at least one part of the object in the first image using an object recognition technique.
In some embodiments, the adjusting the first image based on the at least one first position and the second representation with respect to the object may include: determining a rotation mode based on the at least one first position and the second representation; and rotating the first image according to the rotation mode.
In some embodiments, the determining the rotation mode based on the at least one first position and the second representation may include: determining at least one second position corresponding to the at least one first position based on the second representation with respect to the object; and determining the rotation mode based on a mapping relationship between the at least one first position and the at least one second position.
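For example, when the rotation mode is restricted to multiples of 90 degrees, it may be chosen as the candidate rotation whose point mapping brings the at least one first position closest to the at least one second position. A sketch under that assumption (the names and the normalization scheme are illustrative):

```python
import cv2

ROTATION_FLAGS = {0: None,
                  90: cv2.ROTATE_90_COUNTERCLOCKWISE,
                  180: cv2.ROTATE_180,
                  270: cv2.ROTATE_90_CLOCKWISE}

def rotate_point(p, angle, w, h):
    # Map a pixel of a w x h image to its position after a counter-
    # clockwise rotation by a multiple of 90 degrees.
    x, y = p
    if angle == 90:
        return (y, w - 1 - x)
    if angle == 180:
        return (w - 1 - x, h - 1 - y)
    if angle == 270:
        return (h - 1 - y, x)
    return (x, y)

def rotation_mode(first_pts, second_pts, first_size, second_size):
    # first_pts/second_pts: matching part centers (e.g., title, owner
    # photo, stamp); sizes are (width, height) of the two images.
    w, h = first_size

    def norm(p, size):
        return (p[0] / size[0], p[1] / size[1])

    ref = [norm(p, second_size) for p in second_pts]
    best_angle, best_err = 0, float('inf')
    for angle in ROTATION_FLAGS:
        size = (w, h) if angle in (0, 180) else (h, w)
        moved = [norm(rotate_point(p, angle, w, h), size) for p in first_pts]
        err = sum((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
                  for a, b in zip(moved, ref))
        if err < best_err:
            best_angle, best_err = angle, err
    return best_angle
```

The selected mode can then be applied with cv2.rotate, e.g., cv2.rotate(first_image, ROTATION_FLAGS[best_angle]) when the flag is not None.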
In some embodiments, the object recognition technique is based on a convolutional neural network (CNN) model.
In some embodiments, the object may be a document, and the at least one part of the object may include at least a first part including a title of the document, a second part including an image of the owner, and a third part including a stamp, a signature, a district identifier, another image of the owner, or a bar code.
In some embodiments, the method may further include cropping the second image or the first image so that the second image or the first image only includes the object.
According to another aspect of the present disclosure, a system for image processing is provided. The system may include at least one storage medium including a set of instructions, and at least one processor in communication with the at least one storage medium. When executing the set of instructions, the at least one processor may be directed to obtain a first image including an object in a first representation. The at least one processor may also be directed to determine at least one first position of the object in the first image. The at least one processor may  further be directed to generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object. The second image may include the object in the second representation.
According to yet another aspect of the present disclosure, a system for image processing is provided. The system may include a first image module, a first position module, and an adjustment module. The first image module may be configured to obtain a first image including an object in a first representation. The first position module may be configured to determine at least one first position of the object in the first image. The adjustment module may be configured to generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object. The second image may include the object in the second representation.
According to yet another aspect of the present disclosure, a non-transitory computer readable medium including instructions for image processing is provided. When executed by a processor of an electronic device, the instructions may direct the electronic device to execute an image processing process. The image processing process may include obtaining a first image including an object in a first representation. The image processing process may also include determining at least one first position of the object in the first image. The image processing process may further include generating a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object. The second image may include the object in the second representation.
Additional features will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following and the accompanying drawings or may be learned by production or operation of the examples. The features of the present disclosure may be realized and attained  by practice or use of various aspects of the methodologies, instrumentalities and combinations set forth in the detailed examples discussed below.
BRIEF DESCRIPTIONS OF THE DRAWINGS
The present disclosure is further described in terms of exemplary embodiments. These exemplary embodiments are described in detail with reference to the drawings. These embodiments are non-limiting exemplary embodiments, in which like reference numerals represent similar structures throughout the several views of the drawings, and wherein:
FIG. 1 is a schematic diagram illustrating an exemplary online-to-offline service system according to some embodiments of the present disclosure;
FIG. 2 is a schematic diagram illustrating an exemplary computing device;
FIG. 3 is a schematic diagram illustrating an exemplary image processing device according to some embodiments of the present disclosure;
FIG. 4 is a flowchart illustrating an exemplary process for image processing according to some embodiments of the present disclosure;
FIG. 5 is a flowchart illustrating an exemplary process for image processing according to some embodiments of the present disclosure;
FIG. 6 is a flowchart illustrating an exemplary process for image processing according to some embodiments of the present disclosure;
FIG. 7 is a flowchart illustrating an exemplary process for image processing according to some embodiments of the present disclosure;
FIG. 8 is a schematic diagram illustrating a source image including an ID card according to some embodiments of the present disclosure;
FIG. 9 is a schematic diagram illustrating line segments detected by operating an LSD algorithm on the source image of FIG. 8 according to some  embodiments of the present disclosure;
FIG. 10 is a schematic diagram illustrating peripheral lines determined based on line segments of FIG. 9 according to some embodiments of the present disclosure;
FIG. 11 is a schematic diagram illustrating a determination of the vertices of the object to be recognized based on intersections of the peripheral lines of FIG. 10 according to some embodiments of the present disclosure;
FIG. 12 is a schematic diagram illustrating a corrected image obtained based on the vertices of FIG. 11 according to some embodiments of the present disclosure;
FIG. 13 is a schematic diagram illustrating an exemplary template according to some embodiments of the present disclosure;
FIG. 14 is a schematic diagram illustrating an image to be processed in Situation 1 according to some embodiments of the present disclosure;
FIG. 15 is a schematic diagram illustrating an image to be processed in Situation 2 according to some embodiments of the present disclosure;
FIG. 16 is a schematic diagram illustrating an image to be processed in Situation 3 according to some embodiments of the present disclosure;
FIG. 17 is a schematic diagram illustrating an image to be processed in Situation 4 according to some embodiments of the present disclosure;
FIG. 18 is a flowchart illustrating an exemplary process for processing an image according to some embodiments of the present disclosure;
FIG. 19 is a schematic diagram illustrating an enlarged view of the image to be processed in FIG. 14;
FIG. 20 is a schematic diagram illustrating an exemplary pattern for obtaining positions of targets according to some embodiments of the present disclosure;
FIG. 21 is a schematic diagram illustrating positions of targets in Situation 1 of FIG. 14 according to some embodiments of the present disclosure;
FIG. 22 is a schematic diagram illustrating positions of targets in Situation 2 of FIG. 15 according to some embodiments of the present disclosure;
FIG. 23 is a schematic diagram illustrating positions of targets in Situation 3 of FIG. 16 according to some embodiments of the present disclosure;
FIG. 24 is a schematic diagram illustrating positions of targets in Situation 4 of FIG. 17 according to some embodiments of the present disclosure;
FIG. 25 is a schematic diagram illustrating positions of targets in Situation 1 of FIG. 14 according to some embodiments of the present disclosure;
FIG. 26 is a schematic diagram illustrating positions of targets in Situation 2 of FIG. 15 according to some embodiments of the present disclosure;
FIG. 27 is a schematic diagram illustrating positions of targets in Situation 3 of FIG. 16 according to some embodiments of the present disclosure;
FIG. 28 is a schematic diagram illustrating positions of targets in Situation 4 of FIG. 17 according to some embodiments of the present disclosure;
FIG. 29 is a schematic diagram illustrating an image to be processed in Situation 5 according to some embodiments of the present disclosure; and
FIG. 30 is a schematic diagram illustrating a pattern for processing the image to be processed in Situation 5 of FIG. 29 according to some embodiments of the present disclosure.
DETAILED DESCRIPTION
The present disclosure generally relates to a method and system for processing an image so that the content of the image may be displayed in a standard form. For example, the image may include a document such as an ID card, a license, a bank card, a certificate, a passport, paper document including text, etc. By processing the image, the direction and/or the occupation of the document with respect  to the image may be adjusted according to a standard template, a desired format, or a reference image.
The following description is presented to enable any person skilled in the art to make and use the present disclosure, and is provided in the context of a particular application and its requirements. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the claims.
The flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments of the present disclosure. It is to be expressly understood that the operations of the flowcharts need not be implemented in the order shown. Conversely, the operations may be implemented in an inverted order, or simultaneously. Moreover, one or more other operations may be added to the flowcharts, and one or more operations may be removed from the flowcharts.
Moreover, while the systems and methods disclosed in the present disclosure are described primarily regarding online-to-offline transportation service, it should also be understood that this is only one exemplary embodiment. The system or method of the present disclosure may be applied to any other kind of online-to-offline service. For example, the system or method of the present disclosure may be applied to different transportation systems including land, ocean, aerospace, or the like, or any combination thereof. The vehicle of the transportation systems may include a taxi, a private car, a hitch, a bus, a vessel, an aircraft, a driverless vehicle, a bicycle, a tricycle, a motorcycle, or the like, or any combination thereof. The transportation system may also include any transportation system that applies management and/or distribution, for example, a  system for transmitting and/or receiving an express, or a system for a take-out service. The application scenarios of the system or method of the present disclosure may include a web page, a plug-in of a browser, a client terminal, a custom system, an internal analysis system, an artificial intelligence robot, or the like, or any combination thereof.
The terms “passenger, ” “requester, ” “service requester, ” and “customer” in the present disclosure are used interchangeably to refer to an individual, an entity or a tool that may request or order a service. Also, the terms “driver, ” “provider, ” “service provider, ” and “supplier” in the present disclosure are used interchangeably to refer to an individual, an entity, or a tool that may provide a service or facilitate the providing of the service. The term “user” in the present disclosure may refer to an individual, an entity, or a tool that may request a service, order a service, provide a service, or facilitate the providing of the service. For example, the user may be a passenger, a driver, an operator, or the like, or any combination thereof. In the present disclosure, terms “passenger” and “passenger terminal” may be used interchangeably, and terms “driver” and “driver terminal” may be used interchangeably.
The term “service request” in the present disclosure refers to a request that is initiated by a passenger, a requester, a service requester, a customer, a driver, a provider, a service provider, a supplier, or the like, or any combination thereof. The service request may be accepted by any one of a passenger, a requester, a service requester, a customer, a driver, a provider, a service provider, or a supplier. The service request may be chargeable or free.
The positioning technology used in the present disclosure may include a global positioning system (GPS) , a global navigation satellite system (GLONASS) , a compass navigation system (COMPASS) , a Galileo positioning system, a quasi-zenith satellite system (QZSS) , a wireless fidelity (WiFi) positioning technology, or the like, or  any combination thereof. One or more of the above positioning technologies may be used interchangeably in the present disclosure.
An aspect of the present disclosure relates to systems and methods for displaying information relating to an online-to-offline service (e.g., a taxi service). In order to help a passenger who initiates a service request of the taxi service to identify a vehicle of a driver who accepts the service request easily and quickly, an online-to-offline service platform may generate an image showing a type of the vehicle of the driver, a color of the vehicle of the driver, a plate number of the vehicle of the driver, and/or a mark on a surface of the vehicle of the driver, and showing the vehicle of the driver from a perspective of the passenger. Alternatively or additionally, the online-to-offline service platform may generate a map showing a real-time position of the vehicle of the driver and real-time positions of other vehicles surrounding the vehicle of the driver. In order to help the passenger to monitor the process of the online-to-offline service without unlocking the passenger’s smart phone when the passenger’s smart phone is locked, the online-to-offline service platform may determine information corresponding to the process of the online-to-offline service, and send the information corresponding to the process of the online-to-offline service along with a display instruction to the passenger’s smart phone. The display instruction may prompt the passenger’s smart phone to display the information corresponding to the process of the online-to-offline service on a lock screen interface of the passenger’s smart phone when the passenger’s smart phone is locked.
It should be noted that online-to-offline transportation service, such as online taxi service, is a new form of service rooted only in the post-Internet era. It provides technical solutions to users and service providers that could arise only in the post-Internet era. In the pre-Internet era, when a user hails a taxi on the street, the taxi request and acceptance occur only between the passenger and one taxi driver that sees the passenger. If the passenger hails a taxi through a telephone call, the service request and acceptance may occur only between the passenger and one service provider (e.g., one taxi company or agent). Online taxi service, however, obtains transaction requests in real-time and automatically. The online taxi service also allows a user of the service to distribute, in real-time and automatically, a service request to a vast number of individual service providers (e.g., taxi drivers) a distance away from the user, and allows a plurality of service providers to respond to the service request simultaneously and in real-time. Therefore, through the Internet, the online-to-offline transportation systems may provide a much more efficient transaction platform for the users and the service providers that might never meet in a traditional pre-Internet transportation service system.
In the following detailed description, numerous specific details are set forth by way of examples in order to provide a thorough understanding of the relevant disclosure. However, it should be apparent to those skilled in the art that the present disclosure may be practiced without such details. In other instances, well known methods, procedures, systems, components, and/or circuitry have been described at a relatively high-level, without detail, in order to avoid unnecessarily obscuring aspects of the present disclosure. Various modifications to the disclosed embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present disclosure. Thus, the present disclosure is not limited to the embodiments shown, but is to be accorded the widest scope consistent with the claims.
The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms “a” , “an” , and “the” may be intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprise” , “comprises” , and/or “comprising” , “include” , “includes” , and/or  “including” , when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It will be understood that the terms “system,” “unit,” “module,” and/or “block” used herein are one method to distinguish different components, elements, parts, sections, or assemblies of different levels in ascending order. However, the terms may be displaced by another expression if they achieve the same purpose.
Generally, the word “module, ” “sub-module, ” “unit, ” or “block, ” as used herein, refers to logic embodied in hardware or firmware, or to a collection of software instructions. A module, a unit, or a block described herein may be implemented as software and/or hardware and may be stored in any type of non-transitory computer-readable medium or another storage device. In some embodiments, a software module/unit/block may be compiled and linked into an executable program. It will be appreciated that software modules can be callable from other modules/units/blocks or from themselves, and/or may be invoked in response to detected events or interrupts.
Software modules/units/blocks configured for execution on computing devices (e.g., processor 210 as illustrated in FIG. 2) may be provided on a computer-readable medium, such as a compact disc, a digital video disc, a flash drive, a magnetic disc, or any other tangible medium, or as a digital download (and can be originally stored in a compressed or installable format that needs installation, decompression, or decryption prior to execution). Such software code may be stored, partially or fully, on a storage device of the executing computing device, for execution by the computing device. Software instructions may be embedded in firmware, such as an EPROM. It will be further appreciated that hardware modules/units/blocks may be included in connected logic components, such as gates and flip-flops, and/or can be comprised of programmable units, such as programmable gate arrays or processors. The modules/units/blocks or computing device functionality described herein may be implemented as software modules/units/blocks, but may be represented in hardware or firmware. In general, the modules/units/blocks described herein refer to logical modules/units/blocks that may be combined with other modules/units/blocks or divided into sub-modules/sub-units/sub-blocks despite their physical organization or storage. The description may be applicable to a system, an engine, or a portion thereof.
It will be understood that when a unit, engine, module or block is referred to as being “on, ” “connected to, ” or “coupled to, ” another unit, engine, module, or block, it may be directly on, connected or coupled to, or communicate with the other unit, engine, module, or block, or an intervening unit, engine, module, or block may be present, unless the context clearly indicates otherwise. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
These and other features, and characteristics of the present disclosure, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, may become more apparent upon consideration of the following description with reference to the accompanying drawings, all of which form a part of this disclosure. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended to limit the scope of the present disclosure.
FIG. 1 is a schematic diagram illustrating an exemplary online-to-offline service system 100 according to some embodiments of the present disclosure. For example, the online-to-offline service system 100 may be an online-to-offline transportation service system for transportation services such as taxi hailing, chauffeur services, delivery service, carpool, bus service, take-out service, driver hiring and shuttle services. For brevity, the methods and/or systems described in the present  disclosure may take a taxi service as an example. It should be noted that the taxi service is merely provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. For persons having ordinary skills in the art, the methods and/or systems described in the present disclosure may be applied to other similar situations, such as a take-out service, a delivery service, etc.
The online-to-offline service system 100 may include a server 110, a network 120, a requester terminal 130, a provider terminal 140, a storage device 150, and a positioning system 160. The server 110, the requester terminal 130, or the provider terminal 140 may be configured to implement methods for image processing described in the present disclosure. In some embodiments, the server 110, the requester terminal 130, or the provider terminal 140, may be implemented on a computing device 200 having one or more components illustrated in FIG. 2 in the present disclosure.
In some embodiments, the server 110 may be a single server or a server group. The server group may be centralized, or distributed (e.g., the server 110 may be a distributed system) . In some embodiments, the server 110 may be local or remote. For example, the server 110 may access information and/or data stored in the requester terminal 130, the provider terminal 140, and/or the storage device 150 via the network 120. As another example, the server 110 may be directly connected to the requester terminal 130, the provider terminal 140, and/or the storage device 150 to access stored information and/or data. In some embodiments, the server 110 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof. In some embodiments, the server 110 may be implemented on a computing device 200 having one or more components illustrated in FIG. 2 in the present disclosure. In some embodiments, the server 110 may include a processing engine 112. The processing  engine 112 may process information and/or data related to the online-to-offline service.
The network 120 may facilitate the exchange of information and/or data. In some embodiments, one or more components in the online-to-offline service system 100 (e.g., the server 110, the requester terminal 130, the provider terminal 140, the storage device 150, and the positioning system 160) may send information and/or data to other component(s) in the online-to-offline service system 100 via the network 120. For example, the server 110 may obtain/acquire a service request from the requester terminal 130 via the network 120. In some embodiments, the network 120 may be any type of wired or wireless network, or a combination thereof. Merely by way of example, the network 120 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, the Internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public telephone switched network (PSTN), a Bluetooth™ network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network 120 may include one or more network access points. For example, the network 120 may include wired or wireless network access points such as base stations and/or internet exchange points 120-1, 120-2, …, through which one or more components of the online-to-offline service system 100 may be connected to the network 120 to exchange data and/or information.
In some embodiments, a requester may be a user of the requester terminal 130. In some embodiments, the user of the requester terminal 130 may be someone other than the requester. For example, a user A of the requester terminal 130 may use the requester terminal 130 to send a service request for a user B, or receive service and/or information or instructions from the server 110. In some embodiments, a provider may be a user of the provider terminal 140. In some embodiments, the user of the provider terminal 140 may be someone other than the provider. For example, a user C of the provider terminal 140 may use the provider terminal 140 to receive a service request for a user D, and/or information or instructions from the server 110.
In some embodiments, the requester terminal 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a built-in device in a motor vehicle 130-4, or the like, or any combination thereof. In some embodiments, the mobile device 130-1 may include a smart home device, a wearable device, a mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home device may include a control device of an intelligent electrical apparatus, a smart monitoring device, a smart television, a smart video camera, an interphone, or the like, or any combination thereof. In some embodiments, the wearable device may include a bracelet, footgear, glasses, a helmet, a watch, clothing, a backpack, a smart accessory, or the like, or any combination thereof. In some embodiments, the mobile device may include a mobile phone, a personal digital assistant (PDA), a gaming device, a navigation device, a point of sale (POS) device, a laptop, a desktop, or the like, or any combination thereof. In some embodiments, the virtual reality device and/or the augmented reality device may include a virtual reality helmet, virtual reality glasses, a virtual reality patch, an augmented reality helmet, augmented reality glasses, an augmented reality patch, or the like, or any combination thereof. For example, the virtual reality device and/or the augmented reality device may include a Google Glass™, a RiftCon™, a Fragments™, a Gear VR™, etc. In some embodiments, the built-in device in the motor vehicle 130-4 may include an onboard computer, an onboard television, etc. In some embodiments, the requester terminal 130 may be a device with positioning technology for locating the position of a user of the requester terminal 130 (e.g., a service requester) and/or the requester terminal 130.
In some embodiments, the provider terminal 140 may be a device that is similar to, or the same as the requester terminal 130. For example, the provider terminal 140 may also be or include a mobile device 140-1, a tablet computer 140-2, a laptop computer 140-3, a built-in device in a motor vehicle 140-4, which are the same as or similar to the mobile device 130-1, the tablet computer 130-2, the laptop computer 130-3, the built-in device in a motor vehicle 130-4. In some embodiments, the provider terminal 140 may be a device utilizing positioning technology for locating the position of a user of the provider terminal 140 (e.g., a service provider) and/or the provider terminal 140. In some embodiments, the requester terminal 130 and/or the provider terminal 140 may communicate with one or more other positioning devices to determine the position of the requester, the requester terminal 130, the provider, and/or the provider terminal 140. In some embodiments, the requester terminal 130 and/or the provider terminal 140 may send positioning information to the server 110.
The storage device 150 may store data and/or instructions. In some embodiments, the storage device 150 may store data obtained from the requester terminal 130 and/or the provider terminal 140. In some embodiments, the storage device 150 may store data and/or instructions that the server 110 may execute or use to perform exemplary methods described in the present disclosure. In some embodiments, the storage device 150 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM), or the like, or any combination thereof. Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc. Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc. Exemplary volatile read-and-write memory may include a random access memory (RAM). Exemplary RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), and a zero-capacitor RAM (Z-RAM), etc. Exemplary ROM may include a mask ROM (MROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically-erasable programmable ROM (EEPROM), a compact disk ROM (CD-ROM), and a digital versatile disk ROM, etc. In some embodiments, the storage device 150 may be implemented on a cloud platform. Merely by way of example, the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
In some embodiments, the storage device 150 may be connected to the network 120 to communicate with one or more components in the online-to-offline service system 100 (e.g., the server 110, the requester terminal 130, the provider terminal 140, etc.). One or more components in the online-to-offline service system 100 may access the data or instructions stored in the storage device 150 via the network 120. In some embodiments, the storage device 150 may be directly connected to or communicate with one or more components in the online-to-offline service system 100 (e.g., the server 110, the requester terminal 130, the provider terminal 140, etc.). In some embodiments, one or more components in the online-to-offline service system 100 (e.g., the server 110, the requester terminal 130, the provider terminal 140, etc.) may have permission to access the storage device 150. In some embodiments, the storage device 150 may be part of the server 110.
The positioning system 160 may determine information associated with an object, for example, the requester terminal 130, the provider terminal 140, etc. For example, the positioning system 160 may determine a current location of the requester terminal 130. In some embodiments, the positioning system 160 may be a global positioning system (GPS) , a global navigation satellite system (GLONASS) , a compass navigation system (COMPASS) , a BeiDou navigation satellite system, a Galileo  positioning system, a quasi-zenith satellite system (QZSS) , etc. The information may include a location, an elevation, a velocity, or an acceleration of the object, or a current time. The location may be in the form of coordinates, such as, latitude coordinate and longitude coordinate, etc. The positioning system 160 may include one or more satellites, for example, a satellite 160-1, a satellite 160-2, and a satellite 160-3. The satellites 160-1 through 160-3 may determine the information mentioned above independently or jointly. The satellite positioning system 160 may send the information mentioned above to the network 120, the requester terminal 130, or the provider terminal 140 via wireless connections.
In some embodiments, the exchange of information between one or more components in the online-to-offline service system 100 may be achieved by way of requesting a service. The object of the service request may be any product. In some embodiments, the product may be a tangible product or an immaterial product. The tangible product may include food, medicine, commodity, chemical product, electrical appliance, clothing, car, housing, luxury, or the like, or any combination thereof. The immaterial product may include a servicing product, a financial product, a knowledge product, an internet product, or the like, or any combination thereof. The internet product may include an individual host product, a web product, a mobile internet product, a commercial host product, an embedded product, or the like, or any combination thereof. The mobile internet product may be used in software of a mobile terminal (e.g., a mobile application), a program, a system, or the like, or any combination thereof. For example, the product may be any software and/or application used in the computer or mobile phone. The software and/or application may relate to socializing, shopping, transporting, entertainment, learning, investment, or the like, or any combination thereof. In some embodiments, the software and/or application relating to transporting may include a traveling software and/or application, a vehicle scheduling software and/or application, a mapping software and/or application, etc. In the vehicle scheduling software and/or application, the vehicle may include a horse, a carriage, a rickshaw (e.g., a wheelbarrow, a bike, a tricycle, etc.), a car (e.g., a taxi, a bus, a private car, etc.), a vessel, an aircraft (e.g., an airplane, a helicopter, an unmanned aerial vehicle (UAV)), or the like, or any combination thereof.
In some embodiments, the server 110 may require a user of the requester terminal 130 or the provider terminal 140 to verify his/her identity and/or qualification so that the user may be allowed to receive or provide a service via the online-to-offline system 100. The user may upload an image of a document (or images of documents) to verify his/her identity and/or qualification. The document may be or include his/her ID card, license (e.g., a driving license) , certificate, bank card, passport, credential, etc. Such an image may be captured by a camera of the requester terminal 130 or the provider terminal 140. In many cases, the image taken by the user may not satisfy one or more requirements of the online-to-offline system 100. For example, the document may be distorted, tilted or rotated with respect to the image, etc.
In some embodiments, the processing engine 112 may process the image uploaded by the user so that the processed image may have a standard format or form required by the online-to-offline system 100. Alternatively, the requester terminal 130 or the provider terminal 140 may process the image before uploading the image to the server 110. The process for the image processing may be described in the following part of the present disclosure.
It is noted that the online-to-offline service system 100 is only provided for demonstration purposes, and not intended to be limiting. The image processing process described in the present disclosure may also have other application scenarios. For example, the image processing process may also be performed for preprocessing an image of a paper document for optical character recognition (OCR).
FIG. 2 is a schematic diagram illustrating an exemplary computing device 200. Computing device 200 may be configured to implement an apparatus for image processing (e.g., the server 110, the requester terminal 130, the provider terminal 140, the processing engine 112) , and perform one or more operations disclosed in the present disclosure. The computing device 200 may be configured to implement various modules, units, and their functionalities described in the present disclosure.
The computing device 200 may include a bus 270, a processor 210 (or a plurality of processors 210) , a read only memory (ROM) 230, a random access memory (RAM) 240, a storage device 220 (e.g., massive storage device such as a hard disk, an optical disk, a solid-state disk, a memory card, etc. ) , an input/output (I/O) port 250, and a communication interface 260. It may be noted that, the architecture of the computing device 200 illustrated in FIG. 2 is only for demonstration purposes, and not intended to be limiting. The computing device 200 may be any device capable of performing a computation.
The bus 270 may couple various components of the computing device 200 and facilitate transferring of data and/or information between them. The bus 270 may have any bus structure in the art. For example, the bus 270 may be or may include a memory bus and/or a peripheral bus. The I/O port 250 may allow a transferring of data and/or information between the bus 270 and one or more other devices (e.g., a touch screen, a keyboard, a mouse, a microphone, a display, a speaker). The communication interface 260 may allow a transferring of data and/or information between the network 120 and the bus 270. For example, the communication interface 260 may be or may include a network interface card (NIC), a Bluetooth™ module, an NFC module, etc.
In some embodiments, via at least one of the I/O port 250 and the communication interface 260, the computing device 200 may receive images to be processed (e.g., from the requester terminal 130 or the provider terminal 140) and/or output the generated processed images.
The ROM 230, the RAM 240, and/or the storage device 220 may be configured to store instructions that may be executed by the processor 210. The RAM 240 and/or the storage device 220 may also store data and/or information generated by the processor 210 during the execution of the instructions. In some embodiments, at least one of the ROM 230, the RAM 240, or the storage device 220 may implement the storage device 150 illustrated in FIG. 1.
The processor 210 may be or include any processor in the art configured to execute instructions stored in the ROM 230, the RAM 240, and/or the storage device 220, so as to perform one or more operations or implement one or more modules/units disclosed in the present disclosure. Merely by way of example, the processor 210 may include one or more hardware processors, such as a central processing unit (CPU) , an application-specific integrated circuit (ASIC) , an application-specific instruction-set processor (ASIP) , a graphics processing unit (GPU) , a physics processing unit (PPU) , a digital signal processor (DSP) , a field-programmable gate array (FPGA) , a programmable logic device (PLD) , a controller, a microcontroller unit, a reduced instruction-set computer (RISC) , a microprocessor, or the like, or any combination thereof.
In some embodiments, the computing device 200 may include a plurality of processors 210. The plurality of processors 210 may operate in parallel for performing one or more operations disclosed in the present disclosure.
In some embodiments, one or more of the components of the computing device 200 may be implemented on a single chip. For example, the processor 210, the ROM 230, and the RAM 240 may be integrated into a single chip.
In some embodiments, the computing device 200 may be a single device or include a plurality of computing devices having a same or similar architecture as  illustrated in FIG. 2. In some embodiments, the computing device 200 may implement a personal computer (PC) or any other type of work station or terminal device. The computing device 200 may also act as a server if appropriately programmed.
FIG. 3 is a schematic diagram illustrating an exemplary image processing device 300 according to some embodiments of the present disclosure. The image processing device 300 may be an example of the processing engine 112, the requester terminal 130, or the provider terminal 140 (illustrated in FIG. 1). In some embodiments, the image processing device 300 may be implemented by the computing device 200 illustrated in FIG. 2. The image processing device 300 may include a first image module 310, a first position module 320, and an adjustment module 330.
The first image module 310 may be configured to obtain a first image including an object (also referred to as an object to be recognized) in a first representation. The first representation may represent the original form in which the object is displayed.
The first position module 320 may be configured to determine at least one first position of the object in the first image. In some embodiments, the at least one first position may include positions of one or more vertices of the object in the first image. In some embodiments, the at least one first position may include positions of one or more parts of the object in the first image.
The adjustment module 330 may be configured to generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object. The second image (also referred to as a processed image, a corrected image, or a standard image) may include the object in the second representation. The second representation may represent the desired form or the standard form in which the object is displayed.
In some embodiments, the second representation may be related to at least  one of a reference size of the object, a reference size ratio of the object to the image including the object, and/or a reference direction with respect to the object. Correspondingly, the first representation in the first image may also be related to at least one of the original size of the object, the original size ratio of the object to the image including the object, and/or an original direction of the object.
The image processing device 300 may perform any one of the processes illustrated in FIGs. 4 to 7 and 18 to process an image. Detailed description of the image processing device 300 and the modules thereof may be provided elsewhere in the present disclosure (e.g., in connection with FIGs. 4 to 7 and 18).
It is noted that, the above descriptions about the image processing device 300 are only for illustration purposes, and not intended to limit the present disclosure. It is understood that, after learning the major concept and the mechanism of the present disclosure, a person of ordinary skill in the art may alter the image processing device 300 in an uncreative manner. The alteration may include combining and/or splitting modules, adding or removing optional modules, etc. All such modifications are within the protection scope of the present disclosure.
FIG. 4 is a flowchart illustrating an exemplary process 400 for image processing according to some embodiments of the present disclosure. Process 400 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140) . For example, the process 400 illustrated in FIG. 4 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
In 410, the first image module 310 may obtain a first image including an object in a first representation. The object may be a polygon including a plurality of  vertices (e.g., 3, 4, 5, 6, 8) . In some embodiments, the object to be recognized may be rectangular or square and may include 4 vertices. For example, the object to be recognized may be an ID card, a license (e.g., a driving license) , a bank card, a certificate, a paper document, or the like, or a combination thereof, or a part thereof.
In some embodiments, the first image may be a source image captured by a camera of a terminal (e.g., the requester terminal 130, the provider terminal 140) . The first image module 310 may obtain the first image from the requester terminal 130, the provider terminal 140, or a storage device (e.g., the storage device 150, the storage device 220) , etc.
In 420, the first position module 320 may determine at least one first position of the object in the first image.
In some embodiments, the at least one first position may include positions of one or more vertices of the object in the first image. For example, the object may be rectangular, and the at least one first position may be or include the positions of the four vertices of the object in the first image (e.g., as illustrated in FIG. 11) . The positions of the one or more vertices may be in the form of coordinates of a coordinate system with respect to the first image. The coordinate system may be any proper coordinate system such as a Cartesian coordinate system, a spherical coordinate system, a polar coordinate system, etc.
In some embodiments, to determine the positions of the one or more vertices of the object in the first image, the first position module 320 may detect a plurality of peripheral lines of the object in the first image, and determine the at least one first position based on the plurality of peripheral lines. As used herein, a peripheral line may be at least a part of the corresponding edge of the object. In some embodiments, the first position module 320 may determine an intersection of each adjacent pair of the plurality of peripheral lines as the one or more vertices. When an adjacent pair of the plurality of peripheral lines are spatially separated from each other, e.g., when at least one of the peripheral lines is only a part of the corresponding edge of the object, the “intersection” may be the crossing point obtained by extending the adjacent pair of the plurality of peripheral lines.
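The intersection of two peripheral lines (with implicit extension, as noted above) can be computed with the standard two-line formula; a short sketch, assuming each peripheral line is given by two endpoints:

```python
def line_intersection(line_a, line_b):
    # Each line is (x1, y1, x2, y2); both lines are treated as infinite,
    # so spatially separated peripheral lines are extended until they cross.
    x1, y1, x2, y2 = line_a
    x3, y3, x4, y4 = line_b
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < 1e-9:
        return None  # (nearly) parallel lines: no usable vertex
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    px = (a * (x3 - x4) - (x1 - x2) * b) / denom
    py = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (px, py)
```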
In some embodiments, the first position module 320 may detect the plurality of peripheral lines by adopting a line segment detector (LSD) algorithm, or a variant thereof. By applying the LSD algorithm to the first image, the first position module 320 may detect a plurality of line segments associated with the object in the first image. The first position module 320 may then filter the plurality of line segments to obtain filtered line segments, and determine the plurality of peripheral lines based on the filtered line segments. In some embodiments, the filtering may be based at least in part on directions (e.g., in the form of a vector, a slope, or an angle of inclination) of the plurality of line segments. In some embodiments, the filtering may be based further on confidence scores of the plurality of line segments.
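A minimal detection sketch using OpenCV's LSD implementation is shown below. Note that createLineSegmentDetector is absent from some OpenCV builds for licensing reasons; cv2.ximgproc.createFastLineDetector from opencv-contrib is a common substitute. The file name is an illustrative assumption.

```python
import cv2

first_image = cv2.imread("first_image.jpg")          # illustrative path
gray = cv2.cvtColor(first_image, cv2.COLOR_BGR2GRAY)

lsd = cv2.createLineSegmentDetector()
lines, widths, precisions, nfa = lsd.detect(gray)    # lines: N x 1 x 4
segments = [tuple(l[0]) for l in lines] if lines is not None else []

# Each segment is (x1, y1, x2, y2); its direction can be taken as
# atan2(y2 - y1, x2 - x1) for the direction-based filtering described above.
```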
In some embodiments, before, after, or during the filtering, the first position module 320 may update the plurality of line segments by combining line segments identified as being along a same straight line. As a peripheral line of the object is often detected as multiple disconnected line segments (e.g., due to over-exposure or under-exposure of the first image), such an updating may increase the accuracy of the determination of the peripheral lines.
In some embodiments, the filtered line segments may include a plurality of line segment sets corresponding to the plurality of peripheral lines, and the first image may include a plurality of predetermined parts corresponding to the plurality of peripheral lines. To filter the plurality of line segments to obtain filtered line segments, for each of the plurality of predetermined parts of the first image, the first position module 320 may select, from the plurality of line segments, a set of line segments in the predetermined  part, wherein the direction of each of the set of line segments is within a preset range associated with the predetermined part. For example, when the object is rectangular, the plurality of predetermined parts of the first image may include a top part, a left part, a bottom part, and a right part of the first image. Each set of line segments may be referred to as a peripheral line set, based on which a corresponding peripheral line (e.g., a top/left/bottom/right peripheral line of the object) may be determined. In some embodiments, to determine the plurality of peripheral lines based on the plurality of filtered line segments, the first position module 320 may identify, for each of the plurality of line segment sets, the longest line segment of the line segment set as the corresponding peripheral line of the object.
Detailed descriptions of the filtering or the selecting of the line segments may be found elsewhere in the present disclosure (e.g., in connection with FIGs. 6 and 7) .
After obtaining the plurality of peripheral lines of the object, the first position module 320 may determine the at least one first position based on the plurality of peripheral lines. In some embodiments, the at least one first position may include positions of one or more parts of the object in the first image. For example, when the object is a license (e.g., the driving license), the one or more parts may include at least a first part including a title of the license and a second part including an image of the owner (e.g., the face of the owner). Optionally, the one or more parts may further include a third part. For licenses in different districts, the third part may include a stamp, a signature (e.g., of the owner), a district identifier (e.g., a national flag, a national name), another image of the owner, or a code bar (e.g., a QR code). In some embodiments, the positions of the one or more parts may correspond to the coordinates of their center points (or any other proper points such as the top left vertices).
In some embodiments, the first position module 320 may recognize the at least one part of the object in the first image and obtain its position in the first image using an object recognition technique. In different embodiments, the object recognition technique may be based on a support vector machine (SVM) algorithm, a neural network algorithm, a face identification algorithm, or the like, or a combination thereof, or a variant thereof. In some specific embodiments, the object recognition technique may be based on a convolutional neural network (CNN) algorithm, or a variant thereof.
Detailed descriptions of the object recognition may be found elsewhere in the present disclosure (e.g., in connection with FIG. 18) .
In 430, the adjustment module 330 may generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object. The second image may include the object in the second representation.
In some embodiments, the second representation may be related to at least one of a reference size of the object, a reference image occupation ratio of the object, and/or a reference direction with respect to the object. Correspondingly, the first representation in the first image may also be related to at least one of the original size of the object, the original image occupation ratio of the object, and/or an original direction of the object. For example, when the object is displayed in the first representation, the object may be tilted with respect to the image (the first image), distorted, and/or have a low image occupation ratio (e.g., less than 50%). After the adjustment performed by the adjustment module 330, the object may be displayed in the second representation. For example, when the object is displayed in the second representation, the tilt and/or distortion of the object may be reduced, and/or the image occupation ratio of the object may be increased (e.g., above 90%).
In some embodiments, the adjustment module 330 may determine, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position. For example, the  at least one first position may include positions (e.g., in the form of coordinates) of the vertices of the object in the first image. Correspondingly, the at least one second position may include desired positions (e.g., in the form of coordinates) of the vertices of the object in the processed image (second image) . Based on the at least one first position and the at least one second position, the adjustment module 330 may obtain a correction matrix for correcting (or processing) the first image. The adjustment module 330 may apply the correction matrix on the first image to obtain the second image.
In some embodiments, the correction matrix may be obtained (or computed) based at least in part on a mapping relationship between the at least one first position and the at least one second position. The correction matrix may be used to translate, rotate, and/or scale the object, and/or reduce the distortion of the object. The adjustment module 330 may then process the first image or the image part including the object using the correction matrix via, e.g., a convolution operation. The resulting image may be the second image.
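Merely by way of example, when the at least one first position and the at least one second position are the coordinates of the four vertices, the correction matrix may be realized as a perspective (homography) transform. Below is a minimal Python sketch using OpenCV; the vertex coordinates, file name, and output size are illustrative assumptions:

import cv2
import numpy as np

# First positions: vertices of the object as detected in the first image.
first_positions = np.float32([[120, 80], [520, 60], [540, 380], [100, 360]])
# Second positions: desired vertices of the object in the second image.
second_positions = np.float32([[0, 0], [440, 0], [440, 300], [0, 300]])

matrix = cv2.getPerspectiveTransform(first_positions, second_positions)
first_image = cv2.imread("source.jpg")  # hypothetical file name
second_image = cv2.warpPerspective(first_image, matrix, (440, 300))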
In some embodiments, the correction matrix may be obtained based further on other factors with respect to the first image. For example, the correction matrix may be obtained based further on color/luminance/contrast information of the first image. By applying the correction matrix on the first image or the image part including the object, the obtained second image may also have desired color/luminance/contrast, which may form a part of the second representation.
In some embodiments, the adjustment module 330 may select, based on the at least one first position, an operation from a plurality of predetermined operations, and process the first image using the selected operation. For example, the adjustment module 330 may select, based on the at least one first position, a rotation mode from a plurality of predetermined rotation modes, and rotate the first image or the image part including the object using the selected rotation mode to obtain the second image, in  which the object may have a desired direction with respect to the second image. Detailed description of rotation modes may be found elsewhere in the present disclosure (e.g., in connection with FIGs. 12 to 30) .
In some embodiments, the adjustment module 330 may further crop the second image, so that the obtained image may only include the object (or at least a part thereof) or be predominantly covered by the object (e.g., with an image occupation ratio of 95% or more).
It is noted that the above descriptions of process 400 are only for demonstration purposes, and not intended to be limiting. It is understandable that, after learning the major concept of the present disclosure, a person of ordinary skill in the art may alter process 400 in an uncreative manner. For example, the operations above may be implemented in an order different from that illustrated in FIG. 4. One or more optional operations may be added to the flowcharts. One or more operations may be split or be combined. All such modifications are within the scope of the present disclosure.
FIG. 5 is a flowchart illustrating an exemplary process 500 for image processing according to some embodiments of the present disclosure. Process 500 may be an example of process 400 illustrated in FIG. 4. Process 500 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140) . For example, the process 500 illustrated in FIG. 5 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
In 510, for each of the vertices of the object to be recognized, actual coordinates of the vertex with respect to a first coordinate system may be identified based on a plurality of peripheral lines of the object to be recognized in a source image.  The source image may be the first image obtained via operation 410 of process 400 illustrated in FIG. 4. Operation 510 may be performed by the first position module 320.
In 520, a standard size of the object to be recognized may be determined based on the plurality of peripheral lines of the object to be recognized, and standard coordinates of the vertices of the object to be recognized with respect to the first coordinate system may be obtained based on the standard size. Operation 520 may be performed by the first position module 320.
Operations  510 and 520 may correspond to operation 420 of process 400 illustrated in FIG. 4.
In 530, a distortion correction matrix may be obtained based on the actual coordinates and the standard coordinates of the vertices, and a corrected image (second image) may be obtained by performing an image correction on the source image using the distortion correction matrix. Operation 530 may be performed by the adjustment module 330.
In some embodiments, the image processing apparatus may be implemented via a computer program, such as software or an application. Alternatively, the image processing apparatus may be a storage medium storing a related computer program, such as a Universal Serial Bus (USB) flash drive. Alternatively, the image processing apparatus may be a physical device integrated or installed with a related computer program, such as a chip, a smart phone, or a computer.
In some embodiments, process 400 may be automatically initiated in response to receiving a source image. The source image may be obtained by shooting or be input by a user. In some specific embodiments, the peripheral lines of the object to be recognized in the source image may be obtained first. Then the position (at least one position) of each vertex of the object to be recognized may be determined based on the peripheral lines, wherein the position is the actual coordinates of each vertex in the source image with respect to the first coordinate system.
In some embodiments, the vertices of the object to be recognized may be determined based on the peripheral lines of the object to be recognized with improved accuracy. In some embodiments, operation 510 may include: determining an intersection of each adjacent pair of the peripheral lines of the object to be recognized as a corresponding vertex of the object to be recognized; and obtaining actual coordinates of the vertices of the object to be recognized in the source image with respect to the first coordinate system.
For example, FIG. 11 is a schematic diagram illustrating a determination of the vertices of the object to be recognized based on intersections of the peripheral lines of FIG. 10 according to some embodiments of the present disclosure. The small circles (e.g., circles 1111 to 1114) in FIG. 11 are the determined vertices, and the coordinates of the vertices with respect to the coordinate system illustrated in FIG. 11 are the actual coordinates of the vertices. In the present embodiment, based on the intersections of the adjacent peripheral lines of the object to be recognized, the vertices of the object to be recognized may be quickly and accurately determined.
Next, based on the peripheral lines of the object to be recognized, the standard size of the object to be recognized may be determined, thereby obtaining the standard position (e.g., in the form of coordinates (also referred to as standard coordinates)) of each vertex. In some specific embodiments, the standard size and the standard position may be set based on the image effect (second representation) desired for the finally obtained standard image (second image). For example, if it is desired that, in the finally obtained standard image, the object to be recognized is to occupy the entire image, the standard size may be (or be based on) the size of the object to be recognized in the source image. As another example, it may be desired that, in the finally obtained standard image, the object to be recognized is not tilted. Correspondingly, when determining the standard coordinates of the vertices of the object to be recognized, the standard coordinates of each vertex may be determined, based on the standard size, along the horizontal and vertical directions, so that in the standard image, the top and bottom edges of the object to be recognized are parallel with the horizontal direction, and the left and right edges of the object to be recognized are parallel with the vertical direction; that is, the object to be recognized in the standard image is not tilted.
In some embodiments, when the object to be recognized is to cover the whole image after the above correction, on the basis of any embodiment provided in the present disclosure, the obtaining, based on the standard size, standard coordinates of the vertices of the object to be recognized with respect to the first coordinate system in operation 520 may include: setting the standard coordinates of any one vertex of the object to be recognized as the origin of the first coordinate system; and obtaining the standard coordinates of the other vertices of the object to be recognized based on the standard coordinates of the any one vertex and the standard size.
The first coordinate system may be prebuilt. For example, the first coordinate system may be built with the top left vertex of the source image as the origin, with the X-axis pointing rightward and the Y-axis pointing downward. To make the object to be recognized fully cover the standard image, in some embodiments, the standard coordinates of any one vertex of the object to be recognized may be set as the origin of the first coordinate system. For example, if the object to be recognized is rectangular, the standard coordinates of the top left vertex of the object to be recognized may be set as the origin (0, 0). The standard coordinates of the other vertices may be determined based on the standard size of the object to be recognized. For example, when it is desired that the object to be recognized in the standard image is not tilted, and the obtained standard size of the object to be recognized includes a standard width W_norm and a standard height H_norm, it may be determined that the standard coordinates of the top right vertex are (W_norm, 0), the standard coordinates of the bottom left vertex are (0, H_norm), and the standard coordinates of the bottom right vertex are (W_norm, H_norm).
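Merely by way of example, the following minimal Python sketch derives the standard coordinates from a given standard size as described above; the function name and example size are illustrative only:

def standard_coordinates(w_norm, h_norm):
    # Top left vertex at the origin; the other vertices follow from the standard size.
    return {
        "top_left": (0, 0),
        "top_right": (w_norm, 0),
        "bottom_left": (0, h_norm),
        "bottom_right": (w_norm, h_norm),
    }

print(standard_coordinates(440, 300))  # illustrative standard width and height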
Via the above embodiments, a precise cropping of the object to be recognized in the source image may be achieved.
After obtaining the actual coordinates and standard coordinates of each vertex, a transformation matrix (also referred to as a distortion correction matrix) for correcting the source image may be obtained based on the actual coordinates and standard coordinates of each vertex. By processing the source image with the distortion correction matrix, the corrected image (standard image) may be obtained.
It is understood that the peripheral lines of the object to be recognized may define the outline of the object to be recognized. For example, for an object to be recognized having a quadrilateral outline, there may be four peripheral lines, including a top peripheral line, a bottom peripheral line, a left peripheral line, and a right peripheral line. For an object to be recognized having a triangular outline, there may be three peripheral lines, which are the three sides of the triangle, respectively.
In some embodiments, process 500 may be adopted to optimize the image quality of an image of a document. The document may be an ID card, a business card, a bank card, or a license. Correspondingly, on the basis of any embodiment provided in the present disclosure, the peripheral lines of the object to be recognized may include a top peripheral line, a bottom peripheral line, a left peripheral line, and a right peripheral line. Then the determining of a standard size of the object to be recognized based on the plurality of peripheral lines of the object to be recognized in operation 520 may include operation 650: determining a standard width and a standard height of the object to be recognized based on the plurality of peripheral lines of the object to be recognized (as illustrated in FIG. 6).
In actual applications, the object to be recognized is often a document (e.g., an ID card, a certificate, a paper document) which is usually rectangular. Therefore, the present embodiment may be described by way of examples in connection with a rectangular object to be recognized. For a rectangular object to be recognized, the peripheral lines thereof may be the four sides of the rectangle. According to their relative positions, the peripheral lines of the object to be recognized may include a top peripheral line, a bottom peripheral line, a left peripheral line, and a right peripheral line. Additionally, the size of the rectangle may depend on the width (lateral length) and height (longitudinal length) thereof. Correspondingly, the determining of the standard size of the object to be recognized may be treated as determining the standard width and the standard height of the object to be recognized.
In some embodiments, to ensure the resolution of the corrected image and to avoid image distortion, a strategy for enlarging the image may be adopted when determining the standard width and the standard height. Correspondingly, in some embodiments, operation 650 may include: designating the greater of the lengths of the top peripheral line and the bottom peripheral line as the standard width, and designating the greater of the lengths of the left peripheral line and the right peripheral line as the standard height.
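Merely by way of example, a minimal Python sketch of this enlarging strategy is given below, assuming each peripheral line is represented by its two endpoints (x1, y1, x2, y2); the function names are illustrative only:

import math

def segment_length(seg):
    x1, y1, x2, y2 = seg
    return math.hypot(x2 - x1, y2 - y1)

def standard_size(top, bottom, left, right):
    # Take the longer of each pair of opposite peripheral lines so that the
    # corrected image is enlarged rather than shrunk.
    w_norm = max(segment_length(top), segment_length(bottom))
    h_norm = max(segment_length(left), segment_length(right))
    return w_norm, h_norm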
Process 500 may be adopted in actual application scenarios, in which a rectangular object is commonly the object to be recognized and corrected. Process 500 may improve the result of the image processing and is well suited to such common application scenarios.
It is noted that the above descriptions of process 500 are only for demonstration purposes, and not intended to be limiting. It is understandable that, after learning the major concept of the present disclosure, a person of ordinary skill in the art may alter process 500 in an uncreative manner. For example, the operations above may be implemented in an order different from that illustrated in FIG. 5. One or more optional operations may be added to the flowcharts. One or more operations may be split or be combined. All such modifications are within the scope of the present disclosure.
In some embodiments, in order to obtain the positions of the vertices of the object to be recognized based on the peripheral lines, the obtaining the peripheral lines of the object to be recognized may be performed first, which may be achieved using various approaches. Taking a rectangular object to be recognized as an example, an exemplary image processing process is illustrated in FIG. 6.
FIG. 6 is a flowchart illustrating an exemplary process 600 for image processing according to some embodiments of the present disclosure. Process 600 builds on process 500 of FIG. 5 and processes an image including a rectangular object to be recognized. Process 600 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140). For example, the process 600 illustrated in FIG. 6 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
In 610, a plurality of line segments may be identified using a line detection algorithm. Operation 610 may be performed by the first position module 320.
In 620, a top peripheral line set, a bottom peripheral line set, a left peripheral line set, and a right peripheral line set may be obtained by filtering the plurality of line segments using the corresponding filtering conditions. The filtering conditions may represent one or more features of the respective peripheral lines of the object to be  recognized. Operation 620 may be performed by the first position module 320.
In 630, the longest line segment may be selected from each of the top peripheral line set, the bottom peripheral line set, the left peripheral line set, and the right peripheral line set, respectively, as the top peripheral line, the bottom peripheral line, the left peripheral line, and the right peripheral line of the object to be recognized. Operation 630 may be performed by the first position module 320.
Operations 610 to 630 may be performed to obtain the plurality of peripheral lines in operation 510 of process 500 illustrated in FIG. 5.
For example, FIG. 8 is a schematic diagram illustrating a source image 800 including an ID card 801 according to some embodiments of the present disclosure. The source image 800 may have a width W and a height H. It can be seen that the ID card 801 in the source image 800 is tilted and slightly distorted. After obtaining the source image 800, first, a line segment detector (LSD) algorithm may be adopted to identify a plurality of line segments in the source image. For example, FIG. 9 is a schematic diagram illustrating line segments detected by operating an LSD algorithm on the source image 800 of FIG. 8 according to some embodiments of the present disclosure. The line segments in FIG. 9 (e.g., line segments 901 to 909) are the line segments detected via the LSD algorithm.
Due to the performance of the LSD algorithm adopted and the quality of the source image, a peripheral line of the object to be recognized may be fully detected (e.g., line segment 903) or not fully detected (e.g., line segment 901), multiple parts of the same peripheral line may be detected as multiple line segments (e.g., line segments 904 and 905), and one or more noise line segments (e.g., line segments 906 to 909) may also be detected in the source image.
Line segments that do not belong to the peripheral lines of the object to be  recognized (e.g., a document) may be removed by filtering the plurality of line segments of the object to be recognized. The line segments belonging to the peripheral lines of the object to be recognized may be grouped according to their relative positions, e.g., at the top part, the bottom part, the left part, or the right part of the source image or the object to be recognized. The filtering conditions of the peripheral line groups may be set based on the features of the corresponding peripheral lines. A line segment satisfying one of the filtering conditions may be assigned to a corresponding peripheral line set.
In some embodiments, the filtering may be based at least in part on directions of the plurality of line segments. An exemplary process is described in connection with FIG. 7.
In some embodiments, the filtering may be based at least in part on confidence scores of the plurality of line segments. The confidence score of a line segment may be determined and associated with the line segment when the line segment is detected by the first position module 320 using the LSD algorithm.
Next, as a line segment along a peripheral line of the object to be recognized is generally longer than a noise line segment, the peripheral lines of the object to be recognized may be obtained by selecting the longest line segment from each of the peripheral line sets. For example, the peripheral lines (e.g., line segments 901 to 904) of an ID card determined via the above processing are shown in FIG. 10. FIG. 10 is a schematic diagram illustrating peripheral lines determined based on line segments of FIG. 9 according to some embodiments of the present disclosure.
In embodiments of the present disclosure, the filtering conditions may be set according to the features of the peripheral lines of the object to be recognized. The detected line segments that may possibly be parts of the peripheral lines may be assigned to a plurality of peripheral line groups, from which the peripheral lines of the object to be recognized may then be determined. The whole process is simple. Without occupying excessive computing resources, the peripheral lines of the object to be recognized may be accurately detected.
In some embodiments, in 630, the first position module 320 may optionally combine line segments detected via the LSD algorithm that are along the same straight line. Usually, line segments along the same straight line may be parts of a peripheral line of the object to be recognized (e.g., line segments 904 and 905) , which may be detected as broken because of the over-exposure or under-exposure of the source image. The combining of the line segments along the same straight line may increase the length of the detected part of a peripheral line, thereby improving the accuracy of the determination of the peripheral lines.
The first position module 320 may identify, from the plurality of line segments detected via the LSD algorithm, line segments along the same straight line. For example, the first position module 320 may identify the line segments along the same straight line based at least in part on the directions (e.g., measured by slopes or angles of inclination) and the positions of the plurality of line segments. In some embodiments, line segments having the same direction or similar directions (e.g., within ±5%), in the same part (e.g., the top/bottom/left/right part) of the source image, and/or close to each other (e.g., with a distance within 20 pixels) may be determined as being along the same straight line. The first position module 320 may then update the plurality of line segments by combining the line segments identified as being along the same straight line. In different embodiments, to combine the identified line segments, the first position module 320 may connect the closest ends of the identified line segments, create a new line segment based on the coordinates of the ends of the identified line segments (e.g., by fitting), or simply treat or label the identified line segments as a single line segment (although the identified line segments may actually be spatially separated).
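Merely by way of example, the following minimal Python sketch combines segments identified as being along the same straight line, using a simplified criterion (an assumed angle-difference tolerance in degrees, and the 20-pixel endpoint distance mentioned above); segments are (x1, y1, x2, y2) tuples and the function names and tolerances are illustrative only:

import math

def inclination(seg):
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def endpoint_gap(a, b):
    pts_a = [(a[0], a[1]), (a[2], a[3])]
    pts_b = [(b[0], b[1]), (b[2], b[3])]
    return min(math.dist(p, q) for p in pts_a for q in pts_b)

def merge_collinear(a, b):
    # Keep the two endpoints that are farthest apart as the combined segment.
    pts = [(a[0], a[1]), (a[2], a[3]), (b[0], b[1]), (b[2], b[3])]
    p, q = max(((p, q) for p in pts for q in pts), key=lambda pq: math.dist(*pq))
    return (*p, *q)

def combine_segments(segments, angle_tol=5.0, dist_tol=20.0):
    # Note: the angle comparison is simplified and ignores the 0/180 degree wraparound.
    merged = list(segments)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                a, b = merged[i], merged[j]
                if (abs(inclination(a) - inclination(b)) <= angle_tol
                        and endpoint_gap(a, b) <= dist_tol):
                    merged[i] = merge_collinear(a, b)
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged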
In 640, for each of the vertices of the object to be recognized, actual coordinates of the vertex with respect to a first coordinate system may be identified based on a plurality of peripheral lines of the object to be recognized in the source image (first image). Operation 640 may be performed by the first position module 320. Operation 640 may be the same as or similar to operation 510, the descriptions of which are not repeated herein.
In 650, a standard width and a standard height of the object to be recognized may be determined based on the plurality of peripheral lines of the object to be recognized, and standard coordinates of the vertices of the object to be recognized with respect to the first coordinate system may be obtained based on the standard width and the standard height. Operation 650 has been described elsewhere (e.g., in connection with FIG. 5), and the descriptions are not repeated herein.
In 660, a distortion correction matrix may be obtained based on the actual coordinates and the standard coordinates of the vertices, and a corrected image (second image) may be obtained by performing an image correction on the first image using the distortion correction matrix. Operation 660 may be performed by the adjustment module 330. Operation 660 may be the same as or similar to operation 530, the descriptions of which are not repeated herein.
It is noted that the above descriptions of process 600 are only for demonstration purposes, and not intended to be limiting. It is understandable that, after learning the major concept of the present disclosure, a person of ordinary skill in the art may alter process 600 in an uncreative manner. For example, the operations above may be implemented in an order different from that illustrated in FIG. 6. One or more optional operations may be added to the flowcharts. One or more operations may be split or be combined. All such modifications are within the scope of the present disclosure.
Various features of the peripheral lines of a rectangular object to be recognized may be used for setting the above filtering conditions. Exemplary features are described in connection with FIG. 7. FIG. 7 is a flowchart illustrating an exemplary process 700 for image processing according to some embodiments of the present disclosure. Process 700 builds on process 600 of FIG. 6 and processes an image including a rectangular object to be recognized. Process 700 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140). For example, the process 700 illustrated in FIG. 7 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
In 710, a plurality of line segments may be identified using a line detection algorithm. Operation 710 may be performed by the first position module 320. Operation 710 may be the same as or similar to operation 610.
In 720, from the left part and the right part of the source image, line segments satisfying that the angle of inclination with respect to, e.g., the vertical direction or the Y axis, is within a preset range and the confidence score is higher than a predetermined threshold may be selected as the left peripheral line set and the right peripheral line set, respectively. Operation 720 may be performed by the first position module 320.
In 730, from the top part and the bottom part of the source image, line segments satisfying that the angle of inclination with respect to, e.g., the horizontal direction or the X axis, is within the preset range and the confidence score is higher than a predetermined threshold may be selected as the top peripheral line set and the bottom peripheral line set, respectively. Operation 730 may be performed by the first position module 320.
Operations  720 and 730 may correspond to operation 620 of process 600 illustrated in FIG. 6.
The parts of the source image used for the filtering in operations 720 and 730 may be predetermined based on actual needs. In some specific embodiments, the confidence scores of all the line segments need to be higher than corresponding thresholds, wherein the thresholds associated with different peripheral lines (e.g., top, bottom, left, and right) may be the same or different. The confidence score may be obtained based on the aforementioned LSD algorithm, and may represent the probability that the detected object is a line segment. Therefore, it is possible to remove the detected objects that are obviously not line segments by setting one or more thresholds with respect to the confidence scores. Further, in some embodiments, a grouping of the detected line segments may be performed based on the positions of the line segments. For example, the left peripheral line may be located in the left part of the source image, the right peripheral line may be located in the right part of the source image, and so on. In some embodiments, considering the user's operating habits, the angle of inclination (or slope) of the object to be recognized may be limited to a certain range. For example, a line segment whose angle of inclination is obviously out of the range is less likely to be a peripheral line (or a part thereof) of the object to be recognized. In some embodiments, the filtering conditions may be set according to one or more of the aforementioned factors. It is understood that there may also be embodiments in which the filtering conditions are set in combination with other features.
Further, there may be various implementations for performing the above filtering process. For example, the process 700 as shown in FIG. 7 may further include: obtaining a width W and a height H of the source image, and angles of inclination L[θ]_i of the line segments with respect to the horizontal direction; and building a second coordinate system, taking the top left vertex of the source image as the origin, with an X axis pointing rightward along the horizontal direction and a Y axis pointing downward along the vertical direction.
Correspondingly, operation 720 may include: designating line segments satisfying:

L[x]_i ≤ W/2 and 90° - θ ≤ |L[θ]_i| ≤ 90°,    Formula (1)

with a confidence score greater than a predetermined threshold as the left peripheral line set; and designating line segments satisfying:

L[x]_i ≥ W/2 and 90° - θ ≤ |L[θ]_i| ≤ 90°,    Formula (2)

with a confidence score greater than a predetermined threshold as the right peripheral line set. Operation 730 may include: designating line segments satisfying:

L[y]_i ≤ H/2 and |L[θ]_i| ≤ θ,    Formula (3)

with a confidence score greater than a predetermined threshold as the top peripheral line set; and designating line segments satisfying:

L[y]_i ≥ H/2 and |L[θ]_i| ≤ θ,    Formula (4)

with a confidence score greater than a predetermined threshold as the bottom peripheral line set. In the above formulas, L[x]_i denotes the horizontal coordinates (X) of the ends of the line segments with respect to the second coordinate system, L[y]_i denotes the vertical coordinates (Y) of the ends of the line segments with respect to the second coordinate system, and θ is the maximum allowable angle of inclination of the object to be recognized (e.g., a document) with respect to the horizontal direction.
In some embodiments, the second coordinate system may be the coordinate system as shown in FIG. 10. The second coordinate system may be the same as or different from the first coordinate system. The filtering conditions may be in the form of a plurality of formulas based on various parameters of the source image. In actual applications, the object to be recognized in a source image is generally at or near the center of the entire source image; therefore, W/2 may be used as a threshold for distinguishing the left and right peripheral lines in the horizontal direction, and H/2 may be used as a threshold for distinguishing the top and bottom peripheral lines in the vertical direction. Then the plurality of line segments may be classified based on the image parts where the ends of each line segment are located. In addition, in order to reduce the complexity of the algorithm, in some embodiments, the angle of inclination of the object to be recognized may be limited to a preset range. For example, the angle of inclination of the object to be recognized with respect to the horizontal direction may be set as no more than 20 degrees. That is, the angles of inclination, with respect to the horizontal direction, of the top and bottom peripheral lines (which should be parallel to the horizontal direction) do not exceed 20 degrees, and the angles of inclination, with respect to the horizontal direction, of the left and right peripheral lines (which should be perpendicular to the horizontal direction) are not less than 90-20 degrees, i.e., 70 degrees.
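Merely by way of example, the following minimal Python sketch implements filtering conditions consistent with Formulas (1) to (4) above, and then selects the longest segment of each set as in operation 740; each input is assumed to be a tuple (x1, y1, x2, y2, angle, score), where angle is the angle of inclination in degrees and score is the confidence score, and the function names and default thresholds are illustrative only:

import math

def filter_peripheral_sets(segments, W, H, theta=20.0, min_score=0.5):
    sets = {"top": [], "bottom": [], "left": [], "right": []}
    for x1, y1, x2, y2, angle, score in segments:
        if score <= min_score:  # remove detections that are likely not line segments
            continue
        seg = (x1, y1, x2, y2)
        if abs(angle) <= theta:  # roughly horizontal: top or bottom candidate
            if max(y1, y2) <= H / 2:
                sets["top"].append(seg)
            elif min(y1, y2) >= H / 2:
                sets["bottom"].append(seg)
        elif abs(angle) >= 90.0 - theta:  # roughly vertical: left or right candidate
            if max(x1, x2) <= W / 2:
                sets["left"].append(seg)
            elif min(x1, x2) >= W / 2:
                sets["right"].append(seg)
    return sets

def longest_per_set(sets):
    # The longest segment of each set is taken as the corresponding peripheral line.
    return {name: max(segs, key=lambda s: math.hypot(s[2] - s[0], s[3] - s[1]))
            for name, segs in sets.items() if segs}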
In the above embodiments, the peripheral lines of a rectangular object to be recognized may be filtered based on features of the rectangular object without using a complex algorithm, thereby reducing computational complexity and improving the efficiency of the image recognition and the correction.
FIG. 12 is a schematic diagram illustrating a corrected image obtained based on the vertices of FIG. 11 according to some embodiments of the present disclosure. By processing the source image with a process of any one of the above embodiments, the object to be recognized may be precisely cropped out, and the distortion thereof may be corrected as well.
In 740, the longest line segment may be selected from each of the top peripheral line set, the bottom peripheral line set, the left peripheral line set, and the right peripheral line set, respectively, as the top peripheral line, the bottom peripheral line, the left peripheral line, and the right peripheral line of the object to be recognized. Operation 740 may be performed by the first position module 320. Operation 740 may be the same as or similar to operation 630, the descriptions of which are not repeated herein.
In 750, for each of the vertices of the object to be recognized, actual coordinates of the vertex with respect to a first coordinate system may be identified based on a plurality of peripheral lines of the object to be recognized in the source image (first image). Operation 750 may be performed by the first position module 320. Operation 750 may be the same as or similar to operations 510 and 640, the descriptions of which are not repeated herein.
In 760, a standard width and a standard height of the object to be recognized may be determined based on the plurality of peripheral lines of the object to be recognized, and standard coordinates of the vertices of the object to be recognized with respect to the first coordinate system may be obtained based on the standard width and the standard height. Operation 760 may be performed by the first position module 320. Operation 760 may be the same as or similar to operation 650, the descriptions of which are not repeated herein.
In 770, a distortion correction matrix may be obtained based on the actual coordinates and the standard coordinates of the vertices, and a corrected image (second image) may be obtained by performing an image correction on the first image using the distortion correction matrix. Operation 770 may be performed by the adjustment module 330. Operation 770 may be the same as or similar to operations 530 and 660, the descriptions of which are not repeated herein.
It is noted that the above descriptions of process 700 are only for demonstration purposes, and not intended to be limiting. It is understandable that, after learning the major concept of the present disclosure, a person of ordinary skill in the art may alter process 700 in an uncreative manner. For example, the operations above may be implemented in an order different from that illustrated in FIG. 7. One or more optional operations may be added to the flowcharts. One or more operations may be split or be combined. All such modifications are within the scope of the present disclosure.
In summary, an image processing method provided in embodiments of the present disclosure may include: determining a plurality of peripheral lines of an object to be recognized from a source image based on one or more features of the object to be recognized; determining actual coordinates of vertices of the object to be recognized in the source image based on the plurality of peripheral lines; determining a standard size of the object to be recognized based on the plurality of peripheral lines; determining standard coordinates of the vertices, wherein the standard size and the standard coordinates are determined based on the image that is finally desired; obtaining a correction matrix based on the actual coordinates and the standard coordinates; and correcting the source image using the correction matrix to obtain a corrected image. The above process may automatically correct the object to be recognized in the source image. The high-quality image obtained via the correction may facilitate the subsequent image processing and recognition, and be more compatible with various application scenarios, such as OCR, identity and/or qualification verification via a network (e.g., the network 120), etc.
In some embodiments, the source image may be used for identity and/or qualification verification via a network. For example, the source image may be required for receiving or providing a service via the online-to-offline system 100. Correspondingly, the object to be recognized in the source image may be a document such as an ID card, a license (e.g., a driving license), a passport, a bank card, etc. Such a document may generally have a unified format, which may serve as a template (second representation) for object recognition and/or image processing (or correction). FIG. 13 is a schematic diagram illustrating an exemplary template according to some embodiments of the present disclosure. In some embodiments, the platform implemented by the online-to-offline system 100 may require a user (e.g., a transportation service provider) to upload an image (source image) of his/her driving license, which may have a unified template. For example, in the People's Republic of China, the driving licenses may have a unified template as illustrated in FIG. 13. The platform may require that the image uploaded by the user is in accordance with such a template. For example, the uploaded image may only include the image part of the driving license, and the driving license is in the correct direction with respect to the uploaded image, which is in line with the direction in which the user views the image. In some embodiments, there may be multiple templates for, e.g., different types of documents, the same type of documents in different districts, the same type of documents distributed by different organizations, etc.
The following part of the present disclosure may be described by way of example in connection with the recognition and correction of images including driving licenses. It is noted that, images including other types of documents may also be processed using the provided processes.
In many cases, when the user takes an image of a document through a terminal (e.g., the requester terminal 130 and the provider terminal 140) and uploads the image through an application of the terminal, due to the type of the terminal, the shooting environment, the setting of the application, the shooting direction, and/or other possible factors, the document in the image may have a certain rotation angle with respect to the image. In addition, besides the image part of the document, the image may also include a complex background.
For example, FIGs. 14 to 17 are schematic diagrams illustrating images to be processed (also referred to as an uploaded image, a source image, or a first image) in different situations according to some embodiments of the present disclosure. Taking the driving license as an example, when the user is uploading an image of a driving license, possible images actually uploaded by the user in different situations are illustrated in FIGs. 14 to 17, respectively. FIG. 14 is a schematic diagram illustrating an image to be processed in Situation 1 according to some embodiments of the present disclosure. FIG. 15 is a schematic diagram illustrating an image to be processed in Situation 2 according to some embodiments of the present disclosure. FIG. 16 is a schematic diagram illustrating an image to be processed in Situation 3 according to some embodiments of the present disclosure. FIG. 17 is a schematic diagram illustrating an image to be processed in Situation 4 according to some embodiments of the present disclosure. In Situation 1 as shown in FIG. 14, the image to be processed may include the driving license with the correct direction, but there may be a remarkable ratio of background content surrounding the driving license. Usually, the background content may be or include one or more attached pages of the driving license or the shooting background. In Situation 2 as shown in FIG. 15, the driving license in the image to be processed may be rotated 90 degrees clockwise with respect to the correct direction. In Situation 3 as shown in FIG. 16, the driving license in the image to be processed may be rotated 180 degrees clockwise with respect to the correct direction. In Situation 4 as shown in FIG. 17, the driving license in the image to be processed may be rotated 90 degrees anticlockwise with respect to the correct direction.
Therefore, when a user uploads an image of a document, the direction of the document (e.g., a driving license) with respect to the uploaded image is often not correct, and the above direction differences may occur. When the platform verifies the uploaded image, the uploaded image may need to be adjusted if the document in the uploaded image has a different rotation angle, resulting in a low image processing efficiency. Therefore, how to process an image uploaded by a user to adjust the direction of the document therein to the correct direction is a technical problem to be solved.
Therefore, embodiments of the present disclosure provide an image processing process and apparatus for processing an image uploaded by a user to adjust the direction of the image to a correct direction, thereby improving the image processing efficiency. FIG. 18 is a flowchart illustrating an exemplary process 1800 for processing an image according to some embodiments of the present disclosure. Process 1800 may be an example of process 400 illustrated in FIG. 4. Process 1800 may be implemented by the image processing device 300 for correcting an image (first image) taken by a mobile computing device (e.g., the requester terminal 130, the provider terminal 140). For example, the process 1800 illustrated in FIG. 18 may be stored in a storage device (e.g., the storage device 150, the storage device 220, the ROM 230, the RAM 240) in the form of instructions, and invoked and/or executed by one or more processors (e.g., the processor 210) of the image processing device 300.
In 1810, positions (first positions) of at least three targets in an image to be processed may be obtained. The image to be processed may include a first image part representing a first document (i.e., an object to be recognized). The at least three targets may be parts of the first document, and may have fixed positions and the same or similar features in different first documents. The image to be processed may be the first image obtained via operation 410 of process 400 in FIG. 4. In some embodiments, operation 1810 may be performed by the first position module 320.
The execution body of embodiments of the present disclosure may be an electronic device having a data processing function such as a server (e.g., the server 110) , a terminal (e.g., the requester terminal 130, the provider terminal 140) , or a workstation. In some embodiments, in 1810, a server may obtain the image to be  processed, and obtain positions of the at least three targets in the image to be processed. The positions of the at least three targets may be positions (e.g., in the form of coordinates) of the at least three targets with respect to the image to be processed. Alternatively, the positions of the at least three targets may be relative positions of the at least three specified targets with respect to one specific target of the at least three targets.
In some embodiments, the image to be processed may be an image including a document. For example, the image to be processed may include a first image part representing a first document. The first document may be any national legal document such as an ID card, a qualification certificate, a kinship document, a functioning document. The at least three targets may have fixed positions and the same or similar features in different first documents. For example, when the first document is a driving license as shown in FIG. 13, as the image of the owner (e.g., driver) in each driving license may have a fixed position and a similar feature (e.g., features related to a human face) , the image of the owner may be used as one of the at least three targets. The feature described herein may be adjusted based on one or more features extracted from the image to be processed according to different image processing technologies. For example, the feature may include a size, a color, and a content of the image.
Taking the processing of an image of a driving license as an example, the obtaining of the positions of the at least three targets is described in detail as follows. FIG. 19 is a schematic diagram illustrating an enlarged view of the image to be processed in FIG. 14.
As shown in FIG. 19, the image to be processed 1900 may include a first part 1901 and a second part 1902. The first part 1901 may be the image part representing a driving license and the second part 1902 may be the background part other than the image part of the driving license. The at least three targets of the driving license may include at least: the title 1911 of the driving license, the stamp 1912 (e.g., a red one), and the image of the owner 1913. Then, after the server obtains the image to be processed 1900, the positions of the three targets (the title 1911, the stamp 1912, and the image of the owner 1913) may be obtained. The positions of the targets may be designated by labeling corresponding pixels of the image to be processed 1900. For example, the image to be processed 1900 may be in the form of a pixel matrix having a size of 600*400. The position of the title 1911 may be represented by coordinates of pixels at the four vertices of the title 1911, such as {(100, 100), (500, 100), (100, 500), (500, 500)}. The positions of the stamp 1912 and the image of the owner 1913 may be represented similarly. The position of a target may also be represented by any other labeled pixel (e.g., a pixel at the center of the target) of the image to be processed 1900. It is noted that the above descriptions of the representation of the position of a target are only for demonstration purposes, and the positions (or relative positions) of the at least three targets may also be represented in other ways.
In some embodiments, in 1810, the positions of the at least three targets in the image to be processed may be obtained using a Convolutional Neural Network (CNN) model. For example, the CNN model may be pre-trained using a plurality of samples. The plurality of samples may be images to be processed with manually labeled targets. The at least three targets in each of the images to be processed may have known positions. The trained CNN model may take an image to be processed as at least part of its input, and determine the positions of the at least three targets in the image to be processed as its output.
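Merely by way of example, the following minimal Python sketch illustrates inference with a detector of this kind, assuming a torchvision Faster R-CNN has been fine-tuned on the manually labeled samples with three target classes plus background; the weights file name, input file name, score threshold, and label mapping are hypothetical:

import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=None, num_classes=4)  # three targets + background
model.load_state_dict(torch.load("targets.pth"))  # hypothetical fine-tuned weights
model.eval()

image = to_tensor(Image.open("uploaded.jpg").convert("RGB"))  # hypothetical file
with torch.no_grad():
    (detections,) = model([image])  # boxes as (x1, y1, x2, y2) in pixels

LABELS = {1: "title", 2: "stamp", 3: "face"}  # hypothetical label mapping
positions = {LABELS[int(label)]: box.tolist()
             for box, label, score in zip(detections["boxes"],
                                          detections["labels"],
                                          detections["scores"])
             if score > 0.7}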
In some embodiments, in the performance test of the CNN model, 25k samples are manually labeled. In each sample, positions of at least three targets are labeled. In the 25k samples, 20k samples are used as training samples and 5k  samples are used as a test dataset. The performance of the finally obtained CNN model is: when IOU=0.7, mAP=0.98, wherein IOU and mAP are abbreviations of Intersection over Union and mean Average Precision, respectively.
In some embodiments, the image processing process provided may obtain positions of at least three targets from the image to be processed. As the positions of the at least three targets in the first document of the image to be processed are relatively fixed, in 1810, the positions of the at least three targets in the image to be processed may be obtained from a certain region of the first part of the image to be processed. For example, in the image to be processed 1900 as shown in FIG. 19, the targets may include: the title 1911, the stamp 1912, and the image of the owner 1913. Since the positions of the above three targets for each driving license are fixed, after obtaining the image to be processed, the positions of the targets may be obtained from the first part of the image to be processed. FIG. 20 is a schematic diagram illustrating an exemplary pattern for obtaining positions of targets according to some embodiments of the present disclosure. FIG. 20 illustrates an outline of a first part 2000 of the image to be processed. After the first part of the image to be processed is extracted via an image processing technology, the position of the title 1911 may be determined in region 2010 or 2020, and the position of the stamp 1912 and the position of the image of the owner 1913 may be determined in regions 2030 and 2040, respectively. Thus the positions of the targets may be obtained by processing local image regions instead of the whole image, thereby reducing the computation burden and improving the computation efficiency.
In 1820, a rotation mode of the image to be processed may be determined based on the positions of the at least three targets. In some embodiments, the direction of the first part in the image to be processed may be determined according to the positions of the at least three targets obtained in operation 1810. Once the direction of the first part of the image to be processed is determined, the direction of the image to be processed may be determined as well. The rotation mode for rotating the image to be processed may then be determined based on the direction of the image to be processed. For example, in Situation 2 as shown in FIG. 15, the driving license in the image to be processed is rotated 90 degrees clockwise with respect to the correct direction, and the rotation mode of the image to be processed may be represented as a to-rotate-90-degree-anticlockwise mode, or a rotated-90-degree-clockwise mode. Operation 1820 may be performed by the adjustment module 330.
In some embodiments, operation 1820 may be achieved by: determining, based on a mapping relationship between positions (first positions) of the at least three targets in the image to be processed and positions (second positions) of the at least three targets in a reference image (or template, second representation) , a rotation mode of the image to be processed. For example, by taking the image shown in FIG. 13 as a reference image, to process the image to be processed illustrated in FIG. 15, the positions of the at least three targets in FIG. 15 may be compared with the positions of the at least three targets in FIG. 13 to determine the rotation mode of the image to be processed.
A specific embodiment of operation 1820 is described below by taking the processing of an image of a driving license as an example. For the directions of the driving licenses in the four images to be processed as shown in FIGs. 14 to 17, the patterns for extracting the positions of the at least three targets in the four image to be processed are illustrated in FIGs. 21 to 24. FIG. 21 is a schematic diagram illustrating positions of targets in Situation 1 of FIG. 14 according to some embodiments of the present disclosure. FIG. 22 is a schematic diagram illustrating positions of targets in Situation 2 of FIG. 15 according to some embodiments of the present disclosure. FIG. 23 is a schematic diagram illustrating positions of targets in Situation 3 of FIG. 16  according to some embodiments of the present disclosure. FIG. 24 is a schematic diagram illustrating positions of targets in Situation 4 of FIG. 17 according to some embodiments of the present disclosure. Four rotation modes may be set for the positions of the three targets illustrated in FIGs. 25 to 28. FIGs. 25 to 28 may correspond to FIGs 21 to 24, respectively. FIG. 25 is a schematic diagram illustrating positions of targets in Situation 1 of FIG. 14 according to some embodiments of the present disclosure. FIG. 26 is a schematic diagram illustrating positions of targets in Situation 2 of FIG. 15 according to some embodiments of the present disclosure. FIG. 27 is a schematic diagram illustrating positions of targets in Situation 3 of FIG. 16 according to some embodiments of the present disclosure. FIG. 28 is a schematic diagram illustrating positions of targets in Situation 4 of FIG. 17 according to some embodiments of the present disclosure. For a driving license, the corresponding targets may include: the title, the stamp, and the image of the owner (e.g., a face image) . The positions of the rectangular frames of the three targets determined according to operation 1820 may be face_rect, stamp_rect, and title_rect. Then the four rotation modes may include:
1. a normal mode: as shown in FIG. 14, the title of the document is in the top part of the image to be processed, the image of the owner is in the right part of the image to be processed, and the stamp is in the left part of the image to be processed. In such a situation, the document may have the correct direction, and the image to be processed does not need to be rotated. The normal mode may also be labeled as rotate 0;
2. a rotated-90-degree-clockwise mode: as shown in FIG. 15, the title of the document is in the right part of the image to be processed, the image of the owner is in the bottom part of the image to be processed, and the stamp is in the top part of the image to be processed. In such a situation, the document under the normal mode may be obtained by rotating the image to be processed 90 degrees anticlockwise. The rotated-90-degree-clockwise mode may also be labeled as rotate -90;
3. a rotated-90-degree-anticlockwise mode: as shown in FIG. 16, the title of the document is in the left part of the image to be processed, the image of the owner is in the top part of the image to be processed, and the stamp is in the bottom part of the image to be processed. In such a situation, the document under the normal mode may be obtained by rotating the image to be processed 90 degrees clockwise. The rotated-90-degree-anticlockwise mode may also be labeled as rotate 90;
4. a rotated-180-degree-clockwise mode: as shown in FIG. 17, the title of the document is in the bottom part of the image to be processed, the image of the owner is in the left part of the image to be processed, and the stamp is in the right part of the image to be processed. In such a situation, the document under the normal mode may be obtained by rotating the image to be processed 180 degrees anticlockwise. The rotated-180-degree-clockwise mode may also be labeled as rotate -180.
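The classification above reduces to comparing the relative positions of the detected rectangles. The following Python sketch is illustrative only and not the patented implementation; the (left, top, right, bottom) rectangle format and the function names are assumptions. It labels an image with one of the four rotation modes using the centers of face_rect and stamp_rect (title_rect may additionally serve as a consistency check):

def rect_center(rect):
    """Center of a rectangle given as a (left, top, right, bottom) tuple."""
    left, top, right, bottom = rect
    return ((left + right) / 2.0, (top + bottom) / 2.0)

def classify_rotation_mode(face_rect, stamp_rect):
    """Return 'rotate 0', 'rotate -90', 'rotate 90', or 'rotate -180'.

    Image coordinates: x grows rightward, y grows downward.
    """
    fx, fy = rect_center(face_rect)
    sx, sy = rect_center(stamp_rect)
    if abs(fx - sx) >= abs(fy - sy):
        # Face and stamp are arranged side by side (Situations 1 and 4).
        return "rotate 0" if sx < fx else "rotate -180"
    # Face and stamp are stacked vertically (Situations 2 and 3).
    return "rotate -90" if sy < fy else "rotate 90"

For example, in Situation 2 (FIG. 15) the stamp lies above the image of the owner, so sy < fy and the sketch returns "rotate -90", matching the rotated-90-degree-clockwise mode described above.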
Corresponding to the above four rotation modes, one of the following four operations may be performed. In some specific embodiments, the (initially unknown) rotation mode of an image to be processed may be labeled as rotate_mode, and the image part (also referred to as a license part) of the driving license to be determined may be labeled as driver_rect. The process for locating the license part driver_rect may include one of the following operations (a sketch implementing Formulas (5) to (20) follows the list):
1. When the rotation mode rotate_mode is determined as the rotation mode rotate 0, the left side of the stamp stamp_rect may be designated as the left side of the license part driver_rect, the right side of the image of the owner face_rect may be designated as the right side of the license part driver_rect, the top side of the title title_rect may be designated as the top side of the license part driver_rect, and the bottom side of the image of the owner face_rect may be designated as the bottom side of the license part driver_rect. The corresponding formulas may be expressed as follows:
driver_rect (left) = stamp_rect (left) ,    Formula (5)
driver_rect (right) = face_rect (right) ,    Formula (6)
driver_rect (top) = title_rect (top) ,    Formula (7)
driver_rect (bottom) = face_rect (bottom) .   Formula (8)
2. When the rotation mode rotate_mode is determined as the rotation mode rotate -90, the left side of the image of the owner face_rect may be designated as the left side of the license part driver_rect, the right side of the title title_rect may be designated as the right side of the license part driver_rect, the top side of the stamp stamp_rect may be designated as the top side of the license part driver_rect, and the bottom side of the image of the owner face_rect may be designated as the bottom side of the license part driver_rect. The corresponding formulas may be expressed as follows:
driver_rect (left) = face_rect (left) ,    Formula (9)
driver_rect (right) = title_rect (right) ,    Formula (10)
driver_rect (top) = stamp_rect (top) ,    Formula (11)
driver_rect (bottom) = face_rect (bottom) .   Formula (12)
3. When the rotation mode rotate_mode is determined as the rotation mode rotate 90, the left side of the title title_rect may be designated as the left side of the license part driver_rect, the right side of the image of the owner face_rect may be designated as the right side of the license part driver_rect, the top side of the image of the owner face_rect may be designated as the top side of the license part driver_rect, and the bottom side of the stamp stamp_rect may be designated as the bottom side of the license part driver_rect. The corresponding formulas may be expressed as follows:
driver_rect (left) = title_rect (left) ,    Formula (13)
driver_rect (right) = face_rect (right) ,    Formula (14)
driver_rect (top) = face_rect (top) ,    Formula (15)
driver_rect (bottom) = stamp_rect (bottom) .   Formula (16)
4. When the rotation mode rotate_mode is determined as the rotation mode rotate -180, the left side of the image of the owner face_rect may be designated as the left side of the license part driver_rect, the right side of the stamp stamp_rect may be designated as the right side of the license part driver_rect, the top side of the image of the owner face_rect may be designated as the top side of the license part driver_rect, and the bottom side of the title title_rect may be designated as the bottom side of the license part driver_rect. The corresponding formulas may be expressed as follows:
driver_rect (left) = face_rect (left) ,    Formula (17)
driver_rect (right) = stamp_rect (right) ,    Formula (18)
driver_rect (top) = face_rect (top) ,    Formula (19)
driver_rect (bottom) = title_rect (bottom) .   Formula (20)
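Under the same assumptions as the sketch above (rectangles as (left, top, right, bottom) tuples), Formulas (5) to (20) collapse into a single lookup; the function name is hypothetical:

def locate_license_part(rotate_mode, face_rect, stamp_rect, title_rect):
    """Assemble driver_rect from the three target rectangles.

    Each rectangle is a (left, top, right, bottom) tuple.
    """
    f, s, t = face_rect, stamp_rect, title_rect
    if rotate_mode == "rotate 0":      # Formulas (5)-(8)
        return (s[0], t[1], f[2], f[3])
    if rotate_mode == "rotate -90":    # Formulas (9)-(12)
        return (f[0], s[1], t[2], f[3])
    if rotate_mode == "rotate 90":     # Formulas (13)-(16)
        return (t[0], f[1], f[2], s[3])
    if rotate_mode == "rotate -180":   # Formulas (17)-(20)
        return (f[0], f[1], s[2], t[3])
    raise ValueError("unknown rotation mode: %s" % rotate_mode)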
In 1830, the image to be processed may be rotated according to the rotation mode. In some embodiments, the image to be processed may be rotated by the adjustment module 330 according to the rotation mode determined in operation 1820.
Continuing the above example, a license part may be obtained in operation 1820. The license part may have four rotation modes, and a rotation correction may be performed on the image to be processed to obtain a license part having the normal mode, so as to facilitate subsequent text detection and/or recognition. In some embodiments, the rotation correction may include one of the following operations (a minimal sketch follows the list):
1. When the rotation mode rotate_mode is determined as the rotation mode rotate 0, then the license part is not to be rotated.
2. When the rotation mode rotate_mode is determined as the rotation mode rotate -90, then the license part may be rotated 90 degrees anticlockwise.
3. When the rotation mode rotate_mode is determined as the rotation mode rotate 90, then the license part may be rotated 90 degrees clockwise.
4. When the rotation mode rotate_mode is determined as the rotation mode rotate -180, then the license part may be rotated 180 degrees anticlockwise.
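These four cases map directly onto quarter-turn rotations of the pixel array. A minimal NumPy sketch, an assumption of this description rather than the patented implementation:

import numpy as np

def correct_rotation(license_part, rotate_mode):
    """Rotate the license part back to the normal mode.

    np.rot90 rotates anticlockwise by k quarter turns.
    """
    if rotate_mode == "rotate -90":    # needs 90 degrees anticlockwise
        return np.rot90(license_part, k=1)
    if rotate_mode == "rotate 90":     # needs 90 degrees clockwise
        return np.rot90(license_part, k=-1)
    if rotate_mode == "rotate -180":   # needs 180 degrees
        return np.rot90(license_part, k=2)
    return license_part                # rotate 0: no rotation needed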
In some embodiments, in the above image processing method, before operation 1810 or after operation 1830, process 1800 may further include: cropping (e.g., by the adjustment module 330) the image to be processed or the processed image, so that only the first part is included in the image to be processed or the processed image. Taking the image to be processed 1900 shown in FIG. 19 as an example, the second part 1902 may be cropped out, leaving only the first part 1901 in the image to be processed 1900. Then the image to be processed 1900 may be processed to have the form of the reference image (or template) as shown in FIG. 13.
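Assuming the image is held as a NumPy array and driver_rect was located as above, the cropping step reduces to array slicing; the function name is illustrative:

def crop_to_document(image, driver_rect):
    """Keep only the document part; driver_rect is (left, top, right, bottom)."""
    left, top, right, bottom = driver_rect
    return image[top:bottom, left:right]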
In summary, the process for image processing provided in embodiments of the present disclosure may include: obtaining positions of at least three targets in an image to be processed, wherein the image to be processed may include a first image part representing a first document, and the at least three targets may be parts of the first document having fixed positions in the first document and sharing the same or similar features; determining a rotation mode of the image to be processed based on the positions of the at least three targets; and rotating the image to be processed according to the rotation mode. Such a process may be used to automatically process images uploaded by users and correct their orientations to the correct direction, so as to improve the image processing efficiency.
FIG. 29 is a schematic diagram illustrating an image to be processed in Situation 5 according to some embodiments of the present disclosure. FIG. 29 illustrates another possible angle of inclination of the image uploaded by a user. The rectangular first part of the image to be processed, which includes the document, may be neither parallel nor perpendicular to the sides of the image to be processed. By comparing positions of at least three targets in the image to be processed and positions of the at least three targets in the reference image, a rotation angle for rotating the image to be processed may be determined, and the image to be processed may be rotated according to the determined rotation angle.
FIG. 30 is a schematic diagram illustrating a pattern for processing the image to be processed in Situation 5 of FIG. 29 according to some embodiments of the present disclosure. In the pattern illustrated in FIG. 30, taking the processing of an image of a driving license as an example, after the positions of the three targets in the driving license are determined with the aforementioned process, a first line 3011 may be obtained by connecting the centers of the rectangular frames of the stamp and the image of the owner in the image to be processed. The first line 3011 may be compared to the direction A of a second line 3012 formed by connecting the centers of the rectangular frames of the stamp and the image of the owner in the reference image, so as to obtain an angle α between the directions of the image to be processed and the reference image. The image to be processed may then be rotated according to the angle α.
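A possible realization of this angle comparison, sketched under the assumption that OpenCV is used (the function names are illustrative; in cv2.getRotationMatrix2D, positive angles rotate anticlockwise):

import math

import cv2

def inclination_angle(stamp_center, face_center, ref_stamp_center, ref_face_center):
    """Angle (degrees) between the first line 3011 (in the image) and the
    second line 3012 (in the reference), each connecting the stamp center
    to the owner-image center."""
    line = math.atan2(face_center[1] - stamp_center[1],
                      face_center[0] - stamp_center[0])
    ref = math.atan2(ref_face_center[1] - ref_stamp_center[1],
                     ref_face_center[0] - ref_stamp_center[0])
    return math.degrees(line - ref)

def rotate_by_angle(image, angle_deg):
    """Rotate the image about its center by angle_deg."""
    h, w = image.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle_deg, 1.0)
    return cv2.warpAffine(image, m, (w, h))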
In some embodiments, the processing of an image to be processed in Situation 5 may also be performed using a correction matrix. For example, the adjustment module 330 may obtain a correction matrix based on a mapping relationship between the positions of the three targets in the driving license and the positions of the three targets in the reference image. Then the adjustment module 330 may apply the correction matrix to the image to be processed or the first part to obtain a corrected image.
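Since three point correspondences determine an affine transform, the correction matrix may be obtained directly from the target centers. A sketch assuming OpenCV; the function and parameter names are illustrative:

import cv2
import numpy as np

def correct_with_matrix(image, image_points, reference_points, output_size):
    """Warp the image so the three target centers land on their reference positions.

    image_points, reference_points: three (x, y) pairs, e.g. the centers of
    the face, stamp, and title rectangles; output_size: (width, height).
    """
    src = np.float32(image_points)
    dst = np.float32(reference_points)
    correction = cv2.getAffineTransform(src, dst)  # 2x3 correction matrix
    return cv2.warpAffine(image, correction, output_size)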
Having thus described the basic concepts, it may be rather apparent to those skilled in the art after reading this detailed disclosure that the foregoing detailed disclosure is intended to be presented by way of example only and is not limiting. Various alterations, improvements, and modifications may occur to and are intended for those skilled in the art, though not expressly stated herein. These alterations, improvements, and modifications are intended to be suggested by this disclosure, and are within the spirit and scope of the exemplary embodiments of this disclosure.
Moreover, certain terminology has been used to describe embodiments of the present disclosure. For example, the terms “one embodiment, ” “an embodiment, ” and/or “some embodiments” mean that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present disclosure. Therefore, it should be emphasized and appreciated that two or more references to “an embodiment” or “one embodiment” or “an alternative embodiment” in various portions of this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined as suitable in one or more embodiments of the present disclosure.
Further, it will be appreciated by one skilled in the art that aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts, including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc.), or in a combination of software and hardware that may all generally be referred to herein as a “unit, ” “module, ” or “system. ” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB. NET, Python, or the like, conventional procedural programming languages, such as the “C” programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby, and Groovy, or other programming languages. The program code may execute entirely on the user’s computer, partly on the user’s computer, as a stand-alone software package, partly on the user’s computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user’s computer through any type of network, including a local area network (LAN) or a wide area network (WAN) , or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) , in a cloud computing environment, or offered as a service such as a Software as a Service (SaaS) .
Furthermore, the recited order of processing elements or sequences, or the use of numbers, letters, or other designations therefor, is not intended to limit the claimed processes and methods to any order except as may be specified in the claims. Although the above disclosure discusses through various examples what may be currently considered to be a variety of useful embodiments of the disclosure, it is to be understood that such detail is solely for that purpose, and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover modifications and equivalent arrangements that are within the spirit and scope of the disclosed embodiments. For example, although the implementation of various components described above may be embodied in a hardware device, it may also be implemented as a software-only solution, for example, an installation on an existing server or mobile device.
Similarly, it should be appreciated that in the foregoing description of embodiments of the present disclosure, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive embodiments. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed subject matter requires more features than are expressly recited in each claim. Rather, inventive embodiments lie in less than all features of a single foregoing disclosed embodiment.
In some embodiments, the numbers expressing quantities or properties used to describe and claim certain embodiments of the application are to be understood as being modified in some instances by the term “about, ” “approximate, ” or “substantially. ” For example, “about, ” “approximate, ” or “substantially” may indicate a ±20% variation of the value it describes, unless otherwise stated. Accordingly, in some embodiments, the numerical parameters set forth in the written description and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the application are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable.
Each of the patents, patent applications, publications of patent applications, and other material, such as articles, books, specifications, publications, documents, things, and/or the like, referenced herein is hereby incorporated herein by this reference in its entirety for all purposes, excepting any prosecution file history associated with same, any of same that may be inconsistent with or in conflict with the present document, or any of same that may have a limiting effect as to the broadest scope of the claims now or later associated with the present document. By way of example, should there be any inconsistency or conflict between the description, definition, and/or the use of a term associated with any of the incorporated material and that associated with the present document, the description, definition, and/or the use of the term in the present document shall prevail.
In closing, it is to be understood that the embodiments of the application disclosed herein are illustrative of the principles of the embodiments of the application. Other modifications that may be employed may be within the scope of the application. Thus, by way of example, but not of limitation, alternative configurations of the embodiments of the application may be utilized in accordance with the teachings herein. Accordingly, embodiments of the present application are not limited to those precisely as shown and described.

Claims (34)

  1. A system for image processing, comprising:
    at least one storage medium including a set of instructions; and
    at least one processor in communication with the at least one storage medium, wherein when executing the set of instructions, the at least one processor is directed to:
    obtain a first image including an object in a first representation;
    determine at least one first position of the object in the first image; and
    generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object, wherein the second image includes the object in the second representation.
  2. The system of claim 1, wherein:
    the second representation is related to at least one of a reference size of the object, a reference image occupation ratio of the object, or a reference direction with respect to the object; and
    the first representation is related to at least one of an original size, an original image occupation ratio, or an original direction of the object.
  3. The system of claim 1 or claim 2, wherein to determine the at least one first position, the at least one processor is directed to:
    detect a plurality of peripheral lines of the object in the first image; and
    determine the at least one first position based on the plurality of peripheral lines.
  4. The system of claim 3, wherein to adjust the first image based on the at least one first position and the second representation with respect to the object, the at least one processor is directed to:
    determine, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position;
    obtain a correction matrix based on the at least one first position and the at least one second position; and
    apply the correction matrix on the first image.
  5. The system of claim 3 or claim 4, wherein to detect the plurality of peripheral lines of the object, the at least one processor is further directed to:
    detect a plurality of line segments associated with the object in the first image by using a line segment detector to process the first image;
    filter the plurality of line segments to obtain a plurality of filtered line segments, wherein the filtering is based at least in part on directions of the plurality of line segments; and
    determine the plurality of peripheral lines based on the plurality of filtered line segments.
  6. The system of claim 5, wherein the filtering is further based on confidence scores of the plurality of line segments.
  7. The system of claim 5 or claim 6, wherein the at least one processor is further directed to:
    identify, from the plurality of line segments, line segments along a same straight line; and
    update the plurality of line segments by combining the line segments identified as being along a same straight line.
  8. The system of any one of claims 5 to 7, wherein:
    the plurality of filtered line segments includes a plurality of line segment sets corresponding to the plurality of peripheral lines;
    the first image includes a plurality of predetermined parts corresponding to the plurality of peripheral lines;
    to filter the plurality of line segments to obtain the plurality of filtered line segments, the at least one processor is directed to, for each of the plurality of predetermined parts:
    select, from the plurality of line segments, a set of line segments in the predetermined part as one of the plurality of line segment sets, wherein the direction of each of the set of line segments is within a preset range associated with the predetermined part;
    and
    to determine the plurality of peripheral lines based on the plurality of filtered line segments, the at least one processor is directed to, for each of the plurality of line segment sets:
    identify a longest line segment of the line segment set as the corresponding peripheral line of the object.
  9. The system of any one of claims 3 to 8, wherein:
    the at least one first position includes positions of one or more vertices of the object in the first image, and
    to determine the at least one first position based on the plurality of peripheral lines,  the at least one processor is further directed to:
    determine an intersection of each adjacent pair of the plurality of peripheral lines as the one or more vertices.
  10. The system of any one of claims 4 to 9, wherein to determine, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position, the at least one processor is directed to:
    determine a first size of the object in the first image based on the plurality of peripheral lines;
    determine a reference size of the object based on the first size and the second representation; and
    determine the at least one second position based on the reference size.
  11. The system of claim 1, wherein the at least one first position corresponds to at least one part of the object, and
    to determine the at least one first position, the at least one processor is directed to:
    recognize the at least one part of the object in the first image using an object recognition technique.
  12. The system of claim 11, wherein to adjust the first image based on the at least one first position and the second representation with respect to the object, the at least one processor is further directed to:
    determine a rotation mode based on the at least one first position and the second representation; and
    rotate the first image according to the rotation mode.
  13. The system of claim 12, wherein to determine the rotation mode based on the at least one first position and the second representation, the at least one processor is directed to:
    determine at least one second position corresponding to the at least one first position based on the second representation with respect to the object; and
    determine the rotation mode based on a mapping relationship between the at least one first position and the at least one second position.
  14. The system of any one of claims 11 to 13, wherein the object recognition technique is based on a convolutional neural network (CNN) model.
  15. The system of any one of claims 11 to 14, wherein the object is a document, and the at least one part of the object includes at least a first part including a title of the document, a second part including an image of an owner, and a third part including a stamp, a signature, a district identifier, another image of the owner, or a bar code.
  16. The system of any one of claims 1 to 15, wherein the at least one processor is further directed to:
    crop the second image or the first image so that the second image or the first image only includes the object.
  17. A method for processing an image using a system that comprises:
    a storage medium storing instructions for image processing; and
    a processor in communication with the storage medium to execute the instructions  stored therein, the method comprising:
    obtaining a first image including an object in a first representation;
    determining at least one first position of the object in the first image; and
    generating a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object, wherein the second image includes the object in the second representation.
  18. The method of claim 17, wherein:
    the second representation is related to at least one of a reference size of the object, a reference image occupation ration of the object, or a reference direction with respect to the object; and
    the first representation is related to at least one of an original size, an original image occupation ration, or an original direction of the object.
  19. The method of claim 17 or claim 18, wherein the determining the at least one first position comprises:
    detecting a plurality of peripheral lines of the object in the first image; and
    determining the at least one first position based on the plurality of peripheral lines.
  20. The method of claim 19, wherein the adjusting the first image based on the at least one first position and the second representation with respect to the object comprises:
    determining, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position;
    obtaining a correction matrix based on the at least one first position and the at least  one second position; and
    applying the correction matrix on the first image.
  21. The method of claim 19 or claim 20, wherein the detecting the plurality of peripheral lines of the object comprises:
    detecting a plurality of line segments associated with the object in the first image by using a line segment detector to process the first image;
    filtering the plurality of line segments to obtain a plurality of filtered line segments, wherein the filtering is based at least in part on directions of the plurality of line segments; and
    determining the plurality of peripheral lines based on the plurality of filtered line segments.
  22. The method of claim 21, wherein the filtering is further based on confidence scores of the plurality of line segments.
  23. The method of claim 21 or claim 22, further comprising:
    identifying, from the plurality of line segments, line segments along a same straight line; and
    updating the plurality of line segments by combining the line segments identified as being along a same straight line.
  24. The method of any one of claims 21 to 23, wherein:
    the plurality of filtered line segments includes a plurality of line segment sets corresponding to the plurality of peripheral lines;
    the first image includes a plurality of predetermined parts corresponding to the  plurality of peripheral lines;
    the filtering the plurality of line segments to obtain the plurality of filtered line segments comprises, for each of the plurality of predetermined parts:
    selecting, from the plurality of line segments, a set of line segments in the predetermined part as one of the plurality of line segment sets, wherein the direction of each of the set of line segments is within a preset range associated with the predetermined part;
    and
    the determining the plurality of peripheral lines based on the plurality of filtered line segments comprises, for each of the plurality of line segment sets:
    identifying a longest line segment of the line segment set as the corresponding peripheral line of the object.
  25. The method of any one of claims 19 to 24, wherein:
    the at least one first position includes positions of one or more vertices of the object in the first image, and
    the determining the at least one first position based on the plurality of peripheral lines comprises:
    determining an intersection of each adjacent pair of the plurality of peripheral lines as the one or more vertices.
  26. The method of any one of claims 20 to 25, wherein the determining, based at least in part on the second representation and the plurality of peripheral lines, at least one second position corresponding to the at least one first position comprises:
    determining a first size of the object in the first image based on the plurality of peripheral lines;
    determining a reference size of the object based on the first size and the second representation; and
    determining the at least one second position based on the reference size.
  27. The method of claim 17, wherein the at least one first position corresponds to at least one part of the object, and
    the determining the at least one first position comprises:
    recognizing the at least one part of the object in the first image using an object recognition technique.
  28. The method of claim 27, wherein the adjusting the first image based on the at least one first position and the second representation with respect to the object comprises:
    determining a rotation mode based on the at least one first position and the second representation; and
    rotating the first image according to the rotation mode.
  29. The method of claim 28, wherein the determining the rotation mode based on the at least one first position and the second representation comprises:
    determining at least one second position corresponding to the at least one first position based on the second representation with respect to the object; and
    determining the rotation mode based on a mapping relationship between the at least one first position and the at least one second position.
  30. The method of any one of claims 27 to 29, wherein the object recognition technique is based on a convolutional neural network (CNN) model.
  31. The method of any one of claims 27 to 30, wherein the object is a document, and the at least one part of the object includes at least a first part including a title of the document, a second part including an image of an owner, and a third part including a stamp, a signature, a district identifier, another image of the owner, or a bar code.
  32. The method of any one of claims 27 to 31, further comprising:
    cropping the second image or the first image so that the second image or the first image only includes the object.
  33. A system for image processing, comprising:
    a first image module, configured to obtain a first image including an object in a first representation;
    a first position module, configured to determine at least one first position of the object in the first image; and
    an adjustment module, configured to generate a second image by adjusting the first image based on the at least one first position and a second representation with respect to the object, wherein the second image includes the object in the second representation.
  34. A non-transitory computer readable medium, comprising instructions for image processing, wherein when executed by a processor of an electronic device, the instructions direct the electronic device to execute an image processing process, comprising:
    obtaining a first image including an object in a first representation;
    determining at least one first position of the object in the first image; and
    generating a second image by adjusting the first image based on the at least one  first position and a second representation with respect to the object, wherein the second image includes the object in the second representation.