JP2021117988A

JP2021117988A - Positioning element detection method, positioning element detection device, electronic equipment, non-transitory computer-readable storage medium, and computer program

Info

Publication number: JP2021117988A
Application number: JP2020215097A
Authority: JP
Inventors: Bo Zhan; Shenghao Hu; Ye Tian; Gong Chen; Zhenyu Zhao; Chen He
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2020-01-21
Filing date: 2020-12-24
Publication date: 2021-08-10
Anticipated expiration: 2040-12-24
Also published as: JP7228559B2; KR102436300B1; KR20210094476A; EP3855351A1; US11335101B2; CN111274974B; CN111274974A; EP3855351B1; US20210224562A1

Abstract

PROBLEM TO BE SOLVED: To provide a positioning element detection method that can be applied to positioning a vehicle in scenes such as autonomous parking.
SOLUTION: A positioning element detection method includes: acquiring a surround view composite image of the surroundings of a vehicle; detecting the surround view composite image to determine at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel of the surround view composite image belongs; and matching and fusing the at least one positioning element using the semantic types to obtain a detection result for the positioning element.
SELECTED DRAWING: Fig. 1

Description

The present application relates to the field of visual positioning technology, in particular to target detection technology, and specifically to a positioning element detection method, a positioning element detection device, an electronic device, a non-transitory computer-readable storage medium, and a computer program.

Visual positioning systems are increasingly widely used in fields such as unmanned driving. The function of a visual positioning system is to solve for the position and attitude of an unmanned vehicle in real time based on information acquired by cameras, which is an important prerequisite for the autonomous movement of the unmanned vehicle.

Conventionally, there are mainly two methods by which a visual positioning system acquires visual information.
The first method is the visual SLAM scheme: environment perception is performed on camera images, key points are extracted from the images by conventional image algorithms, and the vehicle's own positioning information is calculated from the matching relationships of key points across multiple frames. However, this method is applicable only to static scenes with sufficient lighting and distinct environmental texture features, and the robustness of conventional visual feature detection is low, which makes it difficult to obtain stable, high-precision detection results.
The second method identifies manually customized positioning markers: by detecting a specific marker in a camera image, the precise three-dimensional position and orientation of the marker relative to the camera can be calculated quickly. The algorithm for this method is relatively simple to implement, but many specific markers must be customized and deployed in large numbers in the scene, and subsequent maintenance costs are high, so the method cannot serve as a mass-production scheme usable in any scene.

The present application provides a positioning element detection method, a positioning element detection device, an electronic device, a non-transitory computer-readable storage medium, and a computer program, which improve detection accuracy and robustness without requiring manually customized positioning markers.

A first aspect of the present application provides a positioning element detection method including: acquiring a surround view composite image of the surroundings of a vehicle; detecting the surround view composite image to determine at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel of the surround view composite image belongs; and matching and fusing the at least one positioning element using the semantic types to obtain a detection result for the positioning element.

This aspect has the following advantage or beneficial effect: the surround view composite image is taken as input, targets that naturally exist on the ground around the vehicle are detected as positioning elements, and the positioning elements are matched and fused based on the semantic information of the pixels. This avoids the limited field of view of an ordinary sensor, requires no modification of or deployment at the parking lot, and at the same time improves the accuracy and robustness of positioning element detection.

In the above aspect, the positioning element may include at least one of a parking space, a parking space number, a lane line, a ground arrow, a speed bump, and a sidewalk.

This has the advantage or beneficial effect that common ground markings are used as positioning elements. Compared with manually customized positioning markers, these natural positioning elements already exist, can serve as positioning elements without modifying the site, and play the same role as manually customized markers.

In the above aspect, the step of detecting the surround view composite image to determine the at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel of the surround view composite image belongs may include: detecting the surround view composite image using a pre-trained deep neural network model and performing semantic segmentation on each pixel of the surround view composite image, to determine information on the at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel of the surround view composite image belongs.

This has the advantage or beneficial effect that positioning elements in the surround view composite image are detected with a deep neural network model, realizing semantics-based feature point detection. This avoids the problem in the prior art that image feature points are unstable and easily affected by environmental factors, and gives better robustness.

In the above aspect, the information on the positioning element may include at least the type and position information of the positioning element and the type and position information of the key points of the positioning element.

In the above aspect, the step of matching and fusing the at least one positioning element using the semantic types may include: matching, using the position information of the key points together with the pixel positions of the pixels of the surround view composite image, the type of each key point against the semantic type of the pixel at the same position; and correcting the position information of the key points of each positioning element based on the matching result and a preset fusion policy.

This has the advantage or beneficial effect that the key point positions of the positioning elements are further corrected using the semantic segmentation result of the pixels, which improves the detection accuracy of the positioning elements. Moreover, for incomplete positioning elements, semantic segmentation is more robust.
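The matching and correction step can be illustrated with a minimal sketch. The patent leaves the fusion policy open, so the policy used here (keep the key point if the types agree, otherwise snap it to the nearest pixel of the same type inside a small window) is only an assumption, as are the label ids, the function name `correct_keypoint`, and the `radius` parameter.

```python
from typing import List, Optional, Tuple

# Hypothetical label ids; the source does not define a numbering.
BACKGROUND, PARKING_CORNER = 0, 1


def correct_keypoint(
    keypoint: Tuple[int, int],      # (row, col) predicted by the detector
    keypoint_type: int,             # type predicted for the key point
    semantic_map: List[List[int]],  # semantic type of every pixel
    radius: int = 2,                # assumed half-size of the search window
) -> Tuple[int, int]:
    """Return the key point, snapped to the nearest pixel of matching type."""
    row, col = keypoint
    if semantic_map[row][col] == keypoint_type:
        return keypoint  # types agree at the same position: keep as-is

    best: Optional[Tuple[int, int]] = None
    best_dist: Optional[int] = None
    rows, cols = len(semantic_map), len(semantic_map[0])
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            if semantic_map[r][c] == keypoint_type:
                dist = (r - row) ** 2 + (c - col) ** 2
                if best_dist is None or dist < best_dist:
                    best, best_dist = (r, c), dist
    # No pixel of the same type nearby: leave the key point unchanged.
    return best if best is not None else keypoint
```

Other policies, such as discarding key points that find no matching pixel, would fit the same interface.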

In the above aspect, the deep neural network model may include a positioning element detection branch and a key point detection branch. The positioning element detection branch performs target classification and key point position regression for the positioning elements, and the key point detection branch performs key point detection. Correspondingly, the position information of the key points of a positioning element may be determined by fusing the key points obtained by position regression with the key points obtained by key point detection.

This has the advantage or beneficial effect that, by adding a key point detection branch to the network in combination with key point detection technology, the regressed key point positions are accurately matched, and key point position information of higher accuracy is determined through fusion.
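One way to picture the fusion of the two branches is nearest-neighbour matching followed by a weighted average. The source does not specify how the regressed and detected key points are fused, so the matching rule, the `max_dist` threshold, and the `weight` parameter below are illustrative assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def fuse_keypoints(
    regressed: List[Point],   # key points from the position regression branch
    detected: List[Point],    # key points from the key point detection branch
    max_dist: float = 5.0,    # assumed matching radius in pixels
    weight: float = 0.5,      # assumed weight given to the regressed point
) -> List[Point]:
    """Blend each regressed key point with its nearest detected key point."""
    fused: List[Point] = []
    for rx, ry in regressed:
        nearest = min(
            detected,
            key=lambda p: math.hypot(p[0] - rx, p[1] - ry),
            default=None,
        )
        if nearest is not None and math.hypot(nearest[0] - rx, nearest[1] - ry) <= max_dist:
            fused.append((
                weight * rx + (1 - weight) * nearest[0],
                weight * ry + (1 - weight) * nearest[1],
            ))
        else:
            # No detected key point close enough: keep the regressed one.
            fused.append((rx, ry))
    return fused
```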

In the above aspect, the method may further include, before the step of obtaining the detection result for the positioning element and when the type of the positioning element is a parking space number: extracting a parking space number detection frame from the deep neural network model; calculating the included angle between the horizontal axis of the image coordinate system of the surround view composite image and the line connecting the two corner points, closest to the parking space number, of the parking space to which the parking space number detection frame belongs; rotating, based on the center point of the parking space number detection frame and the included angle, the feature map of the parking space number corresponding to the parking space number detection frame in the deep neural network model, so that the rotated parking space number is horizontal in the image coordinate system; and recognizing the parking space number from the rotated feature map using a character classifier.

This has the following advantage or beneficial effect: among ground positioning elements, the parking space number is very important information and is the only positioning element with a global ID. Therefore, by also detecting the parking space number as a positioning element and recognizing it, the absolute position of the vehicle on the map can be determined, which improves positioning accuracy.
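The geometry of the rotation step can be sketched as follows. This is only an illustration of the angle computation and of a rotation about the detection frame center; it rotates individual coordinates rather than resampling a feature map, and the function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in the image coordinate system


def included_angle(corner_a: Point, corner_b: Point) -> float:
    """Angle in radians between the line corner_a -> corner_b and the x axis."""
    return math.atan2(corner_b[1] - corner_a[1], corner_b[0] - corner_a[0])


def rotate_about(point: Point, center: Point, angle: float) -> Point:
    """Rotate point around center by -angle, so the corner line becomes horizontal."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    return (
        center[0] + dx * cos_a - dy * sin_a,
        center[1] + dx * sin_a + dy * cos_a,
    )
```

Applying `rotate_about` with the same center and angle to every sampling location of the cropped feature map would yield the horizontally aligned parking space number that the character classifier expects.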

In the above aspect, the method may further include: detecting a reflective area present in the surround view composite image using the deep neural network model; and, correspondingly, before the step of obtaining the detection result for the positioning element, filtering out, using the detection result for the reflective area, positioning elements whose position information falls within the reflective area.

This has the following advantage or beneficial effect: detection of positioning elements in reflective areas is inaccurate, so detecting the reflective area and using the result to filter the positioning elements further improves the accuracy of positioning element detection.
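A minimal sketch of the filtering step, assuming the reflective area is available as a boolean mask and each positioning element carries a single `(row, col)` position (both representations are assumptions, not taken from the source):

```python
from typing import Dict, List


def filter_reflective(
    elements: List[Dict],               # each has a "position": (row, col)
    reflection_mask: List[List[bool]],  # True where the ground is reflective
) -> List[Dict]:
    """Keep only the elements whose position lies outside the reflective area."""
    kept = []
    for element in elements:
        row, col = element["position"]
        if not reflection_mask[row][col]:
            kept.append(element)
    return kept
```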

In the above aspect, the step of acquiring the surround view composite image of the surroundings of the vehicle may include: acquiring the images collected by each of the fisheye cameras located around the vehicle; and stitching the images to obtain the surround view composite image.

This has the following advantage or beneficial effect: the surround view composite image is formed by stitching the images collected by four fisheye cameras around the vehicle, covers a 360-degree viewing angle around the vehicle body, and offers a wider field of view. If any one fisheye camera fails, the images from the other three can still be stitched and used for detection, so the detection function is not lost and robustness is high. In addition, the fisheye cameras on the vehicle body are mounted essentially facing the ground, so they capture good images of the ground around the vehicle body and are particularly suitable for detecting natural positioning elements on the ground. Furthermore, detection based on the surround view composite image removes the influence of fisheye distortion, intrinsic and extrinsic parameters, and mounting position, and therefore generalizes well.

A second aspect of the present application provides a positioning element detection device including: an image acquisition module that acquires a surround view composite image of the surroundings of a vehicle; a positioning element detection module that detects the surround view composite image to determine at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel of the surround view composite image belongs; and a matching and fusion module that matches and fuses the at least one positioning element using the semantic types to obtain a detection result for the positioning element.

A third aspect of the present application further provides an electronic device including: fisheye cameras located around the electronic device for collecting images; at least one processor; and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor performs the positioning element detection method described above.

A fourth aspect of the present application provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the positioning element detection method described above.
A fifth aspect of the present application further provides a computer program that, when run on a computer, causes the computer to perform the positioning element detection method described above.

The present application has the following advantages or beneficial effects. The surround view composite image is taken as input, targets that naturally exist on the ground around the vehicle are detected as positioning elements, and the positioning elements are matched and fused based on the semantic information of the pixels. This avoids the limited field of view of an ordinary sensor and requires no modification of or deployment at the parking lot. At the same time, positioning elements in the surround view composite image are detected with a deep neural network model, realizing semantics-based feature point detection; this avoids the problem in the prior art that image feature points are unstable and easily affected by environmental factors, and gives better robustness. In addition, the key point positions of the positioning elements are further corrected using the semantic segmentation result of the pixels, which further improves detection accuracy. Detecting the parking space number makes it possible to determine the absolute position of the vehicle and improve positioning accuracy. Finally, detecting reflective areas further improves the accuracy of positioning element detection.

Other effects of the above optional implementations are described below in conjunction with specific embodiments.

The drawings are provided for a better understanding of the present solution and do not limit the present application.
Fig. 1 is a schematic flowchart of a positioning element detection method according to a first embodiment of the present application.
Fig. 2 is a schematic flowchart of a positioning element detection method according to a second embodiment of the present application.
Fig. 3 is a schematic flowchart of a positioning element detection method according to a third embodiment of the present application.
Fig. 4 is a schematic structural diagram of a positioning element detection device according to a fourth embodiment of the present application.
Fig. 5 is a block diagram of an electronic device for implementing the positioning element detection method of the embodiments of the present application.

Exemplary embodiments of the present application are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those skilled in the art will recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present application. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.

Fig. 1 is a schematic flowchart of a positioning element detection method according to the first embodiment of the present application. This embodiment is applicable to detecting positioning elements in order to position a vehicle in scenes such as autonomous parking, for example autonomous parking in an indoor parking lot. The positioning element detection method can be executed by a positioning element detection device, which can be implemented in software and/or hardware and is preferably deployed in an electronic device such as an unmanned vehicle or an intelligent driving vehicle. As shown in Fig. 1, the method specifically includes the following steps S101 to S103.

In step S101, a surround view composite image of the surroundings of the vehicle is acquired.

Specifically, fisheye cameras may be installed around the vehicle. For example, one fisheye camera may be installed at each of the front, rear, left, and right of the vehicle to collect images of the vehicle's surroundings in real time, and the images are then stitched to obtain the surround view composite image.

The stitched surround view composite image covers a 360-degree viewing angle around the vehicle body and offers a wider field of view. If any one fisheye camera fails, the images from the other three can still be stitched and used for detection, so the detection function is not lost and robustness is high. The fisheye cameras on the vehicle body are mounted essentially facing the ground, so they capture good images of the ground around the vehicle body and are particularly suitable for detecting natural positioning elements on the ground. In addition, detection based on the surround view composite image removes the influence of fisheye distortion, intrinsic and extrinsic parameters, and mounting position, and therefore generalizes well.
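The tolerance to a failed camera can be illustrated with a toy compositing routine. It assumes that each fisheye image has already been undistorted and projected onto a common top-down grid, with `None` marking pixels a camera does not see; that preprocessing, the camera names, and the "first available camera wins" rule are assumptions for illustration, not the stitching method of the source.

```python
from typing import Dict, List, Optional

# A top-down view on the common grid; None marks pixels the camera cannot see.
Image = List[List[Optional[int]]]


def compose_surround_view(
    views: Dict[str, Optional[Image]],  # camera name -> projected view, or None if failed
    height: int,
    width: int,
) -> Image:
    """Fill each pixel of the composite from the first camera that covers it."""
    composite: Image = [[None] * width for _ in range(height)]
    for name in ("front", "rear", "left", "right"):
        view = views.get(name)
        if view is None:
            continue  # failed or missing camera: the others still contribute
        for r in range(height):
            for c in range(width):
                if composite[r][c] is None and view[r][c] is not None:
                    composite[r][c] = view[r][c]
    return composite
```

With one camera missing, the composite simply has uncovered pixels in that camera's region, and detection can still run on the rest of the image.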

In step S102, the surround view composite image is detected to determine at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel of the surround view composite image belongs.

To avoid the wasted labor cost of modifying the scene on a large scale and manually customizing positioning markers, and the resulting inability to mass-produce, the embodiment of the present application uses natural elements that already exist on the ground around the vehicle as positioning elements. For example, the positioning element may include at least one of a parking space, a parking space number, a lane line, a ground arrow, a speed bump, and a sidewalk. These natural positioning elements already exist, can serve as positioning elements without modifying the site, and play the same role as manually customized positioning markers.

Of course, the embodiment of the present application is not limited to the types of positioning element listed above. Other natural elements present on the ground can also be detected and identified as positioning elements, and the embodiment of the present application places no limitation on this.

Positioning elements can be detected with a deep neural network model. Accordingly, as a variation, step S102 of detecting the surround view composite image to determine at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel of the surround view composite image belongs may include: detecting the surround view composite image using a pre-trained deep neural network model and performing semantic segmentation on each pixel of the surround view composite image, to determine information on the at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel of the surround view composite image belongs.

Here, the model may use a one-stage, anchor-free (no candidate boxes), multi-task joint deep neural network algorithm from target detection, which can output multiple targets simultaneously in a single pass of the network model. For positioning elements of different sizes in the surround view composite image, the model predicts on feature maps of different scales. Small feature maps have a larger receptive field and are suited to predicting larger target objects such as parking spaces and sidewalks. Large feature maps retain more detailed features and are suited to predicting small objects such as lane lines and ground arrows, as well as details such as object key points and edges. Performing target detection for different sizes on feature maps of different scales thus achieves the effect of multi-task joint detection.
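The relationship between object size and feature map scale can be sketched with a simple assignment rule. The strides, size boundaries, and function names below are hypothetical values chosen for illustration; the source does not state how targets are assigned to scales.

```python
from typing import Sequence


def feature_map_size(input_size: int, stride: int) -> int:
    """Side length of the feature map produced at a given downsampling stride."""
    return input_size // stride


def pick_stride(
    object_size_px: float,
    strides: Sequence[int] = (8, 16, 32),     # assumed pyramid strides
    boundaries: Sequence[float] = (64, 128),  # assumed size thresholds in pixels
) -> int:
    """Assign larger objects to coarser (smaller) feature maps."""
    for stride, upper in zip(strides, boundaries):
        if object_size_px < upper:
            return stride
    return strides[-1]
```

Under these assumed values, a 512-pixel input gives a 64 x 64 map at stride 8 for small targets such as ground arrows and a 16 x 16 map at stride 32 for large targets such as parking spaces.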

The model can also perform semantic segmentation: each pixel of the surround view composite image is segmented and the semantic type to which it belongs is identified, for example whether the pixel is foreground or background, which foreground type it belongs to, and whether it is a parking space or a parking space number. In this way, information on at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel of the surround view composite image belongs can be obtained.
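Per-pixel semantic types are commonly obtained by taking the highest-scoring class at each pixel. The sketch below assumes the network outputs one score per class per pixel; the class list and the function name are hypothetical.

```python
from typing import List

# Hypothetical class list; the source names these types but not their order.
CLASSES = ["background", "parking_space", "parking_space_number"]


def semantic_types(scores: List[List[List[float]]]) -> List[List[str]]:
    """scores[r][c][k] is the score of class k at pixel (r, c)."""
    return [
        [CLASSES[max(range(len(pixel)), key=pixel.__getitem__)] for pixel in row]
        for row in scores
    ]
```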

In step S103, the at least one positioning element is matched and fused using the semantic types to obtain a detection result for the positioning element.

To improve the accuracy of positioning element detection, the embodiment of the present application detects the positioning elements on the one hand and performs semantic segmentation on the other, and uses the segmentation result to match the type of each positioning element against the semantic type of the pixel at the same position. If the types do not match, the two results disagree and must be fused; the fused result is more accurate. For example, the semantic segmentation result may be selected as the final detection result, or the detection result and the semantic segmentation result may be fused by weighting to form the final result. The embodiment of the present application does not limit the fusion method.

Matching and fusion based on semantic types avoids the accuracy problem of detecting with the model alone, gives higher detection accuracy after fusion, and avoids the influence of noise. Moreover, for positioning elements that the model detects incompletely, semantic segmentation is more robust, so matching and fusion with semantic types makes the detection of positioning elements more complete.
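The weighted fusion option can be sketched as a per-type weighted sum of the detector's scores and the segmentation scores at the same position. The score dictionaries, the `det_weight` parameter, and the function name are assumptions; setting `det_weight` to 0 reduces to taking the semantic segmentation result as the final result.

```python
from typing import Dict


def fuse_type(
    det_scores: Dict[str, float],  # type scores from the detection result
    seg_scores: Dict[str, float],  # type scores from semantic segmentation
    det_weight: float = 0.5,       # assumed weight of the detection result
) -> str:
    """Return the type with the highest weighted score."""
    types = sorted(set(det_scores) | set(seg_scores))
    fused = {
        t: det_weight * det_scores.get(t, 0.0)
        + (1 - det_weight) * seg_scores.get(t, 0.0)
        for t in types
    }
    return max(types, key=lambda t: fused[t])
```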

According to the technical solution of this embodiment, the surround view composite image is taken as input, targets that naturally exist on the ground around the vehicle are detected as positioning elements, and the positioning elements are matched and fused based on the semantic information of the pixels. This avoids the limited field of view of an ordinary sensor and requires no modification of or deployment at the parking lot, while the matching and fusion improves the accuracy and robustness of positioning element detection.

FIG. 2 is a schematic flowchart of a positioning element detection method according to a second embodiment of the present application. This embodiment is further optimized on the basis of the above embodiment. As shown in FIG. 2, the positioning element detection method specifically includes the following steps S201 to S204.

In step S201, a surround view composite image of the surroundings of the vehicle is acquired.

In step S202, the surround view composite image is detected using a pre-trained deep neural network model, semantic segmentation is performed on each pixel point of the surround view composite image, and the information of at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel point of the surround view composite image belongs are determined.

Here, the information of a positioning element includes at least the type and position information of the positioning element and the type and position information of the key points of the positioning element.

The types of positioning element include, for example, parking spaces, parking space numbers, lane lines, ground arrows, speed bumps and sidewalks. The position information includes, for example, the position of the detection frame of the positioning element. The key points can be determined in advance, before the model is trained, and represent feature points of the different types of positioning elements. For example, a corner point of a parking space, the left vertex of a parking space number, or a point on the edge of a lane line may be selected. The embodiment of the present application does not limit the selection of key points for the different types of positioning elements.

The detection of positioning elements and key points can be realized by a multi-task joint deep neural network model. Optionally, on top of the model's target classification and position regression, the embodiment of the present application adds a network branch for target key point detection, so as to further improve the accuracy of position regression in post-processing. That is, the deep neural network model includes a positioning element detection branch and a key point detection branch, where the positioning element detection branch is used to perform target classification and key point position regression for the positioning elements, and the key point detection branch is used to perform key point detection.

Specifically, the positioning element detection branch determines the detection frame of a positioning element by detection, identifies the type of the positioning element, and obtains the position of at least one key point of the positioning element within the detection frame by position regression. Because the position accuracy of key points obtained by key point detection is always higher than that of key points calculated by position regression, the embodiment of the present application detects key points using the key point detection branch and then fuses the key points obtained by position regression with the key points obtained by key point detection. The position information of the key points of the positioning element is finally determined in this way, which improves the position accuracy of the finally determined key points.
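As a non-limiting illustration, the fusion of regressed and detected key points might be sketched as follows. The application does not specify how the two sets are fused. The nearest-neighbour association, the search radius `max_dist` and the weight `w_detected` are assumptions made for this sketch, and `fuse_keypoints` is a hypothetical name.

```python
import math


def fuse_keypoints(regressed, detected, max_dist=10.0, w_detected=0.8):
    """Fuse key points from position regression with detected key points.

    Both inputs are lists of (kind, x, y) tuples. Each regressed key point is
    paired with the nearest detected key point of the same kind within
    max_dist pixels, and the pair is blended with weight w_detected on the
    detected position. Unpaired regressed key points are kept unchanged.
    """
    fused = []
    for kind, rx, ry in regressed:
        best, best_d = None, max_dist
        for dkind, dx, dy in detected:
            d = math.hypot(dx - rx, dy - ry)
            if dkind == kind and d <= best_d:
                best, best_d = (dx, dy), d
        if best is None:
            fused.append((kind, rx, ry))
        else:
            fx = w_detected * best[0] + (1.0 - w_detected) * rx
            fy = w_detected * best[1] + (1.0 - w_detected) * ry
            fused.append((kind, fx, fy))
    return fused
```

The larger weight on the detected position reflects the statement above that key point detection is the more accurate of the two sources. The value 0.8 is arbitrary.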

In step S203, the position information of the key points is combined with the pixel position of each pixel point of the surround view composite image, and the type of each key point is matched against the semantic type of the pixel point at the same position.

In step S204, the position information of the key points of each positioning element is calibrated based on the matching result and a preset fusion policy, and the detection result of the positioning elements is acquired.

Here, if the type of a key point and the semantic type of the pixel point at the same position do not match, this indicates that the detected positioning element and key point are not accurate. Accordingly, a fusion policy based on the semantic segmentation result may be selected to calibrate the position information of the key points of the positioning element. Alternatively, the position results of the same type at the same position, obtained respectively by semantic segmentation and by model detection, may be fused by weighting to calibrate the position information of the key points. Through such matching fusion, accurate key point positions are finally obtained, and a more accurate detection result of the positioning elements is acquired.
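As a non-limiting illustration, steps S203 and S204 might be sketched as follows for a single key point. The search window `radius`, the weight `w_seg` and the function name `calibrate_keypoint` are hypothetical. Snapping to the nearest pixel of the expected semantic type is only one way to derive a position from the segmentation result, which the application does not prescribe.

```python
def calibrate_keypoint(kp_kind, x, y, seg_labels, radius=3, w_seg=1.0):
    """Calibrate one key point position against the segmentation result.

    seg_labels[row][col] holds the semantic type of each pixel. If the label
    at (x, y) already equals kp_kind, the position is kept. Otherwise the
    nearest pixel of type kp_kind within the window is located and blended
    with the detected position using weight w_seg.
    """
    height, width = len(seg_labels), len(seg_labels[0])
    if seg_labels[y][x] == kp_kind:
        return (float(x), float(y))          # types match, no calibration
    best, best_d2 = None, None
    for yy in range(max(0, y - radius), min(height, y + radius + 1)):
        for xx in range(max(0, x - radius), min(width, x + radius + 1)):
            if seg_labels[yy][xx] == kp_kind:
                d2 = (xx - x) ** 2 + (yy - y) ** 2
                if best_d2 is None or d2 < best_d2:
                    best, best_d2 = (xx, yy), d2
    if best is None:
        return (float(x), float(y))          # nothing to calibrate against
    return (w_seg * best[0] + (1.0 - w_seg) * x,
            w_seg * best[1] + (1.0 - w_seg) * y)
```

With `w_seg=1.0` the segmentation-based position is used outright. Smaller values give a weighted fusion of the two positions.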

According to the technical solution of the embodiment of the present application, a surround view composite image is taken as input, targets that naturally exist on the ground around the vehicle are detected as positioning elements, and the positioning elements are matched and fused based on the semantic information of the pixel points. This not only avoids the problem that the field of view of ordinary sensors is limited, but also requires no modification or installation on the parking lot side. At the same time, the positions of the key points of the positioning elements are further calibrated in combination with the semantic segmentation result of the pixel points, which improves the accuracy of positioning element detection.

FIG. 3 is a schematic flowchart of a positioning element detection method according to a third embodiment of the present application. This embodiment is further optimized on the basis of the above embodiments. As shown in FIG. 3, the positioning element detection method specifically includes the following steps S301 to S307.

In step S301, a surround view composite image of the surroundings of the vehicle is acquired.

In step S302, the surround view composite image is detected using a pre-trained deep neural network model, semantic segmentation is performed on each pixel point of the surround view composite image, and the information of at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel point of the surround view composite image belongs are determined. Here, the information of a positioning element includes at least the type and position information of the positioning element and the type and position information of the key points of the positioning element.

In step S303, when the type of the positioning element is a parking space number, a parking space number detection frame is extracted from the deep neural network model.

In step S304, the included angle between the horizontal axis of the image coordinate system of the surround view composite image and the line connecting the two parking space corner points that are close to the parking space number, in the parking space to which the parking space number detection frame belongs, is calculated.

In step S305, based on the center point of the parking space number detection frame and the included angle, the feature map of the parking space number corresponding to the parking space number detection frame in the deep neural network model is rotated so that the corresponding parking space number is horizontal in the image coordinate system after rotation.

In step S306, parking space number recognition is performed on the rotated feature map of the parking space number using a character classifier.

In step S307, the at least one positioning element is matched and fused using the semantic types, and the detection result of the positioning elements is acquired.

The operations of S301 to S306 above realize recognition of the parking space number. Among the positioning elements on the ground, the parking space number is very important information and is the only positioning element that carries a global ID. Therefore, the parking space number is also detected as a positioning element, and by recognizing the parking space number, the absolute position of the vehicle on the map can be determined and the positioning accuracy improved.

Specifically, the parking space number detection frame is first extracted. Next, the included angle between the horizontal axis of the image coordinate system of the surround view composite image and the line connecting the two parking space corner points close to the parking space number, in the parking space to which the detection frame belongs, is determined. The feature map of the parking space number corresponding to the detection frame in the model is then rotated according to this angle so that the corresponding parking space number becomes horizontal in the image coordinate system. Finally, the rotated feature map of the parking space number is input to the character classifier to recognize the parking space number.

In practice, the parking space number is usually located at the front end of the parking space, that is, the end that the vehicle passes first when entering the parking space, which is the side formed by the line connecting the two parking space corner points close to the parking space number. In addition, the parking space number is usually marked from left to right as seen by a person facing the parking space. Therefore, the deep neural network model identifies the left and right parking space corner points close to the parking space number, determines the line connecting the two corner points, and thereby determines the direction of rotation. This avoids the situation in which the parking space number in the rotated feature map is horizontal but its characters are upside down.
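As a non-limiting illustration, the angle calculation of step S304 and the rotation of step S305 might be sketched as follows. For brevity the sketch rotates individual points rather than a feature map. In practice the same rotation would be applied to the feature map, for example as an affine warp. The function names and the `(x, y)` tuple representation are hypothetical.

```python
import math


def number_angle(left_corner, right_corner):
    """Signed angle (radians) from the image x-axis to the left-to-right line.

    Corners are (x, y) pixel coordinates with y pointing down. Using the
    ordered pair (left, right) rather than an undirected line fixes the
    rotation direction and so avoids an upside-down result.
    """
    dx = right_corner[0] - left_corner[0]
    dy = right_corner[1] - left_corner[1]
    return math.atan2(dy, dx)


def rotate_about(point, center, angle):
    """Rotate point about center by -angle.

    After this rotation, the direction that made `angle` with the x-axis
    lies along the positive x-axis, i.e. the number reads horizontally.
    """
    c, s = math.cos(-angle), math.sin(-angle)
    px, py = point[0] - center[0], point[1] - center[1]
    return (center[0] + c * px - s * py, center[1] + s * px + c * py)
```

If the two corners are swapped, `number_angle` changes by 180 degrees. This is the inverted-characters case that identifying the left and right corners is meant to prevent.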

In addition, in the embodiment of the present application, the deep neural network model further includes a branch used specifically for detecting reflection regions. The embodiment of the present application does not limit the reflection region detection algorithm, and any algorithm in the prior art may be adopted.

That is, the positioning element detection method according to this embodiment further includes a step of detecting reflection regions present in the surround view composite image using the deep neural network model. Correspondingly, before the step of acquiring the detection result of the positioning elements, the method may further include a step of filtering out, in combination with the reflection region detection result, positioning elements whose position information lies in a reflection region.

In indoor and outdoor environments, and particularly in indoor environments, reflections commonly occur, and the detection results for positioning elements in reflection regions are not accurate. Therefore, by detecting reflection regions, filtering the positioning elements using the detection result, and removing the positioning elements that appear in reflection regions, the accuracy of positioning element detection is further improved.
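As a non-limiting illustration, the filtering step might be sketched as follows. The sketch assumes the reflection branch outputs a per-pixel boolean mask and that each positioning element is a dictionary with a hypothetical `"box"` entry `(x0, y0, x1, y1)`. Testing only the box center is a simplification chosen for this sketch.

```python
def filter_reflections(elements, reflection_mask):
    """Drop positioning elements whose box center lies in a reflection region.

    reflection_mask[row][col] is True for pixels detected as reflective.
    """
    kept = []
    for element in elements:
        x0, y0, x1, y1 = element["box"]
        cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
        if not reflection_mask[cy][cx]:
            kept.append(element)
    return kept
```

A stricter variant could instead measure the fraction of the detection frame covered by the mask and discard the element above some threshold.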

As a modification, the output of the deep neural network model may include at least the information of the positioning elements, the key point detection information, the semantic segmentation information of each pixel point, and the reflection region information. When these outputs are sent to a post-processing module, the key point detection information can be accurately matched with the key points of the positioning elements, thereby improving the accuracy of key point detection. The accurately matched key points can then be matched and fused with the semantic segmentation information, likewise with the aim of further improving accuracy. The post-processing module can also recognize parking space numbers and filter the positioning elements using the reflection regions. In addition, for detected lane lines, the lane line points can be screened by point clustering, which improves the accuracy of lane line detection. The embodiment of the present application does not limit the execution order of the operations in the post-processing module.

According to the technical solution of the embodiment of the present application, a surround view composite image is taken as input, targets that naturally exist on the ground around the vehicle are detected as positioning elements, and the positioning elements are matched and fused based on the semantic information of the pixel points. This not only avoids the problem that the field of view of ordinary sensors is limited, but also requires no modification or installation on the parking lot side. At the same time, detecting the positioning elements in the surround view composite image with a deep neural network model realizes semantics-based feature point detection, which avoids the problem in the prior art that image feature points are unstable and easily affected by environmental factors, and gives better robustness. Detection of the parking space number further makes it possible to determine the absolute position of the vehicle and improve positioning accuracy. Finally, detection of reflection regions further improves the accuracy of positioning element detection.

FIG. 4 is a schematic configuration diagram of a positioning element detection device according to a fourth embodiment of the present application. The positioning element detection device can implement the positioning element detection method described in any embodiment of the present application.

As shown in FIG. 4, the positioning element detection device 400 specifically includes an image acquisition module 401 that acquires a surround view composite image of the surroundings of the vehicle, a positioning element detection module 402 that detects the surround view composite image and determines at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel point of the surround view composite image belongs, and a matching fusion module 403 that matches and fuses the at least one positioning element using the semantic types and acquires the detection result of the positioning elements.

As a modification, the positioning elements may include at least one of a parking space, a parking space number, a lane line, a ground arrow, a speed bump and a sidewalk.

As a modification, the positioning element detection module 402 is specifically used to detect the surround view composite image using a pre-trained deep neural network model, perform semantic segmentation on each pixel point of the surround view composite image, and determine the information of at least one positioning element present on the ground around the vehicle and the semantic type to which each pixel point of the surround view composite image belongs.

As a modification, the information of a positioning element may include at least the type and position information of the positioning element and the type and position information of the key points of the positioning element.

As a modification, the matching fusion module 403 may include a matching unit that combines the position information of the key points with the pixel position of each pixel point of the surround view composite image and matches the type of each key point against the semantic type of the pixel point at the same position, and a calibration unit that calibrates the position information of the key points of each positioning element based on the matching result and a preset fusion policy.

As a modification, the deep neural network model includes a positioning element detection branch and a key point detection branch, where the positioning element detection branch is used to perform target classification and key point position regression for the positioning elements, and the key point detection branch is used to perform key point detection. Correspondingly, the position information of the key points of a positioning element may be determined by fusing the key points obtained by position regression with the key points obtained by key point detection.

As a modification, the positioning element detection device further includes a parking space number detection module. Before the matching fusion module acquires the detection result of the positioning elements, the parking space number detection module may be used to perform the following operations: when the type of the positioning element is a parking space number, extracting a parking space number detection frame from the deep neural network model; calculating the included angle between the horizontal axis of the image coordinate system of the surround view composite image and the line connecting the two parking space corner points close to the parking space number, in the parking space to which the parking space number detection frame belongs; rotating, based on the center point of the parking space number detection frame and the included angle, the feature map of the parking space number corresponding to the parking space number detection frame in the deep neural network model so that the corresponding parking space number is horizontal in the image coordinate system after rotation; and performing parking space number recognition on the rotated feature map of the parking space number using a character classifier.

As a modification, the positioning element detection module 402 is further used to detect reflection regions present in the surround view composite image using the deep neural network model. Correspondingly, the positioning element detection device may further include a positioning element filtering module that, before the detection result of the positioning elements is acquired, filters out positioning elements whose position information lies in a reflection region, in combination with the reflection region detection result.

As a modification, the image acquisition module 401 may include an image acquisition unit that acquires the images collected by each of the fisheye cameras located around the vehicle, and an image synthesis unit that synthesizes the images to obtain the surround view composite image.

The positioning element detection device 400 disclosed in the embodiment of the present application can execute the positioning element detection method provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to the executed method. For content not described in detail in this embodiment, reference may be made to the description of any method embodiment of the present application.

According to the embodiments of the present application, the present application further provides an electronic device and a readable storage medium.

FIG. 5 is a block diagram of an electronic device for the positioning element detection method according to an embodiment of the present application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as portable information terminals, mobile phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely examples, and are not intended to limit the implementations of the present application described and/or claimed herein.

As shown in FIG. 5, the electronic device includes at least one processor 501, a memory 502, and interfaces for connecting the components, including a high-speed interface and a low-speed interface. The components are interconnected by different buses and may be mounted on a common motherboard or mounted in other ways as required. The processor can process instructions executed within the electronic device, including instructions stored in the memory for displaying graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In other embodiments, a plurality of processors and/or a plurality of buses may be used together with a plurality of memories, if necessary. Similarly, a plurality of electronic devices may be connected, with each device providing some of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). In FIG. 5, a single processor 501 is taken as an example.

The memory 502 is the non-transitory computer-readable storage medium provided by the present application. The memory 502 stores instructions executable by at least one processor, so that the at least one processor executes the positioning element detection method described above. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the positioning element detection method described above.

As a non-transitory computer-readable storage medium, the memory 502 is used to store non-transitory software programs, non-transitory computer-executable programs and modules, such as the program instructions/modules corresponding to the positioning element detection method according to the embodiment of the present application (for example, the image acquisition module 401, the positioning element detection module 402, and the matching fusion module 403 shown in FIG. 4). The processor 501 executes the various functional applications and data processing of the server by running the non-transitory software programs, instructions and modules stored in the memory 502, thereby implementing the positioning element detection method described above.

The memory 502 may include a program storage area and a data storage area. The program storage area can store an operating system and an application program required for at least one function. The data storage area can store data created through use of the electronic device that implements the positioning element detection method of the embodiment of the present application. In addition, the memory 502 may include high-speed random access memory, and may further include non-transitory memory, for example at least one magnetic disk storage device, a flash memory device, or another non-transitory solid-state storage device. In some embodiments, the memory 502 may optionally include memories located remotely from the processor 501, and these remote memories may be connected via a network to the electronic device that implements the positioning element detection method of the embodiment of the present application. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The electronic device that implements the positioning element detection method according to the embodiment of the present application may further include an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503, and the output device 504 may be connected via a bus or in other ways. In FIG. 5, connection via a bus is taken as an example.

The input device 503 can receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device that implements the positioning element detection method according to the embodiment of the present application. Examples of the input device include a touch screen, a keypad, a mouse, a trackpad, a touchpad, a pointing stick, one or more mouse buttons, a trackball, and a joystick. The output device 504 may include a display device, an auxiliary lighting device (for example, an LED), a haptic feedback device (for example, a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some embodiments, the display device may be a touch screen.

Various embodiments of the systems and techniques described herein can be realized in digital electronic circuit systems, integrated circuit systems, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in at least one computer program that can be executed and/or interpreted on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, and can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to that storage system, the at least one input device, and the at least one output device.

These computing programs (also referred to as programs, software, software applications, or code) may include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., a magnetic disk, an optical disk, a memory, or a programmable logic device (PLD)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or haptic feedback), and input from the user may be received in any form (including acoustic input, speech input, and tactile input).

The systems and techniques described herein can be implemented in a computing system that includes a back-end component (e.g., a data server), or a computing system that includes a middleware component (e.g., an application server), or a computing system that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with embodiments of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, and front-end components. The components of the system can be interconnected by digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.

A computer system may include a client and a server. The client and the server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other.

According to the technical solution of the embodiments of the present application, a surround view composite image is taken as input, targets that naturally exist on the ground around the vehicle are detected as positioning elements, and the positioning elements are matched and fused based on the semantic information of the pixel points. This not only avoids the problem that the field of view of ordinary sensors is limited, but also requires no modification or installation on the parking lot side. At the same time, a deep neural network model is used to detect the positioning elements in the surround view composite image, which achieves semantics-based feature point detection, avoids the problem in the prior art that image feature points are unstable and easily affected by environmental factors, and gives better robustness. In addition, by combining the result of the semantic segmentation of the pixel points, the positions of the key points of the positioning elements can be further calibrated, further improving the detection accuracy of the positioning elements. Detecting the parking space number makes it possible to further determine the absolute position of the vehicle and improve positioning accuracy. Finally, detecting the reflection region also further improves the accuracy of positioning element detection.
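As an illustration of the key point calibration summarized above, the following is a minimal sketch and not the applicant's implementation. The embodiments state only that key point types are matched against pixel semantic types and that positions are calibrated according to a preset fusion policy; the policy shown here (keep the key point if the label under it agrees, otherwise snap to the nearest agreeing pixel within a small window) is an assumption, as are the names `calibrate_keypoint`, `seg`, `kp_label`, and `radius`.

```python
import numpy as np


def calibrate_keypoint(seg, kp_xy, kp_label, radius=5):
    """Calibrate one key point against a per-pixel semantic label map.

    seg      : 2-D integer array, seg[y, x] is the semantic type of a pixel
    kp_xy    : (x, y) position of the detected key point
    kp_label : semantic type the key point is expected to lie on
    radius   : half-size of the search window (hypothetical policy parameter)

    Returns ((x, y), matched). `matched` is False when no pixel of the
    expected type is found, in which case the position is left unchanged.
    """
    h, w = seg.shape
    x, y = int(round(kp_xy[0])), int(round(kp_xy[1]))

    # Case 1: the pixel under the key point already has the expected type.
    if 0 <= y < h and 0 <= x < w and seg[y, x] == kp_label:
        return (x, y), True

    # Case 2: search a clamped window for the nearest pixel of that type.
    y0, x0 = max(0, y - radius), max(0, x - radius)
    y1 = max(y0, min(h, y + radius + 1))
    x1 = max(x0, min(w, x + radius + 1))
    ys, xs = np.nonzero(seg[y0:y1, x0:x1] == kp_label)
    if ys.size == 0:
        return (x, y), False

    ys, xs = ys + y0, xs + x0
    nearest = int(np.argmin((ys - y) ** 2 + (xs - x) ** 2))
    return (int(xs[nearest]), int(ys[nearest])), True


# Toy label map: type 2 (say, a parking space corner) at pixel x=6, y=4.
seg = np.zeros((10, 10), dtype=int)
seg[4, 6] = 2
corrected, matched = calibrate_keypoint(seg, (5.0, 4.0), kp_label=2)
```

A real fusion policy could instead weight the detector's confidence against the segmentation, or discard unmatched key points; the source does not specify this.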

It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order; no limitation is imposed herein as long as the desired result of the technical solution disclosed in the present application can be achieved.

The specific embodiments described above do not limit the scope of protection of the present application. Those skilled in the art can make various modifications, combinations, sub-combinations, and substitutions based on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of protection of the present application.

Claims

A positioning element detection method, comprising:
a step of acquiring a surround view composite image of the surroundings of a vehicle;
a step of detecting the surround view composite image to determine at least one positioning element existing on the ground around the vehicle and the semantic type to which each pixel point of the surround view composite image belongs; and
a step of matching and fusing the at least one positioning element using the semantic types to acquire a detection result of the positioning element.

The positioning element detection method according to claim 1, wherein the positioning element includes at least one of a parking space, a parking space number, a lane, a ground arrow, a deceleration zone, and a sidewalk.

The positioning element detection method according to claim 1 or 2, wherein the step of detecting the surround view composite image to determine the at least one positioning element existing on the ground around the vehicle and the semantic type to which each pixel point of the surround view composite image belongs includes:
a step of detecting the surround view composite image using a pre-trained deep neural network model and performing semantic segmentation on each pixel point of the surround view composite image, to determine information on the at least one positioning element existing on the ground around the vehicle and the semantic type to which each pixel point of the surround view composite image belongs.

The positioning element detection method according to claim 3, wherein the information on the positioning element includes at least the type and position information of the positioning element and the type and position information of the key points of the positioning element.

The positioning element detection method according to claim 4, wherein the step of matching and fusing the at least one positioning element using the semantic types includes:
a step of matching the type of each key point with the semantic type of the pixel point at the same position, by combining the position information of the key points with the pixel position of each pixel point of the surround view composite image; and
a step of calibrating the position information of the key points of each positioning element based on the matching result and a preset fusion policy.

The positioning element detection method according to claim 4, wherein the deep neural network model includes a positioning element detection branch and a key point detection branch, the positioning element detection branch being used to perform target classification and key point position regression for the positioning element, and the key point detection branch being used to detect the key points; and
correspondingly, the position information of the key points of the positioning element is determined by fusing the key points obtained by the position regression with the key points obtained by the key point detection.

The positioning element detection method according to claim 4, further comprising, before the step of acquiring the detection result of the positioning element:
a step of extracting a parking space number detection frame from the deep neural network model when the type of the positioning element is a parking space number;
a step of calculating the angle between the horizontal axis of the image coordinate system of the surround view composite image and the line connecting the two corner points, close to the parking space number, of the parking space to which the parking space number detection frame belongs;
a step of rotating, based on the center point of the parking space number detection frame and the angle, the feature map of the parking space number corresponding to the parking space number detection frame in the deep neural network model, so that the parking space number after rotation is horizontal in the image coordinate system; and
a step of recognizing the parking space number from the rotated feature map of the parking space number using a character classifier.

The positioning element detection method according to claim 4, further comprising:
a step of detecting a reflection region existing in the surround view composite image using the deep neural network model; and
correspondingly, before the step of acquiring the detection result of the positioning element, a step of filtering out, in combination with the detection result of the reflection region, any positioning element whose position information lies in the reflection region.

The positioning element detection method according to claim 1, wherein the step of acquiring the surround view composite image of the surroundings of the vehicle includes:
a step of acquiring images collected by fisheye cameras located around the vehicle; and
a step of synthesizing the images to acquire the surround view composite image.

A positioning element detection device, comprising:
an image acquisition module that acquires a surround view composite image of the surroundings of a vehicle;
a positioning element detection module that detects the surround view composite image and determines at least one positioning element existing on the ground around the vehicle and the semantic type to which each pixel point of the surround view composite image belongs; and
a matching fusion module that matches and fuses the at least one positioning element using the semantic types and acquires a detection result of the positioning element.

The positioning element detecting device according to claim 10, wherein the positioning element includes at least one of a parking space, a parking space number, a lane, a ground arrow, a deceleration zone, and a sidewalk.

The positioning element detection device according to claim 10 or 11, wherein the positioning element detection module is specifically used to:
detect the surround view composite image using a pre-trained deep neural network model and perform semantic segmentation on each pixel point of the surround view composite image, to determine information on the at least one positioning element existing on the ground around the vehicle and the semantic type to which each pixel point of the surround view composite image belongs.

The positioning element detection device according to claim 12, wherein the information on the positioning element includes at least the type and position information of the positioning element and the type and position information of the key points of the positioning element.

The positioning element detection device according to claim 13, wherein the matching fusion module includes:
a matching unit that matches the type of each key point with the semantic type of the pixel point at the same position, by combining the position information of the key points with the pixel position of each pixel point of the surround view composite image; and
a calibration unit that calibrates the position information of the key points of each positioning element based on the matching result and a preset fusion policy.

The positioning element detection device according to claim 13, wherein the deep neural network model includes a positioning element detection branch and a key point detection branch, the positioning element detection branch being used to perform target classification and key point position regression for the positioning element, and the key point detection branch being used to detect the key points; and
correspondingly, the position information of the key points of the positioning element is determined by fusing the key points obtained by the position regression with the key points obtained by the key point detection.

The positioning element detection device according to claim 13, further comprising a parking space number detection module that, before the matching fusion module acquires the detection result of the positioning element, is specifically used to perform the following operations:
extracting a parking space number detection frame from the deep neural network model when the type of the positioning element is a parking space number;
calculating the angle between the horizontal axis of the image coordinate system of the surround view composite image and the line connecting the two corner points, close to the parking space number, of the parking space to which the parking space number detection frame belongs;
rotating, based on the center point of the parking space number detection frame and the angle, the feature map of the parking space number corresponding to the parking space number detection frame in the deep neural network model, so that the parking space number after rotation is horizontal in the image coordinate system; and
recognizing the parking space number from the rotated feature map of the parking space number using a character classifier.

The positioning element detection device, wherein the positioning element detection module 402 is used to detect a reflection region existing in the surround view composite image using the deep neural network model.

The positioning element detection device according to claim 10, wherein the image acquisition module includes:
an image acquisition unit that acquires images collected by fisheye cameras located around the vehicle; and
an image composition unit that synthesizes the images and acquires the surround view composite image.

An electronic device, comprising:
a fisheye camera located around the electronic device to collect images;
at least one processor; and
a memory communicably connected to the at least one processor,
wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the positioning element detection method according to any one of claims 1 to 9.

A non-transitory computer-readable storage medium storing computer instructions,
wherein the computer instructions are used to cause a computer to execute the positioning element detection method according to any one of claims 1 to 9.

A computer program that, when run on a computer, causes the computer to execute the positioning element detection method according to any one of claims 1 to 9.
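The geometry behind the parking space number rotation recited in the claims above can be sketched as follows. This is a hedged illustration only: it computes the angle between the line through two corner points and the horizontal image axis, then builds an affine matrix that rotates about the detection frame center so that the line becomes horizontal. The function names `line_angle`, `rotation_about`, and `apply_affine`, and the sample coordinates, are hypothetical; resampling the actual feature map inside the network is omitted.

```python
import math

import numpy as np


def line_angle(p1, p2):
    """Angle in radians between the line p1 -> p2 and the image x-axis."""
    return math.atan2(p2[1] - p1[1], p2[0] - p1[0])


def rotation_about(center, angle):
    """2x3 affine matrix that rotates points by -angle about `center`.

    A line that makes `angle` with the x-axis becomes horizontal.
    """
    c, s = math.cos(-angle), math.sin(-angle)
    cx, cy = center
    # p' = R (p - center) + center, written as a single affine map.
    return np.array([
        [c, -s, cx - c * cx + s * cy],
        [s, c, cy - s * cx - c * cy],
    ])


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return matrix @ np.array([x, y, 1.0])


# Two parking space corner points near the number (assumed sample values)
# and the center of the parking space number detection frame.
corner_a, corner_b = (10.0, 10.0), (20.0, 20.0)
frame_center = (15.0, 15.0)

theta = line_angle(corner_a, corner_b)
M = rotation_about(frame_center, theta)
qa, qb = apply_affine(M, corner_a), apply_affine(M, corner_b)
```

After the transform, `qa` and `qb` share the same y coordinate, so text aligned with that line would read horizontally before being passed to a character classifier.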