JP2017162438A

JP2017162438A - Risk prediction method

Info

Publication number: JP2017162438A
Application number: JP2016214701A
Authority: JP
Inventors: Kazuki Kozuka (小塚 和紀); Yasunori Ishii (石井 育規); Masahiko Saito (齊藤 雅彦); Tetsuji Fuchigami (渕上 哲司)
Original assignee: Panasonic Intellectual Property Corp of America
Current assignee: Panasonic Intellectual Property Corp of America
Priority date: 2016-03-11
Filing date: 2016-11-01
Publication date: 2017-09-14

Abstract

PROBLEM TO BE SOLVED: To provide a risk prediction method capable of predicting a dangerous area where a traveling vehicle is likely to be exposed to danger. SOLUTION: The risk prediction method, executed by a computer of a risk predictor using a convolutional neural network, includes: an acquisition step (S1) of causing the convolutional neural network to acquire an input image captured by an in-vehicle camera mounted on a vehicle; and an output step (S2) of causing the convolutional neural network to estimate a dangerous area in the input image acquired in the acquisition step, the dangerous area being an area where a moving object may appear on the travel route of the vehicle and collide with the vehicle if the vehicle keeps traveling as is, together with a feature of the dangerous area, and to output them as a risk level predicted for the input image. SELECTED DRAWING: Figure 5

Description

The present invention relates to a risk prediction method.

For example, Patent Literature 1 discloses a driving support device for supporting the safety checks of a driver who is driving a vehicle. According to this driving support technology, a dangerous moving object that suddenly changes speed contrary to the driver's expectation and is highly likely to cause an accident, such as a pedestrian who suddenly starts running when a traffic light at an intersection changes from green to red, can be detected early and accurately.

Patent Literature 1: Japanese Patent No. 4967015

Non-Patent Literature 1: Jonathan Long, Evan Shelhamer, Trevor Darrell, “Fully Convolutional Networks for Semantic Segmentation”, The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 3431-3440

However, the driving support device of Patent Literature 1 merely predicts a dangerous state by observing changes in the speed of moving objects that are actually visible. In other words, the driving support device of Patent Literature 1 cannot detect an area in which danger is likely to arise (a dangerous area), such as the area behind a stopped bus, from which a pedestrian who has gotten off the bus may dart out and collide with the vehicle.

The present disclosure has been made in view of the above circumstances, and an object thereof is to provide a risk prediction method capable of predicting a dangerous area in which danger is likely to arise for a traveling vehicle.

In order to solve the above problem, a risk prediction method according to one aspect of the present disclosure is a risk prediction method performed by a computer of a risk predictor using a convolutional neural network, and includes: an acquisition step of causing the convolutional neural network to acquire an input image captured by an in-vehicle camera mounted on a vehicle; and an output step of causing the convolutional neural network to estimate a dangerous area in the input image acquired in the acquisition step, the dangerous area being an area in which a moving object may appear on the travel route of the vehicle and collide with the vehicle if the vehicle continues traveling as is, and a feature of the dangerous area, and to output them as a risk level predicted for the input image.

These general or specific aspects may be realized by a system, a method, an integrated circuit, a computer program, or a computer-readable recording medium such as a CD-ROM, or by any combination of a system, a method, an integrated circuit, a computer program, and a recording medium.

According to the present disclosure, it is possible to realize a risk prediction method capable of predicting a dangerous area in which danger is likely to arise for a traveling vehicle.

FIG. 1 is a block diagram showing an example of the configuration of the risk predictor according to Embodiment 1.
FIG. 2 is a diagram schematically showing the structure of the convolutional neural network used by the risk predictor shown in FIG. 1.
FIG. 3 is a diagram showing an example of a prediction result of the risk predictor according to Embodiment 1.
FIG. 4 is a diagram showing another example of a prediction result of the risk predictor according to Embodiment 1.
FIG. 5 is a flowchart showing an example of the prediction process of the risk predictor according to Embodiment 1.
FIG. 6 is a flowchart outlining the learning process of the convolutional neural network according to Embodiment 1.
FIG. 7A is an explanatory diagram of the learning data prepared in step S1.
FIG. 7B is an explanatory diagram of the learning data prepared in step S1.
FIG. 7C is an explanatory diagram of the learning data prepared in step S1.
FIG. 7D is an explanatory diagram of the learning data prepared in step S1.
FIG. 8 is a diagram showing an example of a dangerous area and its feature output as a result of the learning process.
FIG. 9 is a block diagram showing an example of the configuration of the learning system in the example.
FIG. 10 is a flowchart showing the process of creating annotated data performed by the learning data creation device.
FIG. 11 is a flowchart showing the learning process performed by the learning device.
FIG. 12A is another example of the annotated image prepared in step S1.
FIG. 12B is a diagram showing another example of a dangerous area and its feature output as a result of the learning process.
FIG. 13 is a diagram showing an example of detailed area information.
FIG. 14A is an example of a learning image.
FIG. 14B is an example of an annotated image to which an annotation indicating a dangerous area has been added.
FIG. 15A is an example of a learning image.
FIG. 15B is an example of an annotated image to which an annotation indicating a dangerous area has been added.
FIG. 16A is an example of a learning image.
FIG. 16B is an example of an annotated image to which an annotation indicating a dangerous area has been added.
FIG. 17A is an example of a learning image.
FIG. 17B is an example of an annotated image to which an annotation indicating a dangerous area has been added.
FIG. 18 is a flowchart outlining the two-stage learning process in Embodiment 2.
FIG. 19 is an explanatory diagram for conceptually describing the first-stage learning process.
FIG. 20 is an explanatory diagram for conceptually describing the second-stage learning process.
FIG. 21 is another explanatory diagram for conceptually describing the first-stage learning process.

(Background leading to one aspect of the present invention)
In recent years, the performance of image recognition has improved dramatically through the use of deep learning. Deep learning is known as a machine learning methodology that uses multilayer neural networks, and a convolutional neural network (CNN) is often used as such a multilayer neural network. A convolutional neural network is a multilayer neural network that repeats convolution and pooling over local regions.

A convolutional neural network has a convolution layer that extracts features by convolution with a plurality of filters, a pooling layer that obtains local invariance of the data by pooling, which aggregates the responses in a certain region, and a fully connected layer that performs recognition using probabilities computed by, for example, a Softmax function.
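The three building blocks named above can be illustrated with a minimal pure-Python sketch. The function names `conv2d`, `max_pool`, and `softmax` are illustrative assumptions and do not appear in the publication; as in most deep learning frameworks, the "convolution" is computed as a sliding dot product without flipping the kernel.

```python
import math

def conv2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]

def max_pool(feature, size=2):
    """Non-overlapping max pooling: keep the strongest response per window."""
    return [[max(feature[y + i][x + j]
                 for i in range(size) for j in range(size))
             for x in range(0, len(feature[0]) - size + 1, size)]
            for y in range(0, len(feature) - size + 1, size)]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

In a complete network, a fully connected layer would map the pooled features to one score per class before `softmax` is applied.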

However, image recognition processing using such a convolutional neural network has the problem that it cannot be executed in real time.

In response, Non-Patent Literature 1, for example, proposes a convolutional neural network structure in which the fully connected layers of the convolutional neural network are replaced with convolution layers. By using a convolutional neural network having this structure (a fully convolutional neural network), image recognition processing can be executed in real time.
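Why a fully connected layer can be replaced with a convolution layer can be seen in a small sketch: a fully connected layer over a K x K patch is a dot product, and sliding the same weights over a larger image yields one score per window. The names `dense_score` and `dense_as_conv` are hypothetical, and a single output unit with a square weight matrix is assumed for brevity.

```python
def dense_score(patch, weights, bias):
    """Fully connected unit on one K x K patch: a flattened dot product."""
    return bias + sum(p * w
                      for patch_row, weight_row in zip(patch, weights)
                      for p, w in zip(patch_row, weight_row))

def dense_as_conv(image, weights, bias):
    """Apply the same weights at every K x K window of a larger image.

    Assumes `weights` is a square K x K matrix. The result is a score map
    instead of a single score, which is what lets a fully convolutional
    network label every location in one pass.
    """
    k = len(weights)
    height, width = len(image), len(image[0])
    return [[dense_score([row[x:x + k] for row in image[y:y + k]],
                         weights, bias)
             for x in range(width - k + 1)]
            for y in range(height - k + 1)]
```

When the image is exactly K x K, `dense_as_conv` returns a 1 x 1 map holding the same value as `dense_score`, which is the sense in which the two layers are interchangeable.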

Accordingly, the inventor(s) conceived of solving the above problem by using a fully convolutional neural network.

That is, a risk prediction method according to one aspect of the present disclosure is a risk prediction method performed by a computer of a risk predictor using a convolutional neural network, and includes: an acquisition step of causing the convolutional neural network to acquire an input image captured by an in-vehicle camera mounted on a vehicle; and an output step of causing the convolutional neural network to estimate a dangerous area in the input image acquired in the acquisition step, the dangerous area being an area in which a moving object may appear on the travel route of the vehicle and collide with the vehicle if the vehicle continues traveling as is, and a feature of the dangerous area, and to output them as a risk level predicted for the input image.

This makes it possible to estimate the risk level for the acquired input image, and thus to realize a risk prediction method capable of predicting a dangerous area in which danger is likely to arise for a traveling vehicle.

Here, for example, in the output step, the convolutional neural network may be caused to estimate, as the feature of the dangerous area, the degree of danger of the dangerous area, and to output, as the risk level, a likelihood map indicating the estimated dangerous area and the degree of danger of the dangerous area.

Also, for example, in the output step, the convolutional neural network may be caused to estimate, as the feature of the dangerous area, the type of the moving object associated with the dangerous area, and to output the estimated dangerous area and the type as the risk level.

Also, for example, the risk prediction method may further include a learning step performed before the acquisition step, in which, using a learning image that includes the dangerous area and an annotated image in which an annotation indicating the dangerous area has been added to the learning image, the convolutional neural network is caused to learn weights of the convolutional neural network for estimating the dangerous area in the learning image and the feature of the dangerous area.

Also, for example, the learning step may include a first learning step and a second learning step. In the first learning step, using dangerous area images, each being a partial region of a learning image to which the annotation indicating the dangerous area has been added, and safe area images, each being a partial region to which the annotation has not been added, a first neural network, which is a convolutional neural network having a fully connected layer, is caused to learn first weights of the first neural network for determining whether the partial region is a safe area or a dangerous area. In the second learning step, the initial values of the weights of a second neural network, whose configuration is that of the first neural network with the fully connected layer changed to a convolution layer, are updated to the first weights learned in the first learning step; then, using the learning image that includes the dangerous area and the annotated image in which the annotation indicating the dangerous area has been added to the learning image, the second neural network is caused to learn second weights of the second neural network for estimating the dangerous area in the learning image and the feature of the dangerous area, thereby learning the weights of the convolutional neural network, which has the same configuration as the second neural network.
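The input to the first learning step, dangerous area images and safe area images cut from partial regions of a learning image, might be prepared along the following lines. This is only a sketch under stated assumptions: the annotation is taken to be a binary mask (1 = annotated as dangerous), patches are square, and a patch that is only partly annotated is skipped. None of these choices, nor the names `crop` and `make_patches`, come from the publication.

```python
def crop(grid, top, left, size):
    """Cut a size x size window out of a nested-list image or mask."""
    return [row[left:left + size] for row in grid[top:top + size]]

def make_patches(image, mask, size, stride):
    """Split a learning image into dangerous and safe patches.

    A patch counts as dangerous when every mask pixel in it is 1 and as
    safe when every mask pixel is 0; mixed patches are skipped (assumption).
    """
    dangerous, safe = [], []
    for top in range(0, len(image) - size + 1, stride):
        for left in range(0, len(image[0]) - size + 1, stride):
            labels = [v for row in crop(mask, top, left, size) for v in row]
            patch = crop(image, top, left, size)
            if all(v == 1 for v in labels):
                dangerous.append(patch)
            elif all(v == 0 for v in labels):
                safe.append(patch)
    return dangerous, safe
```

The two lists would then serve as the two classes for training the first neural network as a patch classifier.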

Also, for example, the risk prediction method may further include an assigning step of acquiring a plurality of temporally consecutive images captured by an in-vehicle camera mounted on a vehicle, determining a dangerous area included in at least some of the acquired images, the dangerous area being an area in which a moving object may appear on the travel route of the vehicle and collide with the vehicle if the vehicle continues traveling as is, and assigning an annotation indicating the determined dangerous area to the at least some images. In the learning step, the at least some images to which the annotation was assigned in the assigning step, and the images among the plurality of images that correspond to the at least some images, may be acquired as the learning images and the annotated images, and the weights of the convolutional neural network may be learned.
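One way to picture the assigning step is to take a time-ordered list of frames, locate the frame in which the moving object appeared, and attach the annotation to an earlier frame. The sketch below assumes the region and the appearance index are already known (for example, marked by a human annotator); the function name, the `lead` parameter, and the dictionary layout are hypothetical.

```python
def annotate_before_appearance(frames, appear_index, region, lead=1):
    """Pair an earlier frame with a dangerous-area annotation.

    frames       -- time-ordered images from the in-vehicle camera
    appear_index -- index of the frame in which the moving object appeared
    region       -- area the object emerged from (format is an assumption)
    lead         -- how many frames before the appearance to annotate
    """
    target = appear_index - lead
    if target < 0 or appear_index >= len(frames):
        raise ValueError("appearance index or lead is out of range")
    annotation = {"frame_index": target, "dangerous_area": region}
    return frames[target], annotation
```

For instance, with four frames and an object appearing in frame 2, a `lead` of 1 selects frame 1 as the learning image and attaches the region to it.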

Here, for example, the dangerous area is an area that includes part of the region of an occluding object present in the learning image, and is an area in which the moving object is hidden before it emerges from behind the occluding object onto the travel route.

Also, for example, the dangerous area may be an area between two or more moving objects, including a person, that would cross the travel route of the vehicle if they approached each other.

Also, for example, the annotation may indicate the dangerous area and the category of the moving object associated with the dangerous area.

Also, for example, the annotation may indicate the dangerous area and control information including the brake strength or steering wheel angle of the vehicle at the time the learning image was captured.

Also, for example, the annotation may be segment information of the dangerous area.
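The annotation variants listed above could be held in a single record with optional fields, as in the sketch below. The class name, field names, units, and the rectangle convention are all assumptions made for illustration; the publication does not specify a data format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class DangerAnnotation:
    # Dangerous area as (top, left, height, width) in pixels (assumed layout).
    region: Tuple[int, int, int, int]
    # Category of the moving object associated with the area, e.g. "person".
    category: Optional[str] = None
    # Control information at the time the learning image was captured.
    brake_strength: Optional[float] = None
    steering_angle: Optional[float] = None
    # Per-pixel segment information of the dangerous area, if available.
    segment_mask: Optional[List[List[int]]] = None
```

A record such as `DangerAnnotation(region=(0, 0, 2, 2), category="person")` then carries only the fields that a given annotation variant uses.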

Each of the embodiments described below shows a specific example of the present disclosure. The numerical values, shapes, components, steps, order of steps, and the like shown in the following embodiments are merely examples and are not intended to limit the present disclosure. Among the components in the following embodiments, those not recited in the independent claims, which represent the broadest concept, are described as optional components. The contents of all the embodiments may also be combined.

(Embodiment 1)
Hereinafter, the risk prediction method and the like of the risk predictor 10 according to Embodiment 1 will be described with reference to the drawings.

[Configuration of risk predictor 10]
FIG. 1 is a block diagram showing an example of the configuration of the risk predictor 10 in the present embodiment. FIG. 2 is a diagram schematically showing the structure of the convolutional neural network used by the risk predictor 10 shown in FIG. 1. FIG. 3 is a diagram showing an example of a prediction result of the risk predictor 10 in the present embodiment.

The risk predictor 10 shown in FIG. 1 is a risk predictor that uses a convolutional neural network and is realized by a computer or the like. The convolutional neural network used by the risk predictor 10 is, for example, as shown in FIG. 2, a fully convolutional neural network made up of a convolution layer 101 and a convolution layer 102 that takes the place of the fully connected layer.

Convolutional neural networks are widely used in the field of image recognition and extract features from an image by convolving a two-dimensional image with filters. As described above, a convolutional neural network is a multilayer network that repeats convolution and pooling. The coefficients (weights) of the filters that constitute the convolution layers and are effective for discrimination are learned using a large amount of learning data, such as a large number of learning images. These coefficients (weights) are obtained by learning, from a large amount of data, invariance to various deformations through repeated convolution with filters and pooling, which aggregates the responses in a certain region. The discrimination performance of a convolutional neural network is known to depend on the filters that constitute its convolution layers.

When an input image captured by an in-vehicle camera is input, the risk predictor 10 shown in FIG. 1 uses a convolutional neural network such as that shown in FIG. 2 to estimate the dangerous area for the traveling vehicle in the input image and its feature, and outputs them as the predicted risk level. The risk predictor 10 predicts a higher risk level the more closely a region resembles a partial region of an image captured immediately before an image in which a person darted out and danger arose in the past, that is, a region that led to danger in the past.

More specifically, the risk predictor 10 shown in FIG. 1 acquires an input image captured by an in-vehicle camera mounted on a vehicle. Using the convolutional neural network, the risk predictor 10 estimates a dangerous area in the acquired input image, namely an area in which a moving object may appear on the travel route of the vehicle and collide with the vehicle if the vehicle continues traveling as is, and a feature of the dangerous area. The risk predictor 10 then outputs them as the risk level predicted for the acquired input image. Here, the risk predictor 10 estimates, as the feature of the dangerous area, the degree of danger of the dangerous area, and outputs, as the risk level, a likelihood map indicating the estimated dangerous area and the degree of danger of the dangerous area.

In the present embodiment, suppose that the risk predictor 10 acquires, as the input image, an image 50 such as that shown in FIG. 3, that is, an image 50 that includes a stopped bus 501 and persons 502 and 503, who are pedestrians. In this case, the risk predictor 10 estimates that the area behind the stopped bus 501, the area of the person 502, and the area of the person 503 in the image 50 are dangerous areas. It then outputs, as the predicted risk level, an image 50a in which the degree of danger (likelihood) of the estimated dangerous areas is superimposed on the image 50. In the image 50a, a likelihood 504 for the area behind the stopped bus 501, a likelihood 505 for the area of the person 502, and a likelihood 506 for the area of the person 503 are superimposed as the likelihood map. This likelihood indicates the risk that the vehicle will collide. In the present embodiment, this likelihood indicates a high risk even for an area in which no person is visible but from which a person is likely to dart out, treating it as an area that has led to danger.
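Superimposing a likelihood map on the input image, as in the image 50a, can be sketched as a per-pixel blend. The example assumes a grayscale image with values 0 to 255, likelihoods in the range 0 to 1, and a blend toward white; the function name `overlay_likelihood` and the `alpha` parameter are illustrative and not taken from the publication.

```python
def overlay_likelihood(image, likelihood, alpha=0.5):
    """Blend a likelihood map onto a grayscale image.

    Pixels with likelihood 0 are left unchanged; pixels with higher
    likelihood are pushed toward white (255) in proportion to
    alpha * likelihood.
    """
    return [[round((1 - alpha * p) * px + alpha * p * 255)
             for px, p in zip(image_row, likelihood_row)]
            for image_row, likelihood_row in zip(image, likelihood)]
```

A color overlay would work the same way per channel, with the highlight color replacing the constant 255.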

Note that the risk level output by the risk predictor 10 is not limited to a likelihood map indicating dangerous areas and their degree of danger as shown in FIG. 3. Another example is described below with reference to FIG. 4.

FIG. 4 is a diagram showing another example of a prediction result of the risk predictor 10 in the present embodiment. Elements that are the same as in FIG. 3 are given the same reference numerals, and their detailed description is omitted.

For the image 50 acquired as the input image, the risk predictor 10 estimates that an area 507 behind the stopped bus 501, an area 508 of the person 502, and an area 509 of the person 503 are dangerous areas, and outputs an image 50b in which the estimated dangerous areas and their categories are superimposed on the image 50. That is, the risk predictor 10 outputs an image 50b in which the area 507 and its category "dangerous area (car)", the area 508 and its category "dangerous area (person)", and the area 509 and its category "dangerous area (person)" are superimposed on the image 50. Here, in the image 50b, because the type of moving object associated with the area 507 is a car, "dangerous area (car)" is superimposed as its category. Similarly, because the type of moving object associated with the areas 508 and 509 is a person, "dangerous area (person)" is superimposed as their category.

In this way, the risk predictor 10 may estimate, as the feature of the dangerous area, the type of the moving object associated with the dangerous area, and output the estimated dangerous area and the type as the risk level.
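If the network is assumed to produce one score map per category, the category output described above can be derived by taking the best-scoring category at each pixel. This per-category score-map representation, the threshold, and the name `categorize` are assumptions for illustration; the publication does not state how the category is encoded.

```python
def categorize(score_maps, threshold=0.5):
    """Label each pixel with its best-scoring category.

    score_maps -- dict mapping a category name (e.g. "car", "person")
                  to a 2-D list of scores of identical size
    Pixels whose best score is below the threshold are labeled None,
    meaning that no dangerous area is predicted there.
    """
    names = list(score_maps)
    height = len(score_maps[names[0]])
    width = len(score_maps[names[0]][0])
    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            best = max(names, key=lambda n: score_maps[n][y][x])
            row.append(best if score_maps[best][y][x] >= threshold else None)
        labels.append(row)
    return labels
```

Connected pixels sharing a label would then form areas such as the areas 507 to 509, each carrying its category.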

[Prediction process of risk predictor 10]
Next, the prediction process of the risk predictor 10 according to the present embodiment will be described with reference to the drawings.

FIG. 5 is a flowchart showing an example of the prediction process of the risk predictor 10 in the present embodiment.

As shown in FIG. 5, the computer of the risk predictor 10 first causes the convolutional neural network to acquire an input image captured by an in-vehicle camera mounted on a vehicle (S1). In the present embodiment, the computer of the risk predictor 10 causes the convolutional neural network to acquire, for example, the image 50 shown in FIG. 3 described above as the input image.

Next, the computer of the risk predictor 10 causes the convolutional neural network to estimate a dangerous area and its feature and to output them as the risk level predicted for the input image (S2). More specifically, the computer of the risk predictor 10 causes the convolutional neural network to estimate a dangerous area in the input image acquired in S1, namely an area in which a moving object may appear on the travel route of the vehicle and collide with the vehicle if the vehicle continues traveling as is, and a feature of the dangerous area. It then causes the network to output them as the risk level predicted for the input image.

In the present embodiment, the computer of the risk predictor 10 causes the convolutional neural network to estimate, as the feature of the dangerous area, the degree of danger of the dangerous area, and to output, as the risk level, a likelihood map indicating the estimated dangerous area and the degree of danger of the dangerous area. For example, the computer of the risk predictor 10 causes the convolutional neural network to output the image 50a shown in FIG. 3 described above as the predicted risk level.

As described above, the computer of the risk predictor 10 may instead cause the convolutional neural network to estimate, as the feature of the dangerous area, the type of the moving object associated with the dangerous area, and to output the estimated dangerous area and the type as the risk level.
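The two steps S1 and S2 amount to passing the camera image to the network and returning its estimate. The sketch below only shows that control flow; `stub_network` is a placeholder that stands in for a trained convolutional neural network and is not a risk model, and both function names are assumptions.

```python
def predict_risk(camera_frame, network):
    """Run the two-step prediction process on one camera frame."""
    input_image = camera_frame              # S1: acquisition step
    likelihood_map = network(input_image)   # S2: estimation and output step
    return likelihood_map

def stub_network(image):
    """Placeholder for a trained network (assumption, for illustration only).

    Marks bright pixels with likelihood 1.0 so that the flow can be run
    end to end; a real network would compute the map from learned weights.
    """
    return [[1.0 if px > 128 else 0.0 for px in row] for row in image]
```

Calling `predict_risk(frame, stub_network)` returns a map of the same size as the frame, which is the form of output the likelihood map takes in this sketch.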

[Effects and the like of risk predictor 10]
As described above, the risk predictor 10 according to Embodiment 1 can estimate a dangerous area in an input image captured by an in-vehicle camera mounted on a vehicle and a feature of the dangerous area. In particular, the risk predictor 10 according to Embodiment 1 can predict, as a dangerous area in which danger is likely to arise for a traveling vehicle, even an area such as the area behind a stopped bus, where a pedestrian who has gotten off the bus is not visible but is likely to dart out.

Consequently, when, for example, a self-driving vehicle is equipped with the risk predictor 10, the vehicle can use images from its in-vehicle camera to predict a dangerous area in which danger is likely to arise and perform control to avoid the predicted dangerous area, which enables safer travel.

[Learning process of risk predictor 10]
The learning process for realizing such a risk predictor 10 is described below. In the following, the convolutional neural network that, once the learning process has been performed, functions as the convolutional neural network used by the risk predictor 10 is referred to as the convolutional neural network 10a.

FIG. 6 is a flowchart outlining the learning process of the convolutional neural network 10a in the present embodiment. FIGS. 7A to 7D are explanatory diagrams of the learning data prepared in step S1. FIG. 8 is a diagram showing an example of a dangerous area and its feature output as a result of the learning process.

First, learning data is prepared (S11). More specifically, the learning data consists of learning images that include a dangerous area and annotated images, which are the learning images with an annotation indicating the dangerous area attached.

The learning data is described here with reference to FIGS. 7A to 7D.

FIGS. 7A and 7B are examples of learning images; they are examples of images, among a plurality of images captured in the past by an in-vehicle camera, that led to danger. Image 52 in FIG. 7B shows a stopped bus 511, a person 512 who is a pedestrian, and a person 513, a pedestrian who has emerged from the area behind the bus 511. Image 51 in FIG. 7A shows the stopped bus 511 and the person 512; the person 513 is in the area behind the bus 511 but is not visible.

Image 51a in FIG. 7C is an example of an annotated image: an annotation is attached to an area of image 51, which was captured a predetermined time before the person appears. More specifically, image 51a carries an annotation indicating a dangerous area where, if the vehicle equipped with the in-vehicle camera that captured image 51 keeps traveling as it is, a moving object (the person 513) may appear in the vehicle's travel route and collide with the vehicle.

Note that the annotation indicating a dangerous area is not limited, as in image 51a of FIG. 7C, to area 514 behind the bus 511, which contains the person 513 who is not visible but is likely to run out, that is, area 514 immediately before the run-out occurs. For example, as shown in FIG. 7D, in addition to area 514, an annotation indicating a dangerous area may be attached to area 515 of the visible person 512, and an annotation indicating a safe area may be attached to the other areas (for example, area 516).
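The annotation scheme of FIG. 7D can be pictured as a per-pixel label mask. The following is a minimal sketch, assuming the annotations are axis-aligned boxes; the class name `BoxAnnotation`, the function `rasterize`, and all coordinates are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List

SAFE, DANGEROUS = 0, 1


@dataclass
class BoxAnnotation:
    # Axis-aligned box in pixel coordinates; right and bottom are exclusive.
    left: int
    top: int
    right: int
    bottom: int
    label: int = DANGEROUS


def rasterize(width: int, height: int, boxes: List[BoxAnnotation]) -> List[List[int]]:
    """Return a height x width mask; pixels outside every box are SAFE."""
    mask = [[SAFE] * width for _ in range(height)]
    for box in boxes:
        for y in range(max(0, box.top), min(height, box.bottom)):
            for x in range(max(0, box.left), min(width, box.right)):
                mask[y][x] = box.label
    return mask


# Hypothetical coordinates standing in for areas 514 and 515 of FIG. 7D.
region_514 = BoxAnnotation(left=4, top=1, right=6, bottom=4)
region_515 = BoxAnnotation(left=0, top=2, right=2, bottom=4)
mask = rasterize(8, 5, [region_514, region_515])
```

Every pixel not covered by a box keeps the SAFE label, which plays the role of the safe-area annotation given to area 516.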

The description now returns to FIG. 6.

Next, the computer performs a learning process using the learning data so that the convolutional neural network 10a functions as the risk predictor 10 (S12). More specifically, using the learning images that include a dangerous area and the annotated images in which an annotation indicating the dangerous area is attached to the learning images, the computer performs a learning process that makes the convolutional neural network 10a learn the weights with which it estimates the dangerous area in a learning image and the characteristics of that dangerous area.

For example, suppose that the convolutional neural network 10a is trained using image 51 in FIG. 7A as the learning image that includes a dangerous area and image 51b in FIG. 7D as the annotated image in which an annotation indicating the dangerous area is attached to the learning image. In this case, the convolutional neural network 10a learns weights for estimating a likelihood map indicating the dangerous areas and their danger levels, as shown in image 51c of FIG. 8. In image 51c of FIG. 8, the area behind the stopped bus 511 and the area of the person 512 are dangerous areas, and likelihood 517 for the area behind the bus 511 and likelihood 518 for the area of the person 512 are superimposed on the image.

In preparing the learning data, the work of attaching annotations indicating dangerous areas to the learning images is done manually, for example by asking crowdsourcing workers, but part or all of the work may be done by a computer. The following describes, as an Example, the case where a computer of a learning system attaches annotations indicating dangerous areas to the learning images and uses the learning data to perform the learning process on the convolutional neural network 10a.

(Example)
[Configuration of learning system]
FIG. 9 is a block diagram illustrating an example of the configuration of the learning system in the Example. The learning system shown in FIG. 9 includes a learning data creation device 20 and a learning device 30, and performs the learning process for realizing the risk predictor 10.

The learning data creation device 20 includes a storage unit 201, a storage unit 203, and an annotation assigning unit 202, and creates learning data from video data. The storage unit 201 consists of an HDD (hard disk drive), memory, or the like, and stores video data made up of a plurality of temporally consecutive images captured by an in-vehicle camera mounted on a vehicle. This video data is also used as the learning images. The storage unit 203 consists of an HDD (hard disk drive), memory, or the like, and stores annotated data in which annotations indicating dangerous areas are attached to the learning images (video data). The annotation assigning unit 202 attaches at least an annotation indicating a dangerous area to the learning images (video data) acquired from the storage unit 201 and stores the result in the storage unit 203.

The learning device 30 includes an error calculation unit 301 and a weight adjustment unit 302, and performs the learning process on the convolutional neural network 10a using the learning data acquired from the learning data creation device 20. The convolutional neural network 10a has the same configuration as the convolutional neural network used in the risk predictor 10 and, as a result of the learning process, functions as that network. The error calculation unit 301 uses an error function to calculate the error between the values (correct values) indicating the dangerous area in the learning image and the characteristics of that dangerous area that the convolutional neural network 10a should estimate, and the values that the convolutional neural network 10a actually outputs (values indicating the estimated degree of danger). The weight adjustment unit 302 adjusts the weights of the convolutional neural network 10a so that the error calculated by the error calculation unit 301 becomes smaller.
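The text does not name the error function used by the error calculation unit 301. As one possible illustration, the sketch below assumes a per-pixel mean squared error between the estimated likelihood map and the correct-value map; the function name and the sample values are hypothetical.

```python
from typing import List

Map = List[List[float]]


def mean_squared_error(estimated: Map, correct: Map) -> float:
    """Average squared per-pixel difference between two equally sized maps."""
    same_rows = len(estimated) == len(correct)
    if not same_rows or any(len(e) != len(c) for e, c in zip(estimated, correct)):
        raise ValueError("maps must have the same shape")
    total = 0.0
    count = 0
    for est_row, cor_row in zip(estimated, correct):
        for est, cor in zip(est_row, cor_row):
            total += (est - cor) ** 2
            count += 1
    return total / count


# Hypothetical 2 x 2 likelihood maps: 1.0 marks a dangerous pixel.
correct = [[0.0, 1.0], [0.0, 1.0]]
estimated = [[0.0, 0.5], [0.5, 1.0]]
error = mean_squared_error(estimated, correct)  # (0.25 + 0.25) / 4 = 0.125
```

Other error functions, such as a per-pixel cross-entropy, would fit the same role; the weight adjustment unit 302 only needs a value that decreases as the estimate approaches the correct values.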

[Operation of learning system]
Next, the operation of the learning system configured as described above is explained with reference to FIGS. 10 and 11. FIG. 10 is a flowchart showing the process by which the learning data creation device 20 creates annotated data. FIG. 11 is a flowchart showing the learning process performed by the learning device 30. FIG. 10 corresponds to an example of the detailed processing of S11 shown in FIG. 6, and FIG. 11 corresponds to an example of the detailed processing of S12 shown in FIG. 6.

As shown in FIG. 10, the learning data creation device 20 first acquires the video data stored in the storage unit 201 (S111). More specifically, the learning data creation device 20 acquires, as the video data, a plurality of temporally consecutive images captured by an in-vehicle camera mounted on a vehicle.

Next, the learning data creation device 20 determines the dangerous area included in the video data (S112). More specifically, the learning data creation device 20 determines a dangerous area that is included in at least some of the acquired images (video data) and in which, if the vehicle equipped with the in-vehicle camera that captured the images keeps traveling as it is, a moving object may appear in the vehicle's travel route and collide with the vehicle.

Next, the learning data creation device 20 attaches an annotation indicating the dangerous area to the video data (S113). More specifically, the learning data creation device 20 attaches an annotation indicating the determined dangerous area to the at least some of the images. It then stores the video data in which the annotation indicating the dangerous area has been attached to at least some of the images (annotated data) in the storage unit 203. In the present embodiment, the learning data creation device 20 attaches the annotation indicating a dangerous area to a partial area of an image a predetermined time before a person appears, but this is not a limitation. The annotation indicating a dangerous area may instead be attached to the area of a specific object in an image a predetermined time before a person appears, such as a ball, an automobile with its hazard lamps on, or a curved mirror in which a person is reflected. In either case, the annotation indicates a dangerous area where, if the vehicle equipped with the in-vehicle camera that captured the video data keeps traveling as it is, a moving object (the person 513) may appear in the vehicle's travel route and collide with the vehicle.
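Annotating "an image a predetermined time before a person appears" amounts to stepping back a fixed number of frames from the frame in which the person first becomes visible. The following is a minimal sketch under that reading; the function name, the frame rate, and the lead time are assumptions, since the patent gives no concrete values.

```python
from typing import Optional


def frame_to_annotate(appearance_frame: int,
                      frames_per_second: float,
                      lead_time_s: float) -> Optional[int]:
    """Index of the frame `lead_time_s` seconds before the moving object appears.

    Returns None when the video does not reach back that far.
    """
    offset = round(lead_time_s * frames_per_second)
    index = appearance_frame - offset
    return index if index >= 0 else None


# Hypothetical example: the person first appears in frame 90 of a 30 fps video,
# and the annotation is attached to the frame one second earlier.
target = frame_to_annotate(appearance_frame=90, frames_per_second=30.0, lead_time_s=1.0)
```

With these example values the annotated frame is frame 60; the dangerous area determined in S112 would then be drawn on that frame.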

Next, as shown in FIG. 11, the learning device 30 acquires the learning data from the learning data creation device 20 (S121). Specifically, the learning device 30 acquires, as the annotated images and the learning images respectively, the at least some of the images to which annotations were attached in S113 and the images among the plurality of images (video data) that correspond to them. For example, as described above, the learning device 30 acquires the learning image that includes a dangerous area shown in FIG. 7A and the annotated image shown in FIG. 7D from the learning data creation device 20.

Next, the learning device 30 uses the learning image to make the convolutional neural network 10a output values indicating the estimated degree of danger (S122). In the present embodiment, for the learning image that includes a dangerous area shown in FIG. 7A, for example, the learning device 30 makes the convolutional neural network 10a output, as the values indicating the estimated degree of danger, values indicating the dangerous area in the learning image and its characteristics, that is, the dangerous area and its danger level (a likelihood map).

Next, the learning device 30 calculates the difference (error) between the values output in S122 and the values (correct values) indicating the dangerous area in the learning image and the characteristics of that dangerous area that the convolutional neural network 10a should estimate (S123). Here, the correct values are the values of the likelihood map indicating the dangerous areas and their danger levels shown in image 51c of FIG. 8, calculated from image 51b of FIG. 7D, the annotated image in which annotations indicating the dangerous areas are attached to the learning image.

Next, if the difference between the values indicating the degree of danger output in S122 and the correct values is not at its minimum (No in S124), the learning device 30 updates the weights of the convolutional neural network 10a so that the difference becomes smaller (S125). The learning device 30 then performs iterative processing that repeats from S122.

If, on the other hand, the difference between the values indicating the degree of danger output in S122 and the correct values is at its minimum, the process ends. In other words, the learning device 30 makes the convolutional neural network 10a learn its weights by adjusting them so as to minimize the error calculated in S123. The learning device 30 then adopts the weights of the convolutional neural network 10a at the point where the error calculated in S123 is at its minimum as the weights of the convolutional neural network used in the risk predictor 10.

For convenience in describing FIGS. 10 and 11, the creation process and the learning process were described for a single image, but this is not a limitation. The creation process and the learning process may be performed in units of 10 to 20 images. In addition, although the weights of the convolutional neural network 10a at which the error is at its minimum were described as being adopted as the weights of the convolutional neural network used in the risk predictor 10, this is not a limitation either. If the weight values and the error stop changing even when the processing from S122 is repeated, the weights of the convolutional neural network 10a at that point may be adopted as the weights of the convolutional neural network used in the risk predictor 10. Furthermore, when an upper limit is set on the number of repetitions, the minimum error may mean the smallest error observed in the repetitions up to that limit.
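The loop of S122 to S125 and the two stopping rules mentioned above (the error no longer changes, or an upper limit on repetitions is reached) can be sketched on a toy model. The code below is only an illustration: it assumes a one-weight model y = weight * x trained by gradient descent on a mean squared error, none of which is specified in the patent, and all names and default values are hypothetical.

```python
from typing import List, Tuple


def train(inputs: List[float], targets: List[float], weight: float = 0.0,
          learning_rate: float = 0.1, max_iterations: int = 1000,
          tolerance: float = 1e-12) -> Tuple[float, float, int]:
    """Return (best_weight, best_error, iterations_run) for the toy model y = weight * x."""
    n = len(inputs)
    best_weight, best_error = weight, float("inf")
    previous_error = float("inf")
    iterations = 0
    for iterations in range(1, max_iterations + 1):
        outputs = [weight * x for x in inputs]                            # S122: output estimate
        error = sum((o - t) ** 2 for o, t in zip(outputs, targets)) / n   # S123: error
        if error < best_error:
            # Remember the weight with the smallest error seen so far.
            best_weight, best_error = weight, error
        if abs(previous_error - error) < tolerance:                       # S124: error stopped moving
            break
        previous_error = error
        gradient = sum(2 * (o - t) * x
                       for o, t, x in zip(outputs, targets, inputs)) / n
        weight -= learning_rate * gradient                                # S125: update weight
    return best_weight, best_error, iterations


# Hypothetical data generated by y = 2x; training should recover a weight near 2.
learned_weight, final_error, steps = train([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Returning the best weight seen, rather than the last one, corresponds to treating "minimum error" as the smallest error within the allowed number of repetitions.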

(Modification 1)
FIG. 12A is another example of the annotated image prepared in step S11. FIG. 12B is a diagram illustrating another example of the dangerous area and its characteristics output as a result of the learning process. Elements similar to those in FIGS. 7A to 8 are given the same reference numerals, and their detailed description is omitted.

In the learning process of the risk predictor 10 described above, image 51 in FIG. 7A and image 51b in FIG. 7D were examples of the learning image and the annotated image, and the correct values were the values indicating the dangerous areas and their danger levels in image 51c of FIG. 8, but this is not a limitation. Image 51 in FIG. 7A and image 51d in FIG. 12A may be used as the learning image and the annotated image, with the correct values being the values indicating the dangerous areas and their categories in image 51e of FIG. 12B.

In image 51d of FIG. 12A, area 514d behind the stopped bus 511 is a dangerous area where a person who is not visible may run out and collide with the vehicle, and danger category 1 is the annotation indicating the category of that dangerous area. Image 51d is further annotated with area 515 of the visible person 512 and its danger category 2. Area 515 of the person 512 is a dangerous area where a collision with the vehicle may occur, and danger category 2 is the category of that dangerous area. Image 51d also carries an annotation indicating that the remaining areas (for example, area 516d) are safe areas.

Image 51e in FIG. 12B shows that area 519 behind the stopped bus 511 is a dangerous area whose category is "dangerous area (vehicle)", and that area 520 of the person 512 is a dangerous area whose category is "dangerous area (person)".

(Modification 2)
The annotation attached to a learning image is not limited to information indicating the dangerous area (such as a frame), as described in the Example, or to the category of the moving object associated with the dangerous area, as described in Modification 1. It may be, for example, vehicle control information from the time the learning image was captured, such as the steering angle or brake strength of the vehicle, or it may be detailed area information, different from an object region, as shown in FIG. 13.

FIG. 13 is a diagram illustrating an example of detailed area information. The detailed area information shown in FIG. 13 is, for example, segment information 54 for FIG. 7A. Area 541 indicates that the area behind the bus 511 shown in FIG. 7A is a dangerous area, and area 542 indicates that the area of the person 512 shown in FIG. 7A is a dangerous area.

In this way, the annotation attached to a learning image may indicate the dangerous area in the input image together with control information, including the brake strength or steering angle of the vehicle, at the time the learning image was captured. The annotation attached to a learning image may also be segment information for the dangerous area in the input image.
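One way to hold such an annotation is a record that pairs the dangerous areas of a frame with the control information captured at the same time. The sketch below is illustrative only: the class name, field names, units, and value ranges are assumptions and are not given in the patent.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class FrameAnnotation:
    frame_index: int
    dangerous_regions: List[Box] = field(default_factory=list)
    # Control information at capture time; units and ranges are assumptions.
    steering_angle_deg: Optional[float] = None
    brake_strength: Optional[float] = None  # 0.0 (released) to 1.0 (full braking)

    def has_control_info(self) -> bool:
        """True when at least one control value was recorded for this frame."""
        return self.steering_angle_deg is not None or self.brake_strength is not None


# Hypothetical annotation for one frame of the video data.
annotation = FrameAnnotation(frame_index=12,
                             dangerous_regions=[(4, 1, 6, 4)],
                             steering_angle_deg=-3.5,
                             brake_strength=0.8)
record = asdict(annotation)  # plain dict, convenient for storing as annotated data
```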

(Modification 3)
In Embodiment 1, the Example, and Modifications 1 and 2 above, the dangerous area where danger is likely to arise for the traveling vehicle was described as an area that includes part of the area of an occluding object present in the learning image, in which a moving object is hidden before it emerges from behind the occluding object into the travel route, but this is not a limitation. The dangerous area may be the area between two or more moving objects, including a person, that would cross the vehicle's travel route if they approach each other.

Examples of learning images and annotated images used for learning such dangerous areas are given below.

FIGS. 14A, 15A, 16A, and 17A are examples of learning images. FIGS. 14B, 15B, 16B, and 17B are examples of annotated images to which annotations indicating dangerous areas are attached.

Learning image 56a in FIG. 14A shows a parked or stopped automobile 561 and a person 562. In annotated image 56b in FIG. 14B, an annotation indicating that area 563 between the parked or stopped automobile 561 and the person 562 is a dangerous area is attached to learning image 56a.

Learning image 57a in FIG. 15A shows a person 571, an object 572 indicating a bus stop, and a parked or stopped automobile 573. In annotated image 57b in FIG. 15B, an annotation indicating that area 574 between the person 571 and the object 572 indicating the bus stop is a dangerous area is attached to learning image 57a.

Learning image 58a in FIG. 16A shows a person 581 who is a child and an object 582, such as a ball, that the person 581 uses for play. In annotated image 58b in FIG. 16B, an annotation indicating that area 583 between the person 581 and the object 582 is a dangerous area is attached to learning image 58a.

Learning image 59a in FIG. 17A shows a person 591 who is a child and a person 592 such as the child's parent. In annotated image 59b in FIG. 17B, an annotation indicating that area 593 between the person 591 and the person 592 is a dangerous area is attached to learning image 59a.

As described here, two or more moving objects, including a person, will cross the vehicle's travel route if they approach each other, so the area between them can be regarded as a dangerous area where, if the vehicle keeps traveling as it is, a moving object may appear in the vehicle's travel route and collide with the vehicle. The risk predictor 10 may therefore predict the area between the two or more moving objects as a dangerous area.
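The patent does not say how the area between two objects is delimited when it is annotated. As a rough illustration, the sketch below assumes each object is given as an axis-aligned box and takes the horizontal gap between the two boxes as the dangerous area; the function name and coordinates are hypothetical.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def region_between(a: Box, b: Box) -> Optional[Box]:
    """Box covering the horizontal gap between two objects.

    Returns None when the two boxes touch or overlap horizontally,
    because there is then no gap to annotate.
    """
    first, second = (a, b) if a[0] <= b[0] else (b, a)
    gap_left, gap_right = first[2], second[0]
    if gap_right <= gap_left:
        return None
    top = min(first[1], second[1])
    bottom = max(first[3], second[3])
    return (gap_left, top, gap_right, bottom)


# Hypothetical boxes standing in for the child 581 and the ball 582 of FIG. 16A.
child = (10, 40, 30, 80)
ball = (60, 50, 70, 90)
gap = region_between(child, ball)
```

For these example boxes the gap is the box (30, 40, 60, 90), spanning the space between the child and the ball over their combined vertical extent.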

(Embodiment 2)
In Embodiment 1, the learning process was described as using only the convolutional neural network 10a, which has the same configuration as the convolutional neural network used in the risk predictor 10, but this is not a limitation. A first-stage learning process may first be performed using a convolutional neural network that has a fully connected layer, followed by a second-stage learning process using a convolutional neural network in which the fully connected layer is changed to a convolutional layer. This case is described below as Embodiment 2. The description focuses on the differences from Embodiment 1.

[Learning process]
The configuration and other aspects of the risk predictor 10 are as described in Embodiment 1, so their description is omitted. Embodiment 2 differs from Embodiment 1 in the learning process that the computer performs using the learning data.

FIG. 18 is a flowchart outlining the two-stage learning process in the present embodiment. FIG. 19 is an explanatory diagram conceptually illustrating the first-stage learning process. FIG. 20 is an explanatory diagram conceptually illustrating the second-stage learning process. Elements similar to those in FIG. 3, FIG. 7D, and so on are given the same reference numerals, and their detailed description is omitted. FIG. 18 corresponds to an example of the detailed processing of step S12 shown in FIG. 6. The first neural network 10b shown in FIG. 19 is the convolutional neural network with a fully connected layer that is used in the first-stage learning process. The second neural network 10c shown in FIG. 20 is the convolutional neural network used in the second-stage learning process, and corresponds to the first neural network 10b with its fully connected layer changed to a convolutional layer. The second neural network 10c has the same configuration as the convolutional neural network used in the risk predictor 10.

First, the computer uses the learning data to perform a first learning process in which the first neural network 10b learns weights (first weights) (S221). More specifically, the computer uses dangerous area images, which are partial areas of the learning images to which an annotation indicating a dangerous area is attached, and safe area images, which are areas to which no such annotation is attached, to make the first neural network 10b, a convolutional neural network with a fully connected layer, learn the first weights with which it determines whether a partial area is a safe area or a dangerous area.
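Preparing the inputs for the first learning process can be pictured as cropping labeled patches out of each learning image. The following is a minimal sketch, assuming the annotated and unannotated areas are available as axis-aligned boxes; the function names, the label encoding, and the toy image are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive

DANGEROUS_LABEL, SAFE_LABEL = 1, 0


def crop(image: Image, box: Box) -> Image:
    """Cut the pixels inside `box` out of `image`."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def make_patch_dataset(image: Image,
                       dangerous_boxes: List[Box],
                       safe_boxes: List[Box]) -> List[Tuple[Image, int]]:
    """Pair each cropped patch with its label for patch-level classification."""
    dataset = [(crop(image, box), DANGEROUS_LABEL) for box in dangerous_boxes]
    dataset += [(crop(image, box), SAFE_LABEL) for box in safe_boxes]
    return dataset


# Toy 4 x 6 "image" whose pixel values encode their own position (row * 10 + column).
image = [[row * 10 + col for col in range(6)] for row in range(4)]
patches = make_patch_dataset(image,
                             dangerous_boxes=[(4, 1, 6, 3)],
                             safe_boxes=[(0, 0, 2, 2)])
```

Each (patch, label) pair would then be fed to the first neural network 10b, which learns to classify the patch as a dangerous area or a safe area.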

In the example shown in FIG. 19, the computer inputs dangerous area image 61, which is a partial image of the annotated image 51b, to the first neural network 10b and makes the first neural network 10b learn the first weights so that it determines the image to be a dangerous area.

Next, the computer uses the learning data to perform a second learning process on the second neural network 10c (S222 to S224). More specifically, the computer first generates the second neural network by changing the fully connected layer of the first neural network to a convolutional layer (S222). The computer then reads the weights of the first neural network learned in S221 (the first weights) (S223) and sets the initial values of the weights of the second neural network generated in S222 to the first weights. Finally, using the learning images that include a dangerous area and the annotated images in which an annotation indicating the dangerous area is attached to the learning images, the computer makes the second neural network 10c learn the weights (second weights) with which it estimates the dangerous area in a learning image and the characteristics of that dangerous area (S224).
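Changing a fully connected layer to a convolutional layer and reusing its weights (S222 and S223) works because a fully connected layer applied to a k x k patch computes the same dot product as a k x k convolution kernel at one position. The sketch below illustrates this on a single-channel toy case in plain Python; it is a simplified illustration under that assumption, not the network of the patent, and all names and values are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def fully_connected(patch: Matrix, weights: List[float], bias: float) -> float:
    """Score for one k x k patch: dot product of the flattened patch with the weights."""
    flat = [value for row in patch for value in row]
    return sum(w * v for w, v in zip(weights, flat)) + bias


def to_kernel(weights: List[float], k: int) -> Matrix:
    """Reshape row-major fully connected weights into a k x k convolution kernel."""
    return [weights[r * k:(r + 1) * k] for r in range(k)]


def convolve(image: Matrix, kernel: Matrix, bias: float) -> Matrix:
    """Stride-1 sliding-window correlation without padding, as in CNN convolution layers."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(k) for j in range(k)) + bias
             for x in range(out_w)]
            for y in range(out_h)]


# Hypothetical "first weights" learned on 2 x 2 patches.
first_weights, first_bias = [1.0, 0.0, -1.0, 2.0], 0.5
image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0]]
# The converted layer scores every 2 x 2 window of the larger image at once.
score_map = convolve(image, to_kernel(first_weights, 2), first_bias)
```

Each entry of `score_map` equals the fully connected score of the corresponding patch, so a network trained on patches can be applied to a whole image to produce a map, which the second learning process then refines.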

In the example shown in FIG. 20, the computer inputs image 50, a learning image (input image), to the second neural network 10c and updates (learns) the second weights of the second neural network 10c so that it estimates, as the degree of danger, a likelihood map indicating the dangerous area and its danger level.

Performing this second learning process makes it possible to learn the weights of the convolutional neural network used in the risk predictor 10, which has the same configuration as the second neural network.

In the first learning process described above, dangerous area images to which an annotation indicating a dangerous area is attached and safe area images to which no such annotation is attached are used as learning images, and the first neural network 10b learns first weights for determining whether an input learning image is safe or dangerous, but this is not a limitation.

For example, as shown in FIG. 21, partial areas of learning images annotated with a dangerous area and its category may be used as the learning images in the first learning process. The first neural network 10b may then learn first weights for determining whether an input learning image is safe or dangerous and, when the image is determined to be dangerous, for determining the category indicated by that learning image.

Here, FIG. 21 is another explanatory diagram for conceptually explaining the first-stage learning process.

In the example shown in FIG. 21, the computer inputs to the first neural network 10b a dangerous area image 63 obtained by cutting out area 514d, which is a partial area of image 51d, an image annotated with a dangerous area and its category. The figure shows the computer causing the first neural network 10b to learn its first weights so that the network determines that the dangerous area image 63 is a dangerous area and further determines the category indicated by the dangerous area image 63 (in the figure, the danger category "car"). Because many variations of dangerous areas other than those described above are conceivable, the danger category may be divided in the first learning process into individual categories (detailed categories), for example one for each way an object appears, and learned accordingly.
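The preparation of such category-labelled partial images can be sketched as follows. The sketch is illustrative only: the `Annotation` record, the box format (top, left, height, width), the class list, and the helper names are assumptions, and the image is a small nested list rather than real camera data.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Annotation:
    box: Tuple[int, int, int, int]  # (top, left, height, width), assumed format
    category: str                   # e.g. "car" or "person"

# Hypothetical class list: one "safe" class plus one class per danger category.
CLASSES = ["safe", "dangerous:car", "dangerous:person"]

def crop(image, box):
    """Cut a partial area out of an image stored as a list of rows."""
    top, left, h, w = box
    return [row[left:left + w] for row in image[top:top + h]]

def make_samples(image, annotations, safe_boxes):
    """Build (partial image, class index) pairs for the first learning process."""
    samples = []
    for ann in annotations:
        label = CLASSES.index("dangerous:" + ann.category)
        samples.append((crop(image, ann.box), label))
    for box in safe_boxes:
        samples.append((crop(image, box), CLASSES.index("safe")))
    return samples

def decode(class_index):
    """Turn a predicted class index back into (safe/dangerous, category)."""
    name = CLASSES[class_index]
    if name == "safe":
        return ("safe", None)
    return ("dangerous", name.split(":", 1)[1])

# A 4 x 6 dummy image whose pixel values encode their own position.
image = [[10 * y + x for x in range(6)] for y in range(4)]
annotations = [Annotation(box=(0, 3, 2, 3), category="car")]
safe_boxes = [(2, 0, 2, 3)]

samples = make_samples(image, annotations, safe_boxes)
```

Encoding "safe" and each danger category as classes of a single classifier is one possible design; a separate safe/dangerous output and category output would serve equally well. Detailed categories would simply extend the class list.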

The risk level output by the risk predictor 10 is not limited to a likelihood map indicating the dangerous area and its degree of danger, as shown in FIG. 20. In other words, the second learning process is not limited to learning the second weights so that a likelihood map indicating the dangerous area and its degree of danger is output as the dangerous area and its feature. The second weights may instead be learned so that the dangerous area and its category are output.
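As a rough illustration of this alternative output, the sketch below assumes that the network produces one score map per class (index 0 being "safe") and derives a dangerous-area mask and a single category from them by per-pixel argmax and majority vote. The output format, the class names, and the voting rule are assumptions for illustration; the disclosure does not specify them.

```python
def area_and_category(score_maps, class_names, safe_index=0):
    """Derive a dangerous-area mask and its category from per-class score maps."""
    n_classes = len(score_maps)
    h, w = len(score_maps[0]), len(score_maps[0][0])
    # Per-pixel argmax over the class score maps.
    label_map = [[max(range(n_classes), key=lambda c: score_maps[c][y][x])
                  for x in range(w)]
                 for y in range(h)]
    # A pixel belongs to the dangerous area if its best class is not "safe".
    danger_mask = [[int(c != safe_index) for c in row] for row in label_map]
    # The category is the most frequent non-safe class (None if no danger).
    votes = {}
    for row in label_map:
        for c in row:
            if c != safe_index:
                votes[c] = votes.get(c, 0) + 1
    category = class_names[max(votes, key=votes.get)] if votes else None
    return danger_mask, category

class_names = ["safe", "car", "person"]  # hypothetical categories
score_maps = [
    [[0.90, 0.20, 0.10], [0.80, 0.70, 0.30]],  # safe
    [[0.05, 0.70, 0.80], [0.10, 0.20, 0.10]],  # car
    [[0.05, 0.10, 0.10], [0.10, 0.10, 0.60]],  # person
]
danger_mask, category = area_and_category(score_maps, class_names)
```

A practical system would more likely report a category per connected area rather than one category per image; the single vote here only keeps the sketch short.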

[Possibility of other embodiments]
The risk prediction method of the present disclosure has been described above in Embodiments 1 to 3, but the entity or device that performs each process is not particularly limited. The processes may be performed by a processor or the like (described below) embedded in a specific locally placed device, or by a cloud server or the like located somewhere other than the local device.

The learning images and the input images used for risk prediction may be images captured by an in-vehicle camera (entire images) or partial images of an entire image. A partial image may be an image of an area estimated to be dangerous, as described above. An entire image may be an image captured when a dangerous situation occurs or before it occurs.

The present disclosure is not limited to the above embodiments. For example, another embodiment realized by arbitrarily combining the components described in this specification, or by excluding some of the components, may be regarded as an embodiment of the present disclosure. The present disclosure also includes modifications obtained by applying to the above embodiments various changes conceivable by those skilled in the art without departing from the gist of the present disclosure, that is, the meaning of the wording recited in the claims.

The present disclosure further includes the following cases.

(1) Specifically, the above device is a computer system including a microprocessor, a ROM, a RAM, a hard disk unit, a display unit, a keyboard, a mouse, and the like. A computer program is stored in the RAM or the hard disk unit. Each device achieves its functions by the microprocessor operating according to the computer program. Here, the computer program is composed of a combination of multiple instruction codes, each indicating a command to the computer, in order to achieve predetermined functions.

(2) Some or all of the components of the above device may be configured as a single system LSI (Large Scale Integration). A system LSI is a super-multifunctional LSI manufactured by integrating multiple components on a single chip; specifically, it is a computer system including a microprocessor, a ROM, a RAM, and the like. A computer program is stored in the RAM. The system LSI achieves its functions by the microprocessor operating according to the computer program.

(3) Some or all of the components of the above device may be configured as an IC card or a stand-alone module that can be attached to and detached from each device. The IC card or the module is a computer system including a microprocessor, a ROM, a RAM, and the like. The IC card or the module may include the super-multifunctional LSI described above. The IC card or the module achieves its functions by the microprocessor operating according to a computer program. The IC card or the module may be tamper resistant.

(4) The present disclosure may be the methods described above. It may also be a computer program that implements these methods on a computer, or a digital signal consisting of the computer program.

(5) The present disclosure may also be the computer program or the digital signal recorded on a computer-readable recording medium, such as a flexible disk, hard disk, CD-ROM, MO, DVD, DVD-ROM, DVD-RAM, BD (Blu-ray (registered trademark) Disc), or semiconductor memory. It may also be the digital signal recorded on any of these recording media.

The present disclosure may also transmit the computer program or the digital signal via a telecommunication line, a wireless or wired communication line, a network typified by the Internet, data broadcasting, or the like.

The present disclosure may also be a computer system including a microprocessor and a memory, in which the memory stores the computer program and the microprocessor operates according to the computer program.

The present disclosure may also be carried out by another independent computer system, either by recording the program or the digital signal on the recording medium and transferring it, or by transferring the program or the digital signal via the network or the like.

The present disclosure can be used in a risk prediction method for predicting a dangerous area where danger is likely to arise, particularly in in-vehicle cameras and systems mounted on vehicles for automated driving, and in devices and systems for driving assistance.

DESCRIPTION OF SYMBOLS
10 Risk predictor
10a Convolutional neural network
10b First neural network
10c Second neural network
20 Learning data creation device
30 Learning device
50, 50a, 50b, 51, 51a, 51b, 51c, 51d, 51e, 52 Image
54 Segment information
56b, 57b, 58b, 59b Annotated image
61, 63 Dangerous area image
101, 102 Convolutional layer
201, 203 Storage unit
202 Annotation unit
301 Error calculation unit
302 Weight adjustment unit
501, 511 Bus
502, 503, 512, 513, 562, 571, 581, 591, 592 Person
504, 505, 506, 517, 518 Likelihood
507, 508, 509, 514, 514d, 515, 515d, 516, 516d, 519, 520, 541, 542, 563, 574, 583, 593 Area
561, 573 Automobile
572, 582 Object

Claims

A risk prediction method performed by a computer of a risk predictor using a convolutional neural network, the method comprising:
an acquisition step of causing the convolutional neural network to acquire an input image captured by an in-vehicle camera mounted on a vehicle; and
an output step of causing the convolutional neural network to estimate a dangerous area in the input image acquired in the acquisition step, the dangerous area being an area in which a moving object may appear in the travel route of the vehicle and collide with the vehicle if the vehicle continues traveling as it is, together with a feature of the dangerous area, and to output the estimated dangerous area and feature as a risk level predicted for the input image.

The risk prediction method according to claim 1, wherein, in the output step, the convolutional neural network is caused to estimate a degree of danger of the dangerous area as the feature of the dangerous area, and a likelihood map indicating the estimated dangerous area and the degree of danger of the dangerous area is output as the risk level.

The risk prediction method according to claim 1, wherein, in the output step, the convolutional neural network is caused to estimate a type of the moving object associated with the dangerous area as the feature of the dangerous area, and the estimated dangerous area and the type are output as the risk level.

The risk prediction method according to any one of claims 1 to 3, further comprising a learning step of, before the acquisition step is performed, causing the convolutional neural network to learn weights of the convolutional neural network for estimating the dangerous area in a learning image and the feature of the dangerous area, using the learning image, which includes the dangerous area, and an annotated image in which an annotation indicating the dangerous area is attached to the learning image.

The risk prediction method according to claim 4, wherein the learning step includes:
a first learning step of causing a first neural network, which is a convolutional neural network having a fully connected layer, to learn first weights of the first neural network for determining whether a partial area is a safe area or a dangerous area, using dangerous area images and safe area images that are each a partial area of the learning images, each dangerous area image being an area to which the annotation indicating the dangerous area is attached and each safe area image being an area to which the annotation is not attached; and
a second learning step of updating initial values of the weights of a second neural network, which has a configuration in which the fully connected layer of the first neural network is changed to a convolutional layer, to the first weights learned in the first learning step, and causing the second neural network to learn second weights of the second neural network for estimating the dangerous area in the learning image and the feature of the dangerous area, using the learning image including the dangerous area and the annotated image in which the annotation indicating the dangerous area is attached to the learning image, thereby learning the weights of the convolutional neural network having the same configuration as the second neural network.

The risk prediction method according to claim 4 or 5, further comprising an annotating step of acquiring a plurality of temporally consecutive images captured by an in-vehicle camera mounted on a vehicle, determining a dangerous area that is included in at least some of the acquired images and in which a moving object may appear in the travel route of the vehicle and collide with the vehicle if the vehicle continues traveling as it is, and attaching an annotation indicating the determined dangerous area to the at least some of the images,
wherein, in the learning step, the at least some of the images to which the annotation was attached in the annotating step and the images corresponding to the at least some of the images among the plurality of images are acquired as the annotated images and the learning images, and the weights of the convolutional neural network are learned.

The risk prediction method according to any one of claims 4 to 6, wherein the dangerous area is an area that includes part of the area of a shielding object present in the learning image and in which the moving object is hidden behind the shielding object before the moving object appears in the travel route from behind the shielding object.

The risk prediction method according to any one of claims 4 to 6, wherein the dangerous area is an area between two or more moving objects, including a person, in a case where the moving objects would cross the travel route of the vehicle when they approach each other.

The risk prediction method according to any one of claims 4 to 6, wherein the annotation indicates the dangerous area and a category of the moving object associated with the dangerous area.

The risk prediction method according to any one of claims 4 to 6, wherein the annotation indicates the dangerous area and control information including a brake strength or a steering wheel angle of the vehicle at the time the learning image was captured.

The risk prediction method according to any one of claims 4 to 6, wherein the annotation is segment information of the dangerous area.