DE102017124600A1

DE102017124600A1 - Semantic segmentation of an object in an image

Info

Publication number: DE102017124600A1
Application number: DE102017124600.2A
Authority: DE
Inventors: Stephen Foy; Rosalia Barros; Ian Clancy
Original assignee: Connaught Electronics Ltd
Current assignee: Connaught Electronics Ltd
Priority date: 2017-10-20
Filing date: 2017-10-20
Publication date: 2019-04-25
Also published as: WO2019076867A1

Abstract

The present invention relates to a method for the semantic segmentation of an object (5, 8) in an image, comprising the following method steps:

- successively capturing frames (6, 7, 11),
- inputting a first frame (6) of the successively captured frames (6, 7, 11) in real time into a convolutional neural network,
- examining by the convolutional neural network whether an object (5, 8) can be recognized in the first frame (6),
- semantically classifying the recognized objects (5, 8) by the convolutional neural network by assigning each recognized object (5, 8) to one of a list of predefined object classes,
- providing a lookup table with a priority list that has a respective priority level for each of the predefined object classes,
- determining a respective priority level of the recognized objects (5, 8) by comparison with the lookup table,
- determining one or more objects (5) that have a predefined priority level,
- determining a high priority area (9) of the frame (6) that relates to the or an object (5) with the predefined priority level,
- inputting a next frame (7) of the successively captured frames (6, 7, 11) in real time into the convolutional neural network,
- analyzing only the high priority area (9) in the next frame (7) by the convolutional neural network.

In this way, an efficient CNN architecture design is provided which is applied to a vehicle camera (3) with a large field of view and which makes use of the large field of view.

Description

The present invention relates to a method for the semantic segmentation of an object in an image, comprising the following method steps:

- successively capturing frames,
- inputting a first frame of the successively captured frames in real time into a convolutional neural network, and
- examining by the convolutional neural network whether an object for the semantic segmentation can be recognized in the first frame.

One of the most fundamental problems in computer vision for motor vehicles is the semantic segmentation of objects in an image. The segmentation approach addresses the problem of associating each pixel with its corresponding object class. Recently, supported by an increase in computing power in computer architectures and by the availability of large annotated data sets, there has been a wave of research and design in the field of convolutional neural networks (CNNs).

CNNs are highly successful at classification and categorization tasks, but much of the research is carried out on standard photometric RGB images and does not focus on embedded devices in motor vehicles. Hardware devices for motor vehicles may consume only little power and therefore have low computing power.

In machine learning, a convolutional neural network is a class of deep, feed-forward artificial neural networks that have been successfully applied to the analysis of visual images. CNNs use a variant of multilayer perceptrons designed to require minimal preprocessing. Convolutional networks were inspired by biological processes in which the connectivity pattern between neurons is modeled on the organization of the animal visual cortex. Individual cortical neurons respond to stimuli only in a restricted region of the visual field known as the receptive field. The receptive fields of different neurons partially overlap so that they cover the entire visual field.

Compared with other image classification algorithms, CNNs use relatively little preprocessing. This means that the network learns the filters that were engineered by hand in traditional algorithms. This independence from prior knowledge and human effort in feature design is a major advantage. CNNs have applications in image and video recognition, recommender systems and computational linguistics.

In this regard, US 2017/0200063 A1 teaches applying a set of sections spanning a down-sampled version of an image of a road scene to a low-fidelity classifier in order to determine a set of candidate sections for depicting one or more objects in a set of classes. The set of candidate sections of the down-sampled version can be mapped to a set of potential sectors in a high-fidelity version of the image. A high-fidelity classifier can be used to vet the set of potential sectors, determining the presence of one or more objects from the set of classes. The low-fidelity classifier can include a first convolutional network trained on a first training set of down-sampled versions of cropped images of objects in the set of classes. Likewise, the high-fidelity classifier can include a second CNN trained on a second training set of high-fidelity versions of cropped images of objects in the set of classes.

From US 2017/0099200 A1 it is known that data is received which characterizes a request for agent computation of sensor data. The request includes a required confidence and a required latency for completion of the agent computation. Agents to be queried are determined on the basis of the required confidence. Data is transmitted in order to query the determined agents to provide the analysis of the sensor data.

US 9,704,054 B1 describes that image classification and related imaging tasks performed with machine learning tools can be accelerated by using such tools to associate an image with a group of labels or categories and then to select one of the labels or categories of the group as the one associated with the image. The groups of labels or categories can comprise labels that are confused with one another, e.g. two or more labels or categories that have been identified as being associated with a single image. By defining groups of labels or categories and configuring a machine learning tool to associate an image with one of the groups, processes for identifying labels or categories associated with images can be accelerated, because computations associated with labels and categories not included in the group can be omitted.

It is an object of the present invention to provide an efficient CNN architecture design that is applied to a vehicle camera with a large field of view and that takes advantage of the large field of view.

This object is achieved by the subject matter of the independent claims. Preferred embodiments are described in the dependent claims.

The invention therefore provides a method for the semantic segmentation of an object in an image, comprising the following method steps:

- successively capturing frames,
- inputting a first frame of the successively captured frames in real time into a convolutional neural network,
- examining by the convolutional neural network whether an object can be recognized in the first frame,
- semantically classifying the recognized objects by the convolutional neural network by assigning each recognized object to one of a list of predefined object classes,
- providing a lookup table with a priority list that has a respective priority level for each of the predefined object classes,
- determining a respective priority level of the recognized objects by comparison with the lookup table,
- determining one or more objects that have a predefined priority level,
- determining a new high priority area of the frame that relates to the or an object with the predefined priority level,
- inputting a next frame of the successively captured frames in real time into the convolutional neural network,
- analyzing only the high priority area in the next frame by the convolutional neural network.
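
The step of determining the high priority area from the classified objects can be illustrated with a minimal sketch. It is not taken from the patent: the names `Detection` and `high_priority_area`, the bounding-box representation, the optional margin, the example priority values and the choice to enclose all objects of the predefined level in one rectangle are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    label: str  # object class assigned by the network
    box: Box    # bounding box of the segmented object


def high_priority_area(
    detections: List[Detection],
    priority_table: Dict[str, int],
    predefined_level: int,
    frame_size: Tuple[int, int],
    margin: int = 0,
) -> Optional[Box]:
    """Return one box enclosing all objects of the predefined priority level."""
    selected = [
        d.box for d in detections
        if priority_table.get(d.label) == predefined_level
    ]
    if not selected:
        return None  # assumption: the caller then analyzes the full frame
    width, height = frame_size
    # Enclose the selected boxes, widen by the margin, clamp to the frame.
    x0 = max(0, min(b[0] for b in selected) - margin)
    y0 = max(0, min(b[1] for b in selected) - margin)
    x1 = min(width, max(b[2] for b in selected) + margin)
    y1 = min(height, max(b[3] for b in selected) + margin)
    return (x0, y0, x1, y1)


# Illustrative values only; 1 denotes the highest priority level here.
table = {"person": 1, "car": 2}
first_frame = [
    Detection("person", (100, 200, 160, 380)),
    Detection("car", (400, 220, 700, 400)),
]
area = high_priority_area(first_frame, table, 1, (1280, 800), margin=20)
print(area)
```

In this sketch only the person contributes to the area, so the next frame would be analyzed in a region far smaller than the full 1280 x 800 image.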

The basic idea of the invention is thus that, instead of regularly processing entire images for the semantic segmentation of objects in the image, only a section of the image is processed at higher resolution. In particular, instead of always analyzing the complete image, a high priority area of the image is determined in a first frame on the basis of the priority levels of the objects recognized in the image. In a next frame, only the high priority area of the image is then processed, which makes the method much more efficient. Preferably, the priority levels of the different object classes are defined on the basis of a safety ranking; for example, objects belonging to the object class "person" could be more important than objects belonging to the object class "roadside".

Preferably, at the beginning of this method, the high priority area would be defined by the object or objects with the highest priority level, i.e. the predefined priority level would be the highest priority level. Once these objects have been reliably classified, areas of the image with objects that have lower priority levels can be processed.

The step of analyzing only the high priority area in the next frame by the convolutional neural network can be carried out with different methods, as set out below. According to a preferred embodiment of the invention, analyzing only the high priority area in the next frame by the convolutional neural network is carried out by

- examining by the convolutional neural network whether any object can be recognized in the high priority area,
- semantically classifying the recognized objects by the convolutional neural network by assigning each recognized object to one of the list of predefined object classes,
- determining a respective priority of the recognized objects by comparison with the lookup table,
- determining the one or more objects with the predefined priority level,
- determining a new high priority area of the frame that relates to the or an object with the predefined priority level,
- inputting a next frame of the successively captured frames in real time into the convolutional neural network, and
- analyzing only the new high priority area in the next frame by the convolutional neural network.

Preferably, the step of analyzing only the high priority area in the next frame by the convolutional neural network by

- examining by the convolutional neural network whether any object can be recognized in the high priority area,
- semantically classifying the recognized objects by the convolutional neural network by assigning each recognized object to one of the list of predefined object classes,
- determining a respective priority of the recognized objects by comparison with the lookup table,
- determining the one or more objects with the predefined priority level,
- determining a new high priority area of the frame that relates to the or an object with the predefined priority level,
- inputting a next frame of the successively captured frames in real time into the convolutional neural network, and
- analyzing only the new high priority area in the next frame by the convolutional neural network

is repeated at least once.
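
The repeated analysis can be pictured as a loop in which each frame is examined only inside the area derived from the previous frame, and each analysis yields the area for the following frame. The sketch below is an assumption-laden illustration, not the patented implementation: `analyze` is a trivial stand-in for the convolutional neural network that treats non-zero pixels as a recognized high priority object, and falling back to the full frame when nothing is found is a hypothetical design choice.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)
Frame = List[List[int]]          # stand-in for image data


def make_frame(x: int, y: int, size: int = 6) -> Frame:
    """Build a dummy frame with a single 'object' pixel at (x, y)."""
    frame = [[0] * size for _ in range(size)]
    frame[y][x] = 1
    return frame


def analyze(frame: Frame, area: Optional[Box]) -> Optional[Box]:
    """Stand-in for the network: search only inside `area` and return
    a new area around whatever was found (None if nothing was found)."""
    h, w = len(frame), len(frame[0])
    x0, y0, x1, y1 = area if area is not None else (0, 0, w, h)
    hits = [(x, y) for y in range(y0, y1) for x in range(x0, x1) if frame[y][x]]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    # New area: the hits plus a one-pixel margin, clamped to the frame.
    return (max(0, min(xs) - 1), max(0, min(ys) - 1),
            min(w, max(xs) + 2), min(h, max(ys) + 2))


def track(frames: List[Frame]) -> List[Optional[Box]]:
    """Analyze each frame only inside the area from the previous frame."""
    area: Optional[Box] = None  # None: analyze the complete frame
    history = []
    for frame in frames:
        area = analyze(frame, area)
        history.append(area)
    return history


# The object moves one pixel to the right per processed frame.
history = track([make_frame(1, 1), make_frame(2, 1), make_frame(3, 1)])
print(history)
```

Only the first frame is searched completely; the later frames are searched in a 3 x 3 window that follows the object.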

In this way, a high priority area with objects to be classified can be defined in a multi-step process. According to a further preferred embodiment of the invention, such a classification can also be carried out directly after the first definition of the high priority area. Therefore, according to a preferred embodiment of the invention, analyzing only the high priority area in the next frame by the convolutional neural network is carried out by semantically classifying the object by assigning the object to one of the list of predefined object classes. In this respect, the following step is preferably carried out:

- accepting the object class to which the object was assigned when analyzing only the high priority area in the next frame as a trusted object class.

Once such a trusted classification of objects with a certain priority level has been achieved, areas with objects of the next lower priority level are preferably processed.
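
The progression from one priority level to the next lower one can be sketched as simple bookkeeping. The function name `next_level_to_process`, the set of trusted levels and the convention that a smaller number means a higher priority are assumptions for illustration; the text does not prescribe a data structure.

```python
from typing import Dict, Optional, Set


def next_level_to_process(
    priority_table: Dict[str, int], trusted_levels: Set[int]
) -> Optional[int]:
    """Return the highest priority level (smallest number) not yet trusted."""
    pending = sorted(set(priority_table.values()) - trusted_levels)
    return pending[0] if pending else None


table = {"person": 1, "car": 2, "wall": 3}  # illustrative values
trusted_levels: Set[int] = set()

level = next_level_to_process(table, trusted_levels)  # start at the top

# The class assigned while analyzing only the high priority area is
# accepted as trusted, so its priority level counts as done.
area_label = "person"
trusted_levels.add(table[area_label])

level_after = next_level_to_process(table, trusted_levels)
print(level, level_after)
```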

In general, inputting a next frame of the successively captured frames in real time into the convolutional neural network can be carried out by inputting the complete frame. According to a preferred embodiment of the invention, however, inputting a next frame of the successively captured frames in real time into the convolutional neural network is carried out by inputting only the high priority area of the next frame into the convolutional neural network.
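
Inputting only the high priority area amounts to cropping the frame before it reaches the network. A minimal sketch, assuming a frame stored as a list of pixel rows and a hypothetical rectangular area (x_min, y_min, x_max, y_max):

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def crop_to_area(frame: List[List[int]], area: Box) -> List[List[int]]:
    """Return only the pixels of `frame` that lie inside `area`."""
    x0, y0, x1, y1 = area
    return [row[x0:x1] for row in frame[y0:y1]]


# Dummy 8 x 6 image whose pixel value encodes its position (10 * y + x).
frame = [[10 * y + x for x in range(8)] for y in range(6)]
patch = crop_to_area(frame, (2, 1, 5, 4))
print(patch)
```

Here the network would receive 9 instead of 48 pixels, which is where the saving in computation comes from.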

Furthermore, according to a preferred embodiment of the invention, the step of successively capturing frames is carried out by a camera with a field of view of more than 150°, which yields respective frames covering an angle of view of more than 150°. More preferably, the camera has a field of view of more than 180°, which yields respective frames covering an angle of view of more than 180°. In this way, a large field of view can be monitored, while the sheer number of pixels in the images captured by such a camera does not noticeably slow down the processing speed, since the complete images do not have to be processed for all frames.

The invention also relates to the use of a method as described above in a motor vehicle.

Furthermore, the invention relates to a sensor arrangement for a motor vehicle which is configured to carry out a method as described above.

The invention also relates to a non-transitory computer-readable medium having instructions stored thereon which, when executed on a processor, cause a sensor arrangement of a motor vehicle to carry out a method as described above.

In the following, these and other aspects of the invention will become apparent from and be explained in more detail with reference to the embodiments described below. The features shown can constitute an aspect of the invention both individually and in combination. Features of different exemplary embodiments are transferable from one exemplary embodiment to another.

In the drawings:

Fig. 1 shows a schematic view of a vehicle with a sensor arrangement according to a preferred embodiment of the invention,
Figs. 2a, b show schematic views of the processing of frames according to a preferred embodiment of the invention, and
Figs. 3a to d show schematic views of a further aspect of the processing of frames according to a preferred embodiment of the invention.

Fig. 1 shows a schematic view of a motor vehicle 1 with a sensor arrangement 2 composed of a camera 3 and an evaluation unit 4. The sensor arrangement 2 is designed for the semantic segmentation of images of objects 5 captured by the camera 3. The evaluation unit 4 can be part of an advanced driver assistance system that supports the driver of the motor vehicle 1 in the driving process. The camera 3 is a wide field-of-view camera and can have a viewing angle greater than 180°.

The method carried out by the sensor arrangement 2 according to the preferred embodiment of the invention is as follows:

The camera 3 captures frames successively. The frame capture rate can be as high as 30 frames per second. For effective processing of the frames, however, a processing rate of 5 frames per second has proven sufficient. To process the frames, a first frame 6 of the successively captured frames is input in real time into a convolutional neural network. The convolutional neural network resides in the evaluation unit 4, to which the frames of the camera 3 are transmitted.
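
With the rates mentioned, every sixth captured frame would be processed. The text does not state how the processed frames are selected, so the uniform subsampling in the following sketch and the function name `frames_to_process` are assumptions.

```python
from typing import List


def frames_to_process(capture_fps: int, processing_fps: int,
                      n_frames: int) -> List[int]:
    """Indices of captured frames handed to the network, assuming
    uniform subsampling and an integer ratio between the two rates."""
    if processing_fps <= 0 or capture_fps % processing_fps != 0:
        raise ValueError("capture rate must be a multiple of processing rate")
    step = capture_fps // processing_fps
    return list(range(0, n_frames, step))


# One second of video at 30 frames/s, processed at 5 frames/s.
selected = frames_to_process(30, 5, 30)
print(selected)
```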

The convolutional neural network examines whether an object 5 that is not part of the ground area on which the motor vehicle 1 is driving can be detected in the first frame 6. If such objects 5 can be detected in the first frame 6, they are semantically classified by the convolutional neural network by assigning each detected object to one of a list of predefined object classes.

According to the preferred embodiment described here, these object classes can be "person", "car", "wall", "tree", and so on. Such semantic classification of objects by a convolutional neural network is well known to the person skilled in the art and requires no further explanation here.

Unlike conventional methods, however, the preferred embodiment of the invention provides a lookup table with a priority list that holds a respective priority level for each of the predefined object classes. In the present case, this priority list is as follows:
Person: 1st priority
Car: 2nd priority
Wall: 3rd priority
Tree: 4th priority

This priority list may contain further object classes, each associated with a respective priority. For each object detected in the first frame 6, a respective priority level is determined by comparison with the lookup table.
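The lookup of priority levels can be sketched as a simple dictionary. The four classes and their priorities are taken from the priority list in the description; the names `PRIORITY_LEVEL`, `priority_level` and `objects_with_level` are hypothetical, and treating unknown classes as an error is an assumption made for this illustration.

```python
from typing import Dict, List

# Priority list from the description: a lower number means a higher priority.
PRIORITY_LEVEL: Dict[str, int] = {
    "person": 1,
    "car": 2,
    "wall": 3,
    "tree": 4,
}


def priority_level(object_class: str) -> int:
    """Return the priority level of an object class.

    Assumption: a class missing from the table raises KeyError.
    """
    return PRIORITY_LEVEL[object_class]


def objects_with_level(detected_classes: List[str], level: int) -> List[str]:
    """Keep only the detected objects whose class has the predefined level."""
    return [c for c in detected_classes if priority_level(c) == level]


# Example: a person and a wall were detected in the first frame.
detected = ["person", "wall"]
levels = [priority_level(c) for c in detected]
selected = objects_with_level(detected, level=1)
```

In this example only the object of class "person" has the predefined priority level 1 and is kept for the next step.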

FIG. 2a shows such a frame 6. In this frame 6, two persons are detected as an object 5, and a wall is detected as a further object 8. Since the object class "person" has a higher priority than the object class "wall", a high priority area 9 is determined that relates to the object 5 belonging to the object class "person".

Then a next frame 7 of the successively captured frames is input in real time into the convolutional neural network, and only the high priority area 9 in the next frame 7 is analyzed by the network. This is shown in FIG. 2b, in which the frame 7 processed by the convolutional neural network for the semantic segmentation of the objects 5 is restricted to the high priority area determined in the frame 6 in the preceding method step. In this way, the objects 5 can be processed at a much higher resolution, which makes the semantic segmentation of the objects 5, i.e. the assignment of the objects 5 to one of the list of predefined object classes, easier and therefore more reliable.
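How a high priority area might be derived and applied to the next frame can be sketched as follows. The description does not specify the shape of the area; the sketch assumes an axis-aligned rectangle that encloses the bounding boxes of the selected objects, with an optional margin. The names `Box`, `high_priority_region` and `crop` are hypothetical, and a frame is represented as a plain list of pixel rows.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def high_priority_region(
    boxes: Sequence[Box], width: int, height: int, margin: int = 0
) -> Box:
    """Smallest rectangle enclosing all boxes, clipped to the frame size."""
    if not boxes:
        raise ValueError("at least one object box is required")
    return Box(
        left=max(0, min(b.left for b in boxes) - margin),
        top=max(0, min(b.top for b in boxes) - margin),
        right=min(width, max(b.right for b in boxes) + margin),
        bottom=min(height, max(b.bottom for b in boxes) + margin),
    )


def crop(frame: Sequence[Sequence[int]], region: Box) -> List[List[int]]:
    """Return only the pixels of the frame that lie inside the region."""
    return [
        list(row[region.left:region.right])
        for row in frame[region.top:region.bottom]
    ]


# A 4 x 6 test frame whose pixel value encodes row and column (10 * row + col).
next_frame = [[10 * r + c for c in range(6)] for r in range(4)]
# Hypothetical boxes of the two persons detected in the previous frame.
region = high_priority_region([Box(1, 1, 3, 2), Box(2, 1, 4, 3)], width=6, height=4)
patch = crop(next_frame, region)
```

Only `patch` would then be passed to the network. If the network input has a fixed size, the objects in the patch occupy more of that input than they would in the full frame, which is one plausible reading of the higher resolution mentioned above.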

According to a preferred embodiment of the invention, a high priority area containing objects to be classified can also be defined in a multi-step process, as described below with reference to FIGS. 3a to 3d.

FIG. 3a shows that a high priority area 9 is defined which comprises two objects 5, 8 belonging to different object classes, i.e. "person" and "wall". Instead of concentrating directly on the object 5, which is the object with the higher priority, a high priority area 9 comprising both objects 5, 8 is defined; these objects can then be analyzed at a higher resolution in the next frame 7, shown in FIG. 3b.

This higher-resolution analysis makes it possible to distinguish clearly between the two objects 5, 8 and to define a new high priority area 10 that relates only to the object 5 belonging to the object class with the highest priority, i.e. "person", as shown in FIG. 3c.

Then, in a further frame 11, shown in FIG. 3d, only this new high priority area 10 is examined, i.e. semantic segmentation is performed only for the object 5 in order to verify that the object 5 detected here actually belongs to the object class "person".
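The multi-step process of FIGS. 3a to 3d can be sketched as a loop that narrows the high priority area from frame to frame. This is a simplified illustration, not the implementation of the embodiment: the detector is replaced by a hypothetical stub `fake_detect` with canned results, all coordinates are invented, and stopping when nothing is detected is an assumption. The coarse first step is modeled by the stub returning the same loose box for both objects.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Region = Tuple[int, int, int, int]   # (left, top, right, bottom)
Detection = Tuple[str, Region]       # (object class, bounding box)

PRIORITY_LEVEL: Dict[str, int] = {"person": 1, "car": 2, "wall": 3, "tree": 4}


def union(boxes: Sequence[Region]) -> Region:
    """Smallest rectangle that encloses all given boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def refine_region(
    frames: Sequence[str],
    detect: Callable[[str, Region], List[Detection]],
    full_frame: Region,
) -> List[Tuple[str, Region]]:
    """Analyze each frame only inside the current area, then narrow the area."""
    region = full_frame
    history: List[Tuple[str, Region]] = []
    for frame in frames:
        detections = detect(frame, region)
        if not detections:
            break  # assumption: stop when nothing is detected in the area
        best = min(PRIORITY_LEVEL[cls] for cls, _ in detections)
        top = [(cls, box) for cls, box in detections if PRIORITY_LEVEL[cls] == best]
        region = union([box for _, box in top])
        history.append((top[0][0], region))
    return history


# Hypothetical canned detector output for the frames 6, 7 and 11.
CANNED: Dict[str, List[Detection]] = {
    # Coarse step: person and wall cannot be separated, both get one loose box.
    "frame_6": [("person", (100, 80, 400, 300)), ("wall", (100, 80, 400, 300))],
    # Higher resolution: the two objects are now clearly distinguished.
    "frame_7": [("person", (150, 100, 250, 280)), ("wall", (260, 80, 400, 300))],
    # Verification step: only the person is found in the new area.
    "frame_11": [("person", (150, 100, 250, 280))],
}


def fake_detect(frame: str, region: Region) -> List[Detection]:
    """Stub standing in for the convolutional neural network."""
    return CANNED[frame]


history = refine_region(["frame_6", "frame_7", "frame_11"], fake_detect, (0, 0, 640, 480))
```

In this sketch the area after the first frame covers both objects, the area after the second frame covers only the person, and the third frame confirms the class "person" for that area.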

List of reference numerals

1: motor vehicle
2: sensor arrangement
3: camera
4: evaluation unit
5: persons
6: first frame
7: next frame
8: wall
9: high priority area
10: new high priority area
11: further next image area

Citations included in the description

This list of the documents cited by the applicant was generated automatically and is included solely for the reader's information. The list is not part of the German patent or utility model application. The DPMA assumes no liability for any errors or omissions.

Cited patent literature

US 2017/0200063 A1 [0006]
US 2017/0099200 A1 [0007]
US 9704054 B1 [0008]

Claims

Method for the semantic segmentation of an object (5, 8) in an image, comprising the following method steps:
- successively capturing frames (6, 7, 11),
- inputting a first frame (6) of the successively captured frames (6, 7, 11) in real time into a convolutional neural network,
- examining by the convolutional neural network whether an object (5, 8) can be detected in the first frame (6),
- semantically classifying the detected objects (5, 8) by the convolutional neural network by assigning each detected object (5, 8) to one of a list of predefined object classes,
- providing a lookup table with a priority list having a respective priority level for each of the predefined object classes,
- determining a respective priority level of the detected objects (5, 8) by comparison with the lookup table,
- determining one or more objects (5) having a predefined priority level,
- determining a high priority area (9) of the frame (6) relating to the or an object (5) having the predefined priority level,
- inputting a next frame (7) of the successively captured frames (6, 7, 11) in real time into the convolutional neural network,
- analyzing only the high priority area (9) in the next frame (7) by the convolutional neural network.

Method according to claim 1, wherein analyzing only the high priority area (9) in the next frame (7) by the convolutional neural network is performed by
- examining by the convolutional neural network whether any object (5, 8) can be detected in the high priority area (9),
- semantically classifying the detected objects (5, 8) by the convolutional neural network by assigning each detected object (5, 8) to one of the list of predefined object classes,
- determining a respective priority level of the detected objects (5, 8) by comparison with the lookup table,
- determining the one or more objects (5) having the predefined priority level,
- determining a new high priority area (10) of the frame relating to the or an object (5) having the predefined priority level,
- inputting a next frame (11) of the successively captured frames (6, 7, 11) in real time into the convolutional neural network, and
- analyzing only the new high priority area (10) in the next frame (11) by the convolutional neural network.

Method according to claim 2, wherein the step of analyzing only the high priority area in a further next frame by the convolutional neural network is repeated at least once by
- examining by the convolutional neural network whether any object (5, 8) can be detected in the high priority area,
- semantically classifying the detected objects (5, 8) by the convolutional neural network by assigning each detected object to one of the list of predefined object classes,
- determining a respective priority level of the detected objects (5, 8) by comparison with the lookup table,
- determining the one or more objects (5, 8) having the predefined priority level,
- determining a new high priority area of the frame relating to the or an object (5) having the predefined priority level,
- inputting a next frame of the successively captured frames in real time into the convolutional neural network, and
- analyzing only the new high priority area in the next frame by the convolutional neural network.

Method according to one of claims 1 to 3, wherein analyzing only the high priority area (9, 10) in the next frame (7, 11) by the convolutional neural network is performed by semantically classifying the object (5, 8) by assigning the object (5, 8) to one of the list of object classes.

Method according to claim 4, comprising the following method step:
- accepting, as a trusted object class, the object class to which the object (5, 8) was assigned when analyzing only the high priority area in the next frame (7, 11).

Method according to one of the preceding claims, wherein inputting a next frame (7, 11) of the successively captured frames (6, 7, 11) in real time into the convolutional neural network is performed by inputting only the high priority area (9, 10) of the next frame (7, 11) into the convolutional neural network.

Method according to one of the preceding claims, wherein the step of successively capturing frames (6, 7, 11) is performed by a camera (3) with a field of view of more than 150°, which produces respective frames (6, 7, 11) covering a viewing angle of more than 150°.

Use of the method according to one of the preceding claims in a motor vehicle (1).

Sensor arrangement (2) for a motor vehicle (1), which is configured to carry out a method according to one of claims 1 to 8.

Non-transitory computer-readable medium having instructions stored thereon that, when executed on a processor, cause a sensor arrangement (2) of a motor vehicle (1) to perform the method according to one of claims 1 to 8.