CN111258408A - An object boundary determination method and device for human-computer interaction - Google Patents
- Publication number
- CN111258408A CN202010369965.3A CN202010369965A
- Authority
- CN
- China
- Prior art keywords
- computing board
- scene image
- boundary
- user
- content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/002—Specific input/output arrangements not covered by G06F3/01 - G06F3/16
- G06F3/005—Input arrangements through a video camera
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
- G06V10/235—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition based on user input or interaction
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Description
Technical field
The present invention relates to the technical field of human-computer interaction, and in particular to an object boundary determination method and device for human-computer interaction.
Background
With the development of information technology, many reality technologies continue to advance. Augmented reality in particular has been widely used in entertainment, engineering and other fields, allowing people in the real world to deal with virtual objects at close range. The technologies involved include multimedia, three-dimensional modeling, real-time tracking, intelligent interaction, and sensing.
Human-computer interaction is the study of the interaction between systems and users. A system can be any of a variety of machines, or a computerized system or software. The human-computer interface usually refers to the part that is visible to the user. The user communicates with the system and operates it through this interface.
In the prior art, when a user operation interface is projected by a projector, its position and size are generally adjusted by hand, which is time-consuming and laborious. The prior art also has difficulty projecting a user operation interface onto a non-blank surface, for example onto a textbook. Furthermore, the prior art cannot automatically recognize the content displayed in the operation interface to perform a preliminary delimitation, and existing delimitation modes cannot adaptively choose the delimitation method according to the projected content, which makes human-computer interaction inefficient. Nor can the prior art adaptively adjust the distance between the projector and the projection surface according to the size of the delimited object, which blurs the projection and degrades the user experience.
Summary of the invention
In view of the above defects in the prior art, the present invention proposes the following technical solutions.
An object boundary determination method for human-computer interaction, the method comprising:
a scene information acquisition step, in which a wide-angle camera captures scene images in real time and sends each captured frame to a computing board;
a determination step, in which the computing board judges, based on each acquired frame of the scene image, whether a boundary can be determined for the current scene, and if so, determines the boundary range of the object.
Further, the object is a user operation interface or displayed content.
Further, when the object is a user operation interface, the determination step comprises:
the computing board receives each frame of the scene image transmitted in real time by the wide-angle camera, and processes the object distribution of the current scene with a mobilenet-ssd detection network to determine the shape and the category of each object;
the computing board calculates, based on the scene image and the determined object shapes, the position of each object in the space of the scene image, and merges each object's position and category to generate an object data set;
the computing board reads the position information from the object data set and, based on it, subtracts all object distribution information from the scene image to obtain blank area information; the computing board then judges, according to the user settings, whether a boundary can be determined, and if so, computes the boundary range from the position information of the delimitable area and stores the boundary range;
the computing board transmits the stored boundary range to the projection unit together with a delimitation-success signal; on receiving the signal, the projection unit obtains the stored setting information of the current user from the computing board;
the computing board determines whether to project onto a blank area; if so, it determines the projection area from the boundary range and the setting information and projects the user's operation interface in that area; if not, the user selects the object to project onto, and the computing board reads that object's position information from the object data set and projects the user's operation interface based on that position.
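The blank-area computation in the steps above can be illustrated with a minimal sketch. It is not the patented implementation: it assumes the detector output has already been reduced to axis-aligned pixel boxes, and it picks the largest empty rectangle as the candidate boundary range. The names `free_mask`, `largest_free_rect`, `determine_boundary`, `min_w` and `min_h` are hypothetical, and the minimum size stands in for the "user settings" the text mentions.

```python
from typing import List, Optional, Tuple

# Assumed box format: (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def free_mask(width: int, height: int, boxes: List[Box]) -> List[List[bool]]:
    """Return a height x width grid that is True where no detected object lies."""
    mask = [[True] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = False
    return mask


def largest_free_rect(mask: List[List[bool]]) -> Box:
    """Largest all-True axis-aligned rectangle, found with the row-histogram method."""
    if not mask:
        return (0, 0, 0, 0)
    width = len(mask[0])
    heights = [0] * width  # run length of free cells ending at the current row
    best, best_area = (0, 0, 0, 0), 0
    for y, row in enumerate(mask):
        for x in range(width):
            heights[x] = heights[x] + 1 if row[x] else 0
        stack: List[int] = []  # column indices with non-decreasing heights
        for x in range(width + 1):
            h = heights[x] if x < width else 0  # sentinel flushes the stack
            while stack and heights[stack[-1]] >= h:
                top = stack.pop()
                left = stack[-1] + 1 if stack else 0
                area = heights[top] * (x - left)
                if area > best_area:
                    best_area = area
                    best = (left, y - heights[top] + 1, x, y + 1)
            stack.append(x)
    return best


def determine_boundary(width: int, height: int, boxes: List[Box],
                       min_w: int, min_h: int) -> Optional[Box]:
    """Return a boundary range if the blank area meets the assumed minimum size."""
    x0, y0, x1, y1 = largest_free_rect(free_mask(width, height, boxes))
    if x1 - x0 >= min_w and y1 - y0 >= min_h:
        return (x0, y0, x1, y1)
    return None  # the scene is treated as not delimitable
```

One limitation of this sketch is that it only tests the largest-area rectangle; a smaller rectangle with a better aspect ratio could satisfy the size requirement when the largest one does not.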
Further, when the object is a user operation interface, the method further comprises:
a first update step, in which the computing board processes, at a first time interval, the scene image transmitted by the wide-angle camera and compares it with the earlier scene image for which the boundary was determined; if the two are inconsistent, the boundary is determined again.
Further, when the object is a user operation interface, the update step comprises:
the computing board acquires one frame of the scene image from the wide-angle camera every second, and uses the mobilenet-ssd detection network to obtain the distribution of all objects in that frame;
the computing board obtains the current setting information of the projection unit and compares it with the object distribution; if the comparison error is greater than a first threshold, the boundary cannot be determined, so the projection unit is switched to a warning state indicating that delimitation is not possible, and the computing board is switched to a state in which it judges in real time whether delimitation is possible; if the comparison error is less than the first threshold, a new boundary range is determined and compared with the previously stored boundary range; if the difference is less than a second threshold, no update is performed; otherwise, the new boundary range is stored and transmitted to the projection unit, which adjusts the projection area accordingly.
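The two-threshold update logic described above can be sketched as a small decision function. The text does not define how either comparison is computed, so this sketch makes assumptions: `scene_error` is taken as an already-computed number, and the change between the new and stored boundary ranges is measured as one minus their intersection-over-union. The names `Action`, `iou` and `decide_update` are hypothetical, and an error exactly equal to the first threshold is treated here as delimitable, which the text leaves open.

```python
from enum import Enum
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x0, y0, x1, y1)


class Action(Enum):
    WARN_NOT_DELIMITABLE = "warn"   # projection unit enters the warning state
    KEEP_BOUNDARY = "keep"          # change too small, no update
    UPDATE_BOUNDARY = "update"      # store and transmit the new boundary range


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(0.0, iw) * max(0.0, ih)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def decide_update(scene_error: float, new_boundary: Box, stored_boundary: Box,
                  first_threshold: float, second_threshold: float) -> Action:
    """Apply the first threshold to the scene error, then the second to the boundary change."""
    if scene_error > first_threshold:
        return Action.WARN_NOT_DELIMITABLE
    change = 1.0 - iou(new_boundary, stored_boundary)  # assumed change metric
    if change < second_threshold:
        return Action.KEEP_BOUNDARY
    return Action.UPDATE_BOUNDARY
```

Using a second threshold in this way avoids re-adjusting the projection area for small frame-to-frame differences in the detected boundary.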
Further, when the object is displayed content, the content being content displayed on the user operation interface, the determination step comprises:
the wide-angle camera captures scene images in real time and transmits them to the computing board at a second time interval, and the computing board forwards the scene images to a cloud server; the cloud server uses a deep learning network to predict the positions of text in the content, and crops the picture regions containing the text to obtain first sub-pictures, which it stores;
the cloud server recognizes the text content of the first sub-pictures using a CTC algorithm, and generates a content data set from the recognized text and its positions;
the server transmits the content data set to the computing board, and the computing board transmits the position information in the content data set to the projection unit;
the projection unit projects the content data set obtained by the computing board, and the user selects the content to be delimited.
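The text names a "ctc algorithm" for recognizing the text in the first sub-pictures without further detail. Assuming this refers to connectionist temporal classification, a common choice for the output stage of text recognizers, the decoding side can be sketched with greedy (best-path) decoding: take the highest-scoring class at each time step, collapse consecutive repeats, and drop the blank symbol. The function name `ctc_greedy_decode`, the convention that class 0 is the blank, and the score layout are all assumptions made for this illustration.

```python
from typing import List, Sequence


def ctc_greedy_decode(scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding.

    scores: one row per time step, one score per class.
            Class 0 is assumed to be the CTC blank; class i (i >= 1)
            is assumed to map to alphabet[i - 1].
    """
    out: List[str] = []
    prev = -1  # no class seen yet
    for row in scores:
        # Best class at this time step (first maximum wins on ties).
        idx = max(range(len(row)), key=row.__getitem__)
        # Emit only when the class changes and is not the blank.
        if idx != prev and idx != 0:
            out.append(alphabet[idx - 1])
        prev = idx
    return "".join(out)
```

A blank between two identical classes is what lets the decoder output a doubled character, which is why the blank resets `prev` rather than being skipped entirely.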
Further, the projection unit projecting the content data set obtained by the computing board and the user selecting the content to be delimited comprises:
the projection unit listens to the computing board in real time and, on receiving the recognized text and positions sent by the computing board, displays the content in a light color in the projection area;
the user selects from the displayed recognized content; after a selection, the boundary at the position of the corresponding content is highlighted to indicate that this content has been selected.
Further, the highlighting is done by displaying an added outer frame.
Further, when the object is displayed content, the method further comprises:
a second update step, in which the computing board performs in-depth recognition of the content selected by the user, obtains the specific information of the content, and sends it to the projection unit for projection display.
Further, the update step comprises:
when the user selects an already recognized region, the computing board obtains the user's selected region and records its position;
the computing board crops the region selected by the user into a second sub-picture based on the position of the selected region, and uses an intelligent recognition API to analyze the text or picture information in the second sub-picture;
the computing board combines the specific information of the analyzed text or picture with the position information to obtain the detailed information of the selected region, extracts the valid part of the detailed information, and normalizes it to obtain normalized data, which it transmits to the projection unit;
after receiving the normalized data from the computing board, the projection unit updates the corresponding display in the user operation area of the projection area.
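The cropping and normalization in the second update step can be sketched as follows. The text does not specify the image representation, the output of the "intelligent recognition API", or the normalized format, so everything here is an assumption for illustration: the image is a nested list of pixel values, the recognition result is a plain dictionary of strings, and "valid" is interpreted as "non-empty after trimming". The names `crop_region` and `build_record` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x0, y0, x1, y1), x1 and y1 exclusive


def crop_region(image: List[List[int]], box: Box) -> List[List[int]]:
    """Crop the selected region, clamping the box to the image bounds."""
    height = len(image)
    width = len(image[0]) if height else 0
    x0, y0, x1, y1 = box
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    return [row[x0:x1] for row in image[y0:y1]]


def build_record(recognized: Dict[str, str], box: Box) -> Dict[str, object]:
    """Combine recognition output with the region position and normalize it.

    Normalization here means: lower-case keys, collapse runs of whitespace,
    and drop fields that are empty once trimmed.
    """
    fields = {
        key.strip().lower(): " ".join(value.split())
        for key, value in recognized.items()
        if value and value.strip()
    }
    return {"position": box, "fields": fields}
```

Clamping the box before slicing matters because a user selection near the edge of the projection area can extend past the captured frame.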
The present invention also provides an object boundary determination device for human-computer interaction, the device comprising: a projection unit, a wide-angle camera and a computing board;
the wide-angle camera captures scene images in real time and sends each captured frame to the computing board; after receiving each frame of the scene image, the computing board judges, based on it, whether a boundary can be determined for the current scene, and if so, determines the boundary range of the object.
Further, the object is a user operation interface or displayed content.
Further, when the object is a user operation interface, the computing board judging, after receiving each frame of the scene image, whether a boundary can be determined for the current scene and, if so, determining the boundary range of the object comprises:
the computing board receives each frame of the scene image transmitted in real time by the wide-angle camera, and processes the object distribution of the current scene with a mobilenet-ssd detection network to determine the shape and the category of each object;
the computing board calculates, based on the scene image and the determined object shapes, the position of each object in the space of the scene image, and merges each object's position and category to generate an object data set;
the computing board reads the position information from the object data set and, based on it, subtracts all object distribution information from the scene image to obtain blank area information; the computing board then judges, according to the user settings, whether a boundary can be determined, and if so, computes the boundary range from the position information of the delimitable area and stores the boundary range;
the computing board transmits the stored boundary range to the projection unit together with a delimitation-success signal; on receiving the signal, the projection unit obtains the stored setting information of the current user from the computing board;
the computing board determines whether to project onto a blank area; if so, it determines the projection area from the boundary range and the setting information and projects the user's operation interface in that area; if not, the user selects the object to project onto, and the computing board reads that object's position information from the object data set and projects the user's operation interface based on that position.
Further, when the object is a user operation interface, the computing board processes, at a first time interval, the scene image transmitted by the wide-angle camera and compares it with the earlier scene image for which the boundary was determined; if the two are inconsistent, the boundary is determined again.
Further, when the object is a user operation interface, the computing board processing the scene image at the first time interval, comparing it with the earlier scene image for which the boundary was determined, and determining the boundary again if the two are inconsistent comprises:
the computing board acquires one frame of the scene image from the wide-angle camera every second, and uses the mobilenet-ssd detection network to obtain the distribution of all objects in that frame;
the computing board obtains the current setting information of the projection unit and compares it with the object distribution; if the comparison error is greater than a first threshold, the boundary cannot be determined, so the projection unit is switched to a warning state indicating that delimitation is not possible, and the computing board is switched to a state in which it judges in real time whether delimitation is possible; if the comparison error is less than the first threshold, a new boundary range is determined and compared with the previously stored boundary range; if the difference is less than a second threshold, no update is performed; otherwise, the new boundary range is stored and transmitted to the projection unit, which adjusts the projection area accordingly.
Further, when the object is displayed content, the content being content displayed on the user operation interface, the computing board judging, after receiving each frame of the scene image, whether a boundary can be determined for the current scene and, if so, determining the boundary range of the object comprises:
the wide-angle camera captures scene images in real time and transmits them to the computing board at a second time interval, and the computing board forwards the scene images to a cloud server; the cloud server uses a deep learning network to predict the positions of text in the content, and crops the picture regions containing the text to obtain first sub-pictures, which it stores;
the cloud server recognizes the text content of the first sub-pictures using a CTC algorithm, and generates a content data set from the recognized text and its positions;
the server transmits the content data set to the computing board, and the computing board transmits the position information in the content data set to the projection unit;
the projection unit projects the content data set obtained by the computing board, and the user selects the content to be delimited.
Further, the projection unit projecting the content data set obtained by the computing board and the user selecting the content to be delimited comprises:
the projection unit listens to the computing board in real time and, on receiving the recognized text and positions sent by the computing board, displays the content in a light color in the projection area;
the user selects from the displayed recognized content; after a selection, the boundary at the position of the corresponding content is highlighted to indicate that this content has been selected.
Further, the highlighting is done by displaying an added outer frame.
Further, when the object is displayed content, the computing board performs in-depth recognition of the content selected by the user, obtains the specific information of the content, and sends it to the projection unit for projection display.
Further, the computing board performing in-depth recognition of the content selected by the user, obtaining the specific information of the content, and sending it to the projection unit for projection display comprises:
when the user selects an already recognized region, the computing board obtains the user's selected region and records its position;
the computing board crops the region selected by the user into a second sub-picture based on the position of the selected region, and uses an intelligent recognition API to analyze the text or picture information in the second sub-picture;
the computing board combines the specific information of the analyzed text or picture with the position information to obtain the detailed information of the selected region, extracts the valid part of the detailed information, and normalizes it to obtain normalized data, which it transmits to the projection unit;
after receiving the normalized data from the computing board, the projection unit updates the corresponding display in the user operation area of the projection area.
The technical effect of the present invention is as follows. The invention provides an object boundary determination method for human-computer interaction, the method comprising: a scene information acquisition step, in which a wide-angle camera captures scene images in real time and sends each captured frame to a computing board; and a determination step, in which the computing board judges, based on each acquired frame of the scene image, whether a boundary can be determined for the current scene, and if so, determines the boundary range of the object. The main advantages of the invention are these. After evaluating the scene image, the invention obtains the position of the blank area by removing the objects from the scene image, which makes delimitation accurate and in turn makes the projected user interface clear. The invention also supports free switching between delimiting the user operation interface and delimiting the displayed content, which makes it possible to add further operations during projection delimitation, for example extracting specific text from the content or retrieving deeper information about a picture, and this information can be shown directly through the projection. For content delimitation, the projection automatically indicates the recognized regions, with text and border cues. Combined with the real-time tracking of the projection, this gives both the projected area and the panel display a good visual effect after delimitation. Because delimitation is updated in real time, the user operation interface can be projected onto moving objects, and the distance between the projector and the interface can be adjusted automatically based on the size of the object, which greatly improves the user experience.
Brief description of the drawings
Other features, objects and advantages of the present application will become more apparent from the detailed description of non-limiting embodiments given with reference to the following drawings.
FIG. 1 is a flowchart of an object boundary determination method for human-computer interaction according to one embodiment of the present invention.
FIG. 2 is a schematic diagram of an object boundary determination device for human-computer interaction according to one embodiment of the present invention.
Detailed description
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the invention.
It should be noted that the embodiments of the present application and the features of those embodiments may be combined with one another where no conflict arises. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an object boundary determination method for human-computer interaction according to the present invention, the method comprising:
a scene information acquisition step S101, in which a wide-angle camera captures scene images in real time and sends each captured frame to a computing board;
a determination step S102, in which the computing board judges, based on each acquired frame of the scene image, whether a boundary can be determined for the current scene, and if so, determines the boundary range of the object.
The method of the present invention can be applied to a smart desk lamp. The smart desk lamp has a projection unit (that is, a projector), a wide-angle camera, a depth camera, an infrared camera, and so on. It contains a computing board, which has at least a processor and a memory and performs the data processing; the lamp naturally also has a power supply, a power controller, and the like. With the method of the invention, the boundary of an operation interface that the projection unit projects onto the desktop can be determined. When the user operates on the operation interface, the boundary of the displayed content can also be determined. As to which kind of object is delimited, the computing board judges according to the content currently projected by the projection unit, decides from the result whether to delimit the user operation interface or the displayed content, and then performs the corresponding delimitation. For example, at initialization the projected user operation interface is delimited; once the user operation interface exists and the user operates on it, the content displayed on the operation interface is delimited. In other words, the invention supports free switching between delimiting the user operation interface and delimiting the displayed content.
In one embodiment, when the computing board judges from the content currently projected by the projection unit that the user operation interface is to be delimited, that is, when the object is the user operation interface, the determining step S102 includes:
The computing board receives each frame of scene image transmitted in real time by the wide-angle camera and processes the object distribution of the current scene with a mobilenet-ssd detection network to determine the shape and the category of each object;
Based on the scene image and the determined object shapes, the computing board calculates the position of each object in the space of the scene image, and merges each object's position with its category to generate an object data set;
The computing board reads the position information from the object data set and, based on that position information, subtracts all object distribution information from the scene image to obtain blank-area information. The computing board then judges, according to the user settings, whether boundary determination is possible; if so, it calculates the boundary range from the position information of the delimitable area and stores the boundary range;
The computing board transmits the stored boundary range to the projection unit together with a delimitation-success signal; on receiving the delimitation-success signal, the projection unit obtains the stored setting information of the current user from the computing board;
The computing board determines whether to project onto the blank area. If so, it determines the projection area from the boundary range and the setting information and projects the user's operation interface in that projection area; if not, the user selects the object to project onto, the computing board reads that object's position information from the object data set, and the user's operation interface is projected based on the object's position information.
There are two main delimitation modes. The first is projection onto a blank area: if the area is large enough for the projection range set by the user, delimitation is possible; this is the more traditional mode. The second is projection based on a recognized object, that is, projection onto a specific book or sheet of paper: if a projectable object such as a book or paper is present, delimitation is possible. Through the specific operations of the determining step above, the present invention obtains the position of the blank area by removing the objects from the scene image and only then delimits, which improves delimitation accuracy and makes the projected user interface very clear. This is an important inventive point of the present invention.
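By way of non-limiting illustration, the blank-area mode can be sketched as follows. The sketch assumes that the detected objects are available as axis-aligned boxes on a coarse grid over the scene image, and it picks the largest empty rectangle before checking it against a user-set minimum size. The grid representation, the function names, and the largest-rectangle search are all assumptions made for this example; the description does not specify how the blank area is computed.

```python
from typing import List, Optional, Tuple

# (x0, y0, x1, y1) in grid cells, half-open: x0 <= x < x1, y0 <= y < y1
Box = Tuple[int, int, int, int]


def free_mask(width: int, height: int, boxes: List[Box]) -> List[List[bool]]:
    """Mark every grid cell that is not covered by a detected object as free."""
    mask = [[True] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = False
    return mask


def largest_free_rect(mask: List[List[bool]]) -> Optional[Box]:
    """Return the largest all-free rectangle (row-by-row histogram method)."""
    if not mask or not mask[0]:
        return None
    width = len(mask[0])
    heights = [0] * width  # run of free cells ending at the current row
    best_area, best = 0, None
    for y, row in enumerate(mask):
        for x in range(width):
            heights[x] = heights[x] + 1 if row[x] else 0
        stack: List[int] = []  # column indices with increasing heights
        for x in range(width + 1):
            h = heights[x] if x < width else 0  # sentinel flushes the stack
            while stack and heights[stack[-1]] >= h:
                top = stack.pop()
                left = stack[-1] + 1 if stack else 0
                area = heights[top] * (x - left)
                if area > best_area:
                    best_area = area
                    best = (left, y - heights[top] + 1, x, y + 1)
            stack.append(x)
    return best


def delimit_blank_area(width: int, height: int, boxes: List[Box],
                       min_w: int, min_h: int) -> Optional[Box]:
    """Return a boundary range if the blank area meets the user-set size.

    Simplification: only the largest-area rectangle is tested, so a smaller
    rectangle that would also satisfy the size is not considered.
    """
    rect = largest_free_rect(free_mask(width, height, boxes))
    if rect is None:
        return None
    x0, y0, x1, y1 = rect
    return rect if (x1 - x0 >= min_w and y1 - y0 >= min_h) else None
```

A return value of `None` corresponds to the not-delimitable case; any returned box would play the role of the stored boundary range.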
In one embodiment, when the object is the user operation interface, the method further includes:
A first update step S103: the computing board processes the scene image transmitted by the wide-angle camera at a first time interval and compares it with the previous scene image whose boundary has been determined; if the comparison result is inconsistent, the boundary is determined again. In this application, every 'if the comparison result is inconsistent' means that the object distribution in the scene differs considerably between two successive detections, which indicates that the position of an object has changed substantially. When the object is the user operation interface, the first update step S103 includes:
Every second, the computing board again obtains one frame of scene image from the wide-angle camera and uses the mobilenet-ssd detection network to obtain the distribution state of all objects in that frame;
The computing board obtains the current setting information of the projection unit and compares it with the distribution state of the objects. If the comparison error is greater than a first threshold, boundary determination is not possible: the projection unit is updated to a not-delimitable warning state, and the computing board is switched to a state in which it judges in real time whether delimitation is possible. If the comparison error is less than the first threshold, a new boundary range is determined and compared with the previously stored boundary range; if the difference is less than a second threshold, no update is made; otherwise the new boundary range is stored and transmitted to the projection unit, which adjusts the projection area accordingly.
Through the above update operation, the present invention keeps the delimitation updated in real time, so that the user operation interface can be projected onto a moving object: the projection follows the movement of the object, which makes operation easier for the user. This tracking effect, achieved by real-time refresh, further improves the delimitation capability and ensures automatic tracking of the projection area as long as the user does not move the device over a large range, greatly improving the user experience. This is another important inventive point of the present invention.
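The two-threshold decision in the first update step can be sketched as below, purely as an illustration. The description does not say how the comparison error or the difference between two boundary ranges is measured, so `layout_error` is taken as an already computed number and `boundary_shift` (largest per-coordinate change) is a hypothetical metric; the `Action` names are likewise invented for this example.

```python
from enum import Enum
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


class Action(Enum):
    WARN_NOT_DELIMITABLE = "warn"   # projection unit enters the warning state
    KEEP_BOUNDARY = "keep"          # change too small, keep stored boundary
    UPDATE_BOUNDARY = "update"      # store and transmit the new boundary


def boundary_shift(stored: Box, candidate: Box) -> float:
    """Hypothetical metric: largest change of any boundary coordinate."""
    return max(abs(s - c) for s, c in zip(stored, candidate))


def decide_update(layout_error: float, stored: Box, candidate: Box,
                  first_threshold: float, second_threshold: float) -> Action:
    """Apply the first and second thresholds in the order the text gives them.

    The text does not cover equality with the first threshold; this sketch
    treats an error equal to it as still delimitable.
    """
    if layout_error > first_threshold:
        return Action.WARN_NOT_DELIMITABLE
    if boundary_shift(stored, candidate) < second_threshold:
        return Action.KEEP_BOUNDARY
    return Action.UPDATE_BOUNDARY
```

Called once per sampled frame (every second in the embodiment), a function of this kind would let small jitters be ignored while larger movements of the target object move the projection area.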
In one embodiment, when the computing board judges from the content currently projected by the projection unit that the displayed content is to be delimited, that is, when the object is displayed content, the content being content displayed on the user operation interface, the determining step S102 includes:
The wide-angle camera captures scene images in real time and transmits them to the computing board at a second time interval, and the computing board transmits the scene images to a cloud server; the cloud server uses a deep learning network to predict the positions of text in the content, and at the same time crops the picture regions containing the text to obtain first sub-pictures, which it stores;
The cloud server recognizes the text content of the first sub-pictures using the CTC algorithm, and after recognition generates a content data set from the text and the corresponding positions;
The cloud server transmits the content data set to the computing board, and the computing board transmits the position information in the content data set to the projection unit;
The projection unit projects the content data set obtained by the computing board, and the user selects the content to be delimited.
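The description names only "the CTC algorithm" for text recognition. As a hedged illustration of what the decoding side of such a recognizer commonly looks like, the sketch below performs greedy (best-path) decoding: take the highest-scoring class in each frame, collapse consecutive repeats, and drop the blank class. The class layout (index 0 as blank, a small Latin alphabet) and the function name are assumptions for the example and are not taken from the patent.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      alphabet: str) -> str:
    """Greedy CTC decoding of per-frame class scores.

    frame_scores[t][k] is the score of class k at frame t; class 0 is the
    blank and class k >= 1 maps to alphabet[k - 1].
    """
    # Best class per frame.
    best_path: List[int] = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    chars: List[str] = []
    previous = None
    for index in best_path:
        # Emit a character only when the class changes and is not blank.
        if index != previous and index != BLANK:
            chars.append(alphabet[index - 1])
        previous = index
    return "".join(chars)
```

The blank class is what allows a doubled character to survive the collapse: the frame sequence a, a, blank, a decodes to "aa", whereas a, a, a decodes to "a".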
In one embodiment, the step in which the projection unit projects the content data set obtained by the computing board and the user selects the content to be delimited includes:
The projection unit listens to the computing board in real time and, on receiving the recognized text and positions sent by the computing board, displays the content in a light color in the projection area;
The user selects from the displayed recognized content. After selection, the boundary at the position of the corresponding content is highlighted to indicate that this content has been selected. The highlighting consists of adding an outer frame.
Through the content delimitation above, the displayed content can be extracted further, for example by extracting specific text or retrieving deeper information about a picture, and the information can be presented directly through the projection. For content delimitation, the projection automatically shows the recognized areas with text and frame prompts; combined with the real-time tracking effect of the projection, and with the user being able to obtain (assign) the content again, this gives the user a new experience when obtaining information. This is another important inventive point of the present invention.
In one embodiment, when the object is displayed content, the method further includes:
A second update step S104: the computing board performs in-depth recognition of the content the user has selected, obtains the specific information of the content, and updates it to the projection unit for projection display. In one embodiment, the second update step S104 includes:
When the user selects an area that has already been recognized, the computing board obtains the user's selected area and records its position;
Based on the position of the selected area, the computing board crops the user-selected area into a second sub-picture and uses an intelligent recognition API to analyze the text or picture information in the second sub-picture;
The computing board combines the specific information of the analyzed text or picture with the position information to obtain detailed information about the selected area, extracts the useful part of the detailed information, normalizes it, and transmits the resulting normalized data to the projection unit. Normalization here means extracting the useful part of the information, for example the explanation of an idiom;
After receiving the normalized data from the computing board, the projection unit updates the corresponding display in the user operation area of the projection area.
Delimiting the displayed content is a way to better record the marked positions associated with text and pictures, and it also helps collect information for subsequent use. Through the above update operation, the present invention updates the displayed content in real time, which makes it easier for the user to operate on the displayed content and greatly improves the user experience. This is another important inventive point of the present invention.
In addition, in one embodiment, to ensure a good projection result on objects of different sizes and at different distances, the present invention adjusts the projector's focal length based on the distance to the center point. The specific operations are as follows. The computing board has already determined the object to be delimited (for example, a book) based on the user's selection and has calculated where in the scene the boundary will be projected. Based on the delimited boundary, the computing board uses the four boundaries of the object to calculate the intersection of the diagonals of the delimited area, that is, the position of the center point, and stores it. The computing board then activates the depth camera to capture the scene and temporarily stores the complete RGB-D information of the scene once it has been obtained. From the RGB-D information the computing board extracts the depth information and combines it with the position of the center point of the delimited area to obtain the distance between the center point and the camera; after a fine adjustment based on the relative positions of the camera and the projector, it obtains the distance between the projector and the center of the delimited area. The computing board then calls the projector's initialization method with this distance as the initial focal length; after the projector's own keystone correction during initialization, a sharp projection at the corresponding position is achieved based on the distance. Through this operation, the projection unit's focus is automatically adjusted to the center of the delimited area at delimitation time, based on the size of the delimited object, so that the projected interface is clearer. This is another important inventive point of the present invention.
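The center-point distance computation can be illustrated with the following sketch. It assumes the delimited area is given as four corner points in image coordinates, that the depth channel of the RGB-D frame is a row-major grid indexed as `depth_map[row][col]`, and that the camera-to-projector fine adjustment can be modeled as a single additive offset. These representations and all names are assumptions for the example; the description does not state how the adjustment is performed.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates


def diagonal_intersection(quad: Sequence[Point]) -> Optional[Point]:
    """Intersection of the diagonals of a quadrilateral.

    Corners are given in order around the shape, so the diagonals are
    quad[0]-quad[2] and quad[1]-quad[3]. Returns None if they are parallel.
    """
    (x1, y1), (x2, y2) = quad[0], quad[2]
    (x3, y3), (x4, y4) = quad[1], quad[3]
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    px = (a * (x3 - x4) - (x1 - x2) * b) / denom
    py = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (px, py)


def projector_distance(depth_map: Sequence[Sequence[float]],
                       quad: Sequence[Point],
                       camera_to_projector_offset: float = 0.0
                       ) -> Optional[float]:
    """Distance from the projector to the center of the delimited area.

    Reads the depth at the center pixel and applies a hypothetical additive
    offset standing in for the camera/projector position adjustment.
    """
    center = diagonal_intersection(quad)
    if center is None:
        return None
    col, row = int(round(center[0])), int(round(center[1]))
    if not (0 <= row < len(depth_map) and 0 <= col < len(depth_map[0])):
        return None
    return depth_map[row][col] + camera_to_projector_offset
```

The returned value would then be handed to whatever focus or initialization routine the projector exposes; that call is device specific and is not shown here.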
FIG. 2 shows an object boundary determination device for human-computer interaction according to the present invention. The device includes a projection unit, a wide-angle camera, a computing board, and so on;
The wide-angle camera captures scene images in real time and sends each captured frame to the computing board; after receiving each frame, the computing board judges from it whether a boundary can be determined for the current scene and, if so, determines the boundary range of the object.
The device of the present invention may be a smart desk lamp. The smart desk lamp has a projection unit (that is, a projector), a wide-angle camera, a depth camera, an infrared camera, and so on, and contains a computing board. The computing board has at least a processor and a memory for data processing; the lamp naturally also has a power supply, a power controller, and the like. With the method of the present invention, the boundary of the operation interface that the projection unit projects onto the desktop can be determined. When the user operates on the operation interface, the boundary of the displayed content can also be determined. As to which object is delimited, the computing board judges from the content currently projected by the projection unit whether to delimit the user operation interface or the displayed content, and then performs the corresponding delimitation. For example, at initialization the projected user operation interface is delimited; once the user operation interface exists and the user operates on it, the content displayed on the operation interface is delimited. In other words, the present invention supports free switching between delimiting the user operation interface and delimiting the displayed content. The smart desk lamp can interact with a server. The server is shown in FIG. 2 but is not part of the smart desk lamp; it may be a cloud server or the like.
In one embodiment, when the computing board judges from the content currently projected by the projection unit that the user operation interface is to be delimited, that is, when the object is the user operation interface, the operation in which the computing board, after receiving each frame of scene image, judges from it whether a boundary can be determined for the current scene and, if so, determines the boundary range of the object includes:
The computing board receives each frame of scene image transmitted in real time by the wide-angle camera and processes the object distribution of the current scene with a mobilenet-ssd detection network to determine the shape and the category of each object;
Based on the scene image and the determined object shapes, the computing board calculates the position of each object in the space of the scene image, and merges each object's position with its category to generate an object data set;
The computing board reads the position information from the object data set and, based on that position information, subtracts all object distribution information from the scene image to obtain blank-area information. The computing board then judges, according to the user settings, whether boundary determination is possible; if so, it calculates the boundary range from the position information of the delimitable area and stores the boundary range;
The computing board transmits the stored boundary range to the projection unit together with a delimitation-success signal; on receiving the delimitation-success signal, the projection unit obtains the stored setting information of the current user from the computing board;
The computing board determines whether to project onto the blank area. If so, it determines the projection area from the boundary range and the setting information and projects the user's operation interface in that projection area; if not, the user selects the object to project onto, the computing board reads that object's position information from the object data set, and the user's operation interface is projected based on the object's position information.
There are two main delimitation modes. The first is projection onto a blank area: if the area is large enough for the projection range set by the user, delimitation is possible; this is the more traditional mode. The second is projection based on a recognized object, that is, projection onto a specific book or sheet of paper: if a projectable object such as a book or paper is present, delimitation is possible. Through the specific operations above, the present invention obtains the position of the blank area by removing the objects from the scene image and only then delimits, which improves delimitation accuracy and makes the projected user interface very clear. This is an important inventive point of the present invention.
In one embodiment, when the object is the user operation interface, the computing board processes the scene image transmitted by the wide-angle camera at a first time interval and compares it with the previous scene image whose boundary has been determined; if the comparison result is inconsistent, the boundary is determined again.
In one embodiment, when the object is the user operation interface, this processing, comparison, and re-determination of the boundary by the computing board includes:
Every second, the computing board again obtains one frame of scene image from the wide-angle camera and uses the mobilenet-ssd detection network to obtain the distribution state of all objects in that frame;
The computing board obtains the current setting information of the projection unit and compares it with the distribution state of the objects. If the comparison error is greater than a first threshold, boundary determination is not possible: the projection unit is updated to a not-delimitable warning state, and the computing board is switched to a state in which it judges in real time whether delimitation is possible. If the comparison error is less than the first threshold, a new boundary range is determined and compared with the previously stored boundary range; if the difference is less than a second threshold, no update is made; otherwise the new boundary range is stored and transmitted to the projection unit, which adjusts the projection area accordingly.
Through the above update operation, the present invention keeps the delimitation updated in real time, so that the user operation interface can be projected onto a moving object: the projection follows the movement of the object, which makes operation easier for the user. This tracking effect, achieved by real-time refresh, further improves the delimitation capability and ensures automatic tracking of the projection area as long as the user does not move the device over a large range, greatly improving the user experience. This is another important inventive point of the present invention.
In one embodiment, when the computing board judges from the content currently projected by the projection unit that the displayed content is to be delimited, that is, when the object is displayed content, the content being content displayed on the user operation interface, the operation in which the computing board, after receiving each frame of scene image, judges from it whether a boundary can be determined for the current scene and, if so, determines the boundary range of the object includes:
The wide-angle camera captures scene images in real time and transmits them to the computing board at a second time interval, and the computing board transmits the scene images to a cloud server; the cloud server uses a deep learning network to predict the positions of text in the content, and at the same time crops the picture regions containing the text to obtain first sub-pictures, which it stores;
The cloud server recognizes the text content of the first sub-pictures using the CTC algorithm, and after recognition generates a content data set from the text and the corresponding positions;
The cloud server transmits the content data set to the computing board, and the computing board transmits the position information in the content data set to the projection unit;
The projection unit projects the content data set obtained by the computing board, and the user selects the content to be delimited. Preferably, this projection and selection includes:
The projection unit listens to the computing board in real time and, on receiving the recognized text and positions sent by the computing board, displays the content in a light color in the projection area;
The user selects from the displayed recognized content. After selection, the boundary at the position of the corresponding content is highlighted to indicate that this content has been selected. The highlighting consists of adding an outer frame.
Through the content delimitation above, the displayed content can be extracted further, for example by extracting specific text or retrieving deeper information about a picture, and the information can be presented directly through the projection. For content delimitation, the projection automatically shows the recognized areas with text and frame prompts; combined with the real-time tracking effect of the projection, and with the user being able to obtain (assign) the content again, this gives the user a new experience when obtaining information. This is another important inventive point of the present invention.
In one embodiment, when the object is displayed content, the computing board performs in-depth recognition of the content the user has selected, obtains the specific information of the content, and updates it to the projection unit for projection display. This in-depth recognition and update includes:
When the user selects an area that has already been recognized, the computing board obtains the user's selected area and records its position;
Based on the position of the selected area, the computing board crops the user-selected area into a second sub-picture and uses an intelligent recognition API to analyze the text or picture information in the second sub-picture;
The computing board combines the specific information of the analyzed text or picture with the position information to obtain detailed information about the selected area, extracts the useful part of the detailed information, normalizes it, and transmits the resulting normalized data to the projection unit. Normalization here means extracting the useful part of the information, for example the explanation of an idiom;
After receiving the normalized data from the computing board, the projection unit updates the corresponding display in the user operation area of the projection area.
Delimiting the displayed content is a way to better record the marked positions associated with text and pictures, and it also helps collect information for subsequent use. Through the above update operation, the present invention updates the displayed content in real time, which makes it easier for the user to operate on the displayed content and greatly improves the user experience. This is another important inventive point of the present invention.
In addition, in one embodiment, to ensure a good projection result on objects of different sizes and at different distances, the present invention adjusts the projector's focal length based on the distance to the center point. The specific operations are as follows. The computing board has already determined the object to be delimited (for example, a book) based on the user's selection and has calculated where in the scene the boundary will be projected. Based on the delimited boundary, the computing board uses the four boundaries of the object to calculate the intersection of the diagonals of the delimited area, that is, the position of the center point, and stores it. The computing board then activates the depth camera to capture the scene and temporarily stores the complete RGB-D information of the scene once it has been obtained. From the RGB-D information the computing board extracts the depth information and combines it with the position of the center point of the delimited area to obtain the distance between the center point and the camera; after a fine adjustment based on the relative positions of the camera and the projector, it obtains the distance between the projector and the center of the delimited area. The computing board then calls the projector's initialization method with this distance as the initial focal length; after the projector's own keystone correction during initialization, a sharp projection at the corresponding position is achieved based on the distance. Through this operation, the projection unit's focus is automatically adjusted to the center of the delimited area at delimitation time, based on the size of the delimited object, so that the projected interface is clearer. This is another important inventive point of the present invention.
为了描述的方便,描述以上装置时以功能分为各种单元分别描述。当然, 在实施本申请时可以把各单元的功能在同一个或多个软件和/或硬件中实现。For the convenience of description, when describing the above device, the functions are divided into various units and described respectively. Of course, when implementing the present application, the functions of each unit may be implemented in one or more software and/or hardware.
通过以上的实施方式的描述可知,本领域的技术人员可以清楚地了解到本申请可借助软件加必需的通用硬件平台的方式来实现。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品可以存储在存储介质 中,如ROM/RAM、磁碟、光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例或者实施例的某些部分所述的方法。From the description of the above embodiments, those skilled in the art can clearly understand that the present application can be implemented by means of software plus a necessary general hardware platform. Based on this understanding, the technical solutions of the present application can be embodied in the form of software products in essence or the parts that make contributions to the prior art, and the computer software products can be stored in storage media, such as ROM/RAM, magnetic disks , CD, etc., including several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute the methods described in various embodiments or some parts of the embodiments of the present application.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the present invention may still be modified or equivalently replaced, and that any modification or partial replacement that does not depart from the spirit and scope of the present invention shall fall within the scope of the claims of the present invention.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010369965.3A CN111258408B (en) | 2020-05-06 | 2020-05-06 | An object boundary determination method and device for human-computer interaction |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010369965.3A CN111258408B (en) | 2020-05-06 | 2020-05-06 | An object boundary determination method and device for human-computer interaction |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111258408A (en) | 2020-06-09 |
| CN111258408B (en) | 2020-09-01 |
Family
ID=70955199
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010369965.3A Active CN111258408B (en) | 2020-05-06 | 2020-05-06 | An object boundary determination method and device for human-computer interaction |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111258408B (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112558818A (en) * | 2021-02-19 | 2021-03-26 | 北京深光科技有限公司 | Projection-based remote live broadcast interaction method and system |
| WO2025121650A1 (en) * | 2023-12-07 | 2025-06-12 | 삼성전자 주식회사 | Electronic device and image processing method using electronic device |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20120236038A1 (en) * | 2011-03-17 | 2012-09-20 | International Business Machines Corporation | Organizing projections on a surface |
| CN104052976A (en) * | 2014-06-12 | 2014-09-17 | 海信集团有限公司 | Projection method and device |
| CN106060310A (en) * | 2016-06-17 | 2016-10-26 | 联想(北京)有限公司 | Display control method and display control apparatus |
| CN106254847A (en) * | 2016-06-12 | 2016-12-21 | 深圳超多维光电子有限公司 | A kind of methods, devices and systems of the display limit determining stereo display screen |
| CN106507077A (en) * | 2016-11-28 | 2017-03-15 | 江苏鸿信系统集成有限公司 | Projector picture correction and occlusion avoidance method based on image analysis |
| CN107689082A (en) * | 2016-08-03 | 2018-02-13 | 腾讯科技(深圳)有限公司 | A kind of data projection method and device |
| CN110769224A (en) * | 2018-12-27 | 2020-02-07 | 成都极米科技股份有限公司 | Projection area acquisition method and projection method |
- 2020-05-06: CN application CN202010369965.3A, patent CN111258408B (en), status Active
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20120236038A1 (en) * | 2011-03-17 | 2012-09-20 | International Business Machines Corporation | Organizing projections on a surface |
| CN104052976A (en) * | 2014-06-12 | 2014-09-17 | 海信集团有限公司 | Projection method and device |
| CN106254847A (en) * | 2016-06-12 | 2016-12-21 | 深圳超多维光电子有限公司 | A kind of methods, devices and systems of the display limit determining stereo display screen |
| CN106060310A (en) * | 2016-06-17 | 2016-10-26 | 联想(北京)有限公司 | Display control method and display control apparatus |
| CN107689082A (en) * | 2016-08-03 | 2018-02-13 | 腾讯科技(深圳)有限公司 | A kind of data projection method and device |
| CN106507077A (en) * | 2016-11-28 | 2017-03-15 | 江苏鸿信系统集成有限公司 | Projector picture correction and occlusion avoidance method based on image analysis |
| CN110769224A (en) * | 2018-12-27 | 2020-02-07 | 成都极米科技股份有限公司 | Projection area acquisition method and projection method |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112558818A (en) * | 2021-02-19 | 2021-03-26 | 北京深光科技有限公司 | Projection-based remote live broadcast interaction method and system |
| CN112558818B (en) * | 2021-02-19 | 2021-06-08 | 北京深光科技有限公司 | Projection-based remote live broadcast interaction method and system |
| WO2022174706A1 (en) * | 2021-02-19 | 2022-08-25 | 北京深光科技有限公司 | Remote live streaming interaction method and system based on projection |
| WO2025121650A1 (en) * | 2023-12-07 | 2025-06-12 | 삼성전자 주식회사 | Electronic device and image processing method using electronic device |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111258408B (en) | 2020-09-01 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11176355B2 (en) | Facial image processing method and apparatus, electronic device and computer readable storage medium | |
| EP3886448A1 (en) | Video processing method and device, electronic equipment and computer readable medium | |
| US20230038674A1 (en) | Arrangement for generating head related transfer function filters | |
| WO2019128101A1 (en) | Projection region-adaptive dynamic projection method, apparatus, and electronic device | |
| US12034996B2 (en) | Video playing method, apparatus and device, storage medium, and program product | |
| CN110996183B (en) | Video abstract generation method, device, terminal and storage medium | |
| CN107820013A (en) | A kind of photographic method and terminal | |
| US11854238B2 (en) | Information insertion method, apparatus, and device, and computer storage medium | |
| WO2022073389A1 (en) | Video picture display method and electronic device | |
| CN107636728A (en) | For the method and apparatus for the depth map for determining image | |
| CN111258408B (en) | An object boundary determination method and device for human-computer interaction | |
| CN110781823A (en) | Screen recording detection method and device, readable medium and electronic equipment | |
| WO2023097805A1 (en) | Display method, display device, and computer-readable storage medium | |
| CN112752110B (en) | Video presentation method and device, computing device and storage medium | |
| KR101308184B1 (en) | Augmented reality apparatus and method of windows form | |
| CN112492375A (en) | Video processing method, storage medium, electronic device and video live broadcast system | |
| CN111258410B (en) | A human-computer interaction device | |
| TWI790560B (en) | Side by side image detection method and electronic apparatus using the same | |
| KR20160126985A (en) | Method and apparatus for determining an orientation of a video | |
| CN110597397A (en) | Augmented reality implementation method, mobile terminal and storage medium | |
| CN113963355B (en) | OCR character recognition method, device, electronic equipment and storage medium | |
| CN113780414B (en) | Eye movement behavior analysis method, image rendering method, component, device and medium | |
| EP3752956B1 (en) | Methods, systems, and media for detecting two-dimensional videos placed on a sphere in abusive spherical video content | |
| CN119520749B (en) | Display method based on LCD projector and LCD projector | |
| CN115883794A (en) | Projection method, device, storage medium and projection device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |
| | CP03 | Change of name, title or address | Address after: 313300 Zhejiang Province Huzhou City Anji County Tiangengping Town Qinglaiji 40 Building A301 (self-declared); Patentee after: Huzhou Shenguang Technology Co.,Ltd.; Country or region after: China. Address before: Beijing City Haidian District Zhichun Road No. 7 Zhenzhen Building A Building 4th Floor A036; Patentee before: Beijing Shenguang Technology Co.,Ltd.; Country or region before: China |