LU102286B1 - Display control of electronic documents - Google Patents
- Publication number
- LU102286B1
- Authority
- LU
- Luxembourg
- Prior art keywords
- speech
- electronic document
- identifying
- descriptors
- agenda
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/43—Querying
- G06F16/438—Presentation of query results
- G06F16/4387—Presentation of query results by the use of playlists
- G06F16/4393—Multimedia presentations, e.g. slide shows, multimedia albums
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Navigation (AREA)
Abstract
Techniques for identifying speech elements spoken during a meeting to automatically update a displayed position of an electronic document are disclosed herein. Descriptors are accessed that associate speech elements with positions within an electronic document. During the meeting, words spoken by participants of the meeting are used to identify a matched speech element. Using the matched speech element and the accessed descriptors, the displayed position of the electronic document is automatically updated.
Description
DISPLAY CONTROL OF ELECTRONIC DOCUMENTS
[0001] This document pertains generally, but not by way of limitation, to display of documents during meetings, and particularly, but not by way of limitation, to automatically controlling the display of documents during meetings.
[0002] Documents are often shared during meetings. For example, many presenters utilize electronic documents as a tool to present during a meeting to provide other participants of the meeting with additional visual details. Slideshows, for example, are used to present information across various slides. As the meeting progresses, a presenter or other participant may manually advance the slides to a desired slide at a desired point within the meeting. This manual process can often cause delays when slides are advanced too quickly, or not quickly enough, especially when a participant other than the presenter is controlling the slides.
[0003] In the drawings, which are not necessarily drawn to scale, like numerals may describe similar components in different views. Like numerals having different letter suffixes may represent different instances of similar components. Some embodiments are illustrated by way of example, and not limitation, in the figures of the accompanying drawings in which:
[0004] FIG. 1 is a network diagram illustrating several computing devices configured to participate in online communication sessions.
[0005] FIG. 2 is a diagram illustrating an example graphical user interface that displays an electronic document for a meeting.
[0006] FIG. 3 is a diagram illustrating associations between descriptors for an electronic document and agenda items of a meeting agenda.
[0007] FIG. 4 is a diagram illustrating an example machine learning module.
Client Docket No. 408962-LU-NP
[0008] FIG. 5 is a flowchart illustrating a method of controlling display of an electronic document.
[0009] FIG. 6 is a flowchart illustrating a method of controlling display of an electronic document based on identified agenda items.
[0010] FIG. 7 is a block diagram illustrating an example of a machine upon which one or more embodiments may be implemented.
[0011] Systems and methods are disclosed herein for automatically controlling the display of electronic documents. Often when sharing electronic documents during online or live meetings, one of the participants of the meeting controls the display of an electronic document. This can lead to confusion and delays when the controls do not work as planned, such as due to communication lag as a result of bandwidth limitations, for example, or can lead to confusion or miscommunication between a presenter and a separate meeting participant that is controlling the display of the document. Further, presenters often forget to advance the position of the displayed document, leaving stale information on the display device that does not match up with what is currently being discussed. It is desirable to automatically control the display of the electronic document to remove these issues.
[0012] Technical problems arise with how to identify the proper time to control a change in display of the electronic document during a meeting. A technical solution to this problem is to access descriptors that associate positions within the electronic document with speech elements, such as keywords or speech embeddings that may be detected from spoken words during the meeting. For example, the keywords may be words present in agenda items of an agenda for the meeting, may be indicators such as "next slide", "advance", "slide three", and the like, or may be any other keywords. Speech embeddings are semantic representations of words or portions of words used by a speech recognition model or other speech recognition application. Speech embeddings group words, syllables, or other portions of words by semantic similarity rather than by the meaning of the word, which can be used by speech analysis applications to map to actual words or phrases.
[0013] The descriptors can be generated prior to the meeting. In an example, the descriptors associate positions in a document with items of an agenda. The agenda can be generated for the meeting and includes several agenda items indicating topics for discussion at specific points within the meeting. In an example, the descriptors are generated automatically by associating words, or speech embeddings for those words, within description text of agenda items with words, or speech embeddings for those words, at specific positions within the electronic document. For example, a title of a slide in a slide show may be substantially similar to description text of an agenda item and a descriptor is generated to associate the respective slide with the agenda item.
[0014] During the meeting, speech of participants, text of an online chat, or other information can be monitored to identify one or more speech elements, such as keywords or speech embeddings. In some examples, the speech elements are used to identify a current agenda item of the agenda for the meeting. For example, speech data of a participant is captured by a microphone or other audio capture device and provided to a speech analysis model that identifies spoken words of the speech data. When identifying the spoken words, one or more speech embeddings may be identified from the speech data. Thus, the identified spoken words or speech embeddings are used to identify a current agenda item of the meeting agenda by matching or otherwise correlating the spoken words or speech embeddings to an agenda item. Once the current agenda item is identified, the descriptors that associate positions within the electronic document with agenda items may be referenced to identify a position that is associated with the current agenda item. The display of the document may then be automatically updated to the identified position. This has the technical effect of improving the display of electronic documents during meetings by automatically updating the display position within the document based on an identified current agenda item. In this way a content stream comprising electronic documents, which is being broadcast to participants of a meeting, is automatically coordinated or synchronized with an audio channel broadcast by a current speaker of the meeting. In the case of an online meeting, end users are better able to participate in the meeting and have reduced burden of user input at their respective computing devices since the coordination and/or synchronization of the content stream and audio stream is achieved automatically.
In some cases, the coordination and/or synchronization takes into account lags introduced unexpectedly due to drops in communications bandwidth between the computing devices participating in the online meeting. Such lags are accommodated through the use of the descriptors as mentioned above.
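The matching flow described above can be sketched in a few lines of Python. Everything here (the agenda text, the descriptor mapping, and the simple word-overlap scoring) is an illustrative assumption rather than the claimed implementation, which may instead rely on speech embeddings or a trained model:

```python
# Hypothetical agenda: item identifier -> description text.
AGENDA = {
    1: "recap prior meeting minutes",
    2: "quarterly results review",
    3: "open discussion",
}

# Hypothetical descriptors: agenda item identifier -> position (slide number).
DESCRIPTORS = {1: 1, 2: 3, 3: 7}

def match_agenda_item(spoken_words, agenda=AGENDA):
    """Return the agenda item whose description shares the most words
    with the recognized speech, or None if nothing overlaps."""
    spoken = set(w.lower() for w in spoken_words)
    best_item, best_score = None, 0
    for item_id, text in agenda.items():
        score = len(spoken & set(text.split()))
        if score > best_score:
            best_item, best_score = item_id, score
    return best_item

def position_for_speech(spoken_words, descriptors=DESCRIPTORS):
    """Map recognized speech to a document position via the descriptors."""
    item = match_agenda_item(spoken_words)
    return descriptors.get(item)  # None -> leave the display unchanged
```

For instance, recognized words containing "meeting minutes" would resolve to agenda item 1, whose descriptor points the display at slide 1.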
[0015] Automatically controlling the display of an electronic document may be performed in online meetings, live meetings, or a combination thereof. FIG. 1 is a block diagram illustrating a communication service 100 that automatically updates the display of an electronic document for remote or live meetings. First computing device 110, second computing device 111, third computing device 112, and fourth computing device 113 may be members of a same active network-based communication session (e.g., a video conferencing session) provided by a communication server 130 and the respective instances of communication application 115. First computing device 110 may execute a first instance of a communication application 115 (shown as 115-1), second computing device 111 may execute a second instance of the communication application 115 (shown as 115-2), third computing device 112 may execute a third instance of the communication application 115 (shown as 115-3), and fourth computing device 113 may execute a fourth instance of communication application 115 (shown as 115-4).
[0016] Communication applications 115 may communicate with the communication server 130 to set up, join, and participate in the network-based communication session. This includes sending, receiving, and presenting one or more of voice, video, and content data that is part of the network-based communication session. In some examples, one or more of the computing devices 110, 111, 112, and 113 may contain or be communicatively coupled to a video capture device, such as a video camera. In some examples, the video capture device may be in the form of a meeting room capture device 105, which is shown in FIG. 1 as being coupled to the first computing device 110. The meeting room capture device 105 may be a camera, a set of cameras, a 360-degree camera, or the like that may capture a large portion of the room. One or more of the computing devices 110, 111, 112, and 113 may also be communicatively coupled to an audio capture device, such as a microphone.
[0017] For online meetings, first computing device 110, second computing device 111, third computing device 112, and fourth computing device 113 may execute instances of the communication application 115, denoted as 115-1, 115-2, 115-3, and 115-4 respectively. These instances of communication application 115 may also communicate with the communication server 130 to set up, join, and participate in a network-based communication session. This includes sending, receiving, and presenting one or more of voice, video, and content data that is part of the network-based communication session. Collectively, the communication applications 115 and the communication server 130 provide for the network-based communication session by communicating over the network 120.
[0018] Second, third, and fourth computing devices 111, 112, and 113 respectively may or may not be communicatively coupled to a video capture device.
As shown in FIG. 1, second computing device 111 and third computing device 112 are coupled to video cameras; however, fourth computing device 113 is not coupled to a video camera. Communication server 130 may process the one or more video streams from the first, second, third, and fourth computing devices 110, 111, 112, and 113 respectively.
[0019] The communication server 130 may provide a communication service which provides Microsoft Teams meetings, for example, and the communication applications 115-1, 115-2, 115-3, and 115-4 may be Microsoft Teams clients, for example. The communication server 130 may also include one or more applications, such as an application that implements functions to identify agenda items and automatically update a position of a displayed electronic document.
[0020] While described for online meetings, in some examples a live meeting may occur such that only the first computing device 110 is used. In this example, the communication application 115-1 may be replaced by an application implemented to track a position within an agenda and update the display of an electronic document accordingly without the online communication aspect. Also, in this example, the entire process may execute in an "offline" manner on the computing device 110 such that the connection to the server 130 is not required.
[0021] FIG. 2 is a diagram illustrating an example graphical user interface (GUI) that displays an electronic document for a meeting. The GUI may be generated by a respective communication application 115-1, 115-2, 115-3, or 115-4 for display on a display device associated with a respective computing device 110, 111, 112, or 113. While illustrated as a slideshow, any type of electronic document may be displayed on the
GUI, including word-processing documents, portable document format (PDF) documents, or any other type of electronic document. Also, while illustrated as displayed during an online meeting, the electronic document may be displayed on a projector or using another display device located within a meeting room during a live meeting that may or may not include remote participants.
[0022] For online meetings, the GUI may include a staging area 205. During a meeting a presenter may share a document 210 such as a slide show, word processing document, or other document. The participants 215, which may include the presenter, may be displayed in a participant display area or other position within the GUI. The display of the document 210 may be updated during the meeting by analyzing speech data of the participants 215 to identify a current agenda item for the meeting. In another example, the GUI may include a group chat for the online meeting. Text of the chat may also or alternatively be analyzed to identify a current agenda item of the meeting.
[0023] Once an agenda item is identified, the displayed document 210 can be updated to a new position. For example, page 2 of a slide show is illustrated in FIG. 2. Upon determining that the participants have spoken a speech element, such as a keyword or speech embedding, the slide show may be advanced to page 3 of the slideshow. This may be accomplished using generated descriptors for the electronic document, such as metadata, that associate positions within the document with speech elements or specific agenda items of a meeting agenda. For example, a descriptor may associate a slide number with an agenda item. The agenda may include several distinct items scheduled for specified times during the meeting. The descriptor may identify a position within the document, such as a page number or slide number, and associate that position with a specified speech element or agenda item. The descriptors may be stored in a memory storage device with the electronic document, or apart from the electronic document.
[0024] FIG. 3 is a diagram illustrating descriptors 302, which are associations between locations in an electronic document 304 and agenda items 306 of a meeting agenda 308. A meeting agenda may be auto-generated for a meeting or may be manually generated by a participant of the meeting. In some examples, a participant may utilize software to input agenda items 306 and associated time periods, and the software may generate the agenda 308 for the meeting. In another example, the document 304 may be input to one or more applications that implement a trained model, for example, which may extract agenda items 306 from the document and automatically generate the agenda 308. In another example, one or more applications may extract high-level details, such as slide titles, and generate a rough agenda, which a user may edit to generate a completed agenda 308.
[0025] The agenda 308 may be provided to participants of the meeting. This may occur prior to or during the meeting. The agenda 308 may be displayed during the meeting and a GUI for an online meeting, for example, may display the agenda 308 and a current agenda item 306 while the meeting is in progress. Each agenda item 306 may include description text, for example, describing the agenda item, and a time period indicating a time during the meeting for the agenda item. In other examples, the agenda items 306 may only include description text with no associated time period.
[0026] The descriptors 302 may be metadata, for example, or any other descriptors capable of defining associations between positions within the document 304 and the agenda items 306. In an example, each descriptor may include values that include a position within the document 304 and an agenda item 306. These values may be plain text values or may be in any data format. In an example, the position within the document may be a slide number, page number, line number, or any other identifiable position within the document 304. The agenda items may include the full text or partial text of the agenda item 306, an identifier that identifies the agenda item 306, or the like. For example, each agenda item 306 may include an associated identifier. If there are ten agenda items 306, the agenda items 306 may be numbered one through ten. The descriptors may then associate a position within the document 304 with the identifier of the agenda item 306.
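One possible encoding of such descriptors is as plain value pairs that can be serialized alongside (or apart from) the document. The JSON layout and field names below are illustrative assumptions; the description above allows any data format capable of expressing the association:

```python
import json

# Hypothetical descriptors: each pairs a document position (slide number)
# with an agenda item identifier.
descriptors = [
    {"position": 1, "agenda_item": 1},   # slide 1 <-> agenda item 1
    {"position": 4, "agenda_item": 2},
    {"position": 9, "agenda_item": 3},
]

def position_for_item(descriptors, agenda_item):
    """Look up the document position associated with an agenda item id."""
    for d in descriptors:
        if d["agenda_item"] == agenda_item:
            return d["position"]
    return None

# The descriptors may be stored with the document or separately; plain JSON
# is one self-describing choice.
serialized = json.dumps(descriptors)
restored = json.loads(serialized)
```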
[0027] The descriptors 302 may be generated automatically or manually. For example, a participant may input document positions and agenda items into a software interface for one or more applications that generate the descriptors 302 using the participant input. In another example, the participant may provide the electronic document 304 and the agenda 308 to an agenda association model 310.
The agenda association model 310 may output the descriptors 302 based on the document 304 and the agenda 308 as input. For example, the model 310 may associate words, or speech embeddings associated with words, within the document 304 with specific agenda items. In an example, a title of a slide of the document 304 may include words or phrases that match words or phrases of an agenda item 306. In some examples, a participant may review and update the descriptors 302 generated by the model 310.
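A minimal, non-ML sketch of this automatic descriptor generation might associate a slide with an agenda item when the slide title and the item's description text share enough words. The overlap threshold and the title/agenda inputs are illustrative assumptions; the document also contemplates a trained model for this step:

```python
def generate_descriptors(slide_titles, agenda_items, min_overlap=2):
    """slide_titles: {slide_number: title}; agenda_items: {item_id: text}.
    Returns {item_id: slide_number} for sufficiently similar pairs."""
    descriptors = {}
    for item_id, text in agenda_items.items():
        item_words = set(text.lower().split())
        for slide, title in slide_titles.items():
            overlap = len(item_words & set(title.lower().split()))
            # Associate the first slide whose title overlaps enough.
            if overlap >= min_overlap and item_id not in descriptors:
                descriptors[item_id] = slide
    return descriptors
```

A participant could then review and correct the output, as noted above.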
[0028] FIG. 4 is a diagram illustrating an example machine learning module 400 for training and/or using a model, such as the model 310. The machine learning module 400 may be implemented in whole or in part by the server 130 and/or any of the computing devices 110-113. In some examples, the training module 410 may be implemented by a different device than the prediction module 420. In these examples, the model 310, which may be an agenda association model, may be created on a first machine and then sent to a second machine (for example, created by the server 130 and sent to one of the computing devices 110-113).
[0029] Machine learning module 400 utilizes a training module 410 and a prediction module 420. Training module 410 inputs feature data 430 into feature determination module 450. The feature data 430 may include electronic document data and agenda item data. Feature determination module 450 determines one or more features for feature vector 460 from the feature data 430. Features of the feature vector 460 may include document position and agenda item pairs. Features chosen for inclusion in the feature vector 460 may be all the feature data 430 or, in some examples, may be a subset of all the feature data 430. In examples in which the features chosen for the feature vector 460 are a subset of the feature data 430, a predetermined list of which feature data 430 is included in the feature vector may be utilized. The feature vector 460 may be utilized (along with any applicable labels) by the machine learning algorithm 470 to produce a model 310.
[0030] In the prediction module 420, the current feature data 490 may include an electronic document 304 and an agenda 308 for a specified meeting. Feature determination module 495 may determine the same set of features or a different set of features as feature determination module 450, such as document position and agenda item pairs. In some examples, feature determination modules 450 and 495 are the same modules or different instances of the same module. Feature determination module 495 produces feature vector 497, which is input into the model 310. The output 499 of the model 310 may include rankings or selections of associated document positions and agenda items for inclusion in the descriptors 302.
[0031] The training module 410 may operate in an offline manner to train the model 310. The prediction module 420, however, may be designed to operate in an online manner. It should be noted that the model 310 may be periodically updated via additional training and/or user feedback. For example, additional feature data 430 may be collected as users initiate and participate in various meetings. The attributes may then be fed back through the training module 410 labelled with indicators in order to refine the model 310.
[0032] The machine learning algorithm 470 may be selected from among many different potential supervised or unsupervised machine learning algorithms.
Examples of supervised learning algorithms include artificial neural networks, convolutional neural networks, Bayesian networks, instance-based learning, support vector machines, decision trees (e.g., Iterative Dichotomiser 3, C4.5, Classification and Regression Tree (CART), Chi-squared Automatic Interaction Detector (CHAID), and the like), random forests, linear classifiers, quadratic classifiers, k-nearest neighbor, linear regression, logistic regression, and hidden Markov models.
Examples of unsupervised learning algorithms include expectation-maximization algorithms, vector quantization, and the information bottleneck method. Unsupervised models may not have a training module 410.
[0033] FIG. 5 is a flowchart illustrating an example method 500 of controlling display of an electronic document. At step 502, descriptors are accessed that associate positions within the electronic document with speech elements. These descriptors may be accessed from a local memory storage device, a remote memory storage device, or any other location. For example, an application running on one of the computing devices 110, 111, 112, or 113 may request the descriptors from the server 130 or any other device. The application may then store the descriptors locally or access the descriptors from the server 130 each time the descriptors are accessed. In other examples, such as when an application running on the computing devices 110, 111, 112, or 113 generates the descriptors prior to, or during, the meeting, the descriptors may be stored in a local memory storage device. The keywords may be commands, descriptor words such as vocal commands, words associated with agenda items, or any other keywords. The descriptors may be in a format that includes two values, for example, such as <slide number, keyword(s)>.
Rather than keywords, the speech elements may also be speech embeddings, which are semantic representations of speech items such as words, phrases, and the like.
For example, speech recognition models may identify speech embeddings of spoken words or phrases in order to identify those spoken words or phrases. The speech embeddings may be phonetic representations of portions of the speech. For example, some words may be spoken in phonetically differing ways, and each way may map to the same word through the use of speech embeddings. By using speech embeddings rather than keywords, the speed and accuracy of matching descriptors may be improved. Step 502 may be executed during or prior to a start of the meeting.
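A toy sketch of matching by speech embeddings rather than exact keywords follows. The three-dimensional vectors and the cosine-similarity threshold are fabricated for illustration; in practice the embeddings would come from a speech recognition model and be far higher-dimensional:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical descriptors: <slide number, embedding> pairs instead of
# <slide number, keyword(s)>.
EMBEDDING_DESCRIPTORS = [
    (2, [0.9, 0.1, 0.0]),   # an embedding standing in for, e.g., "next slide"
    (5, [0.0, 0.8, 0.6]),
]

def match_position(spoken_embedding, descriptors=EMBEDDING_DESCRIPTORS,
                   threshold=0.9):
    """Return the slide for the most similar descriptor embedding, or None
    if no descriptor is similar enough to the spoken embedding."""
    best = max(descriptors, key=lambda d: cosine(spoken_embedding, d[1]))
    return best[0] if cosine(spoken_embedding, best[1]) >= threshold else None
```

Because phonetically different utterances of the same word can map to nearby embeddings, this comparison can tolerate variation that exact keyword matching would miss.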
[0034] At step 504, while the meeting is in progress, the electronic document is displayed at a first position. This may be a first slide of a slideshow, for example.
The document may be displayed on a display device for the participants of the meeting. In an example, the meeting may be a live meeting, and the display device may be a projector, a liquid crystal display (LCD). light emitting diode (LED) display, or the like. In another example, the meeting may be an online meeting and the display device may be a local display associated with a computing device of the participant. For example, each computing device 110, 111, 112, and 113 of FIG. 1
may include an associated display device such as an LCD or LED display configured to output a GUI, such as the GUI described with respect to FIG. 2, that displays the electronic document to a respective participant.
[0035] At step 506, an audio stream is received for a participant of the meeting.
For live meetings, this may be accomplished using a microphone or other audio capture device positioned within the meeting room. For an online meeting, this may be accomplished using a microphone or other audio capture device associated with a respective computing device, such as the computing devices 110, 111, 112, and 113.
The audio stream may include speech data of one or more participants within the meeting. For example, a participant may be presenting information found in the displayed document.
[0036] At step 508, spoken words are identified from the received audio stream.
In an example, a speech analysis model may be used to identify spoken words from the speech data of the audio stream. In an example, the model may be the NVIDIA® Jasper automatic speech recognition model, or any other speech analysis model. In an example, the audio stream may be directly input to the speech analysis model and the speech analysis model may output one or more identified words for a respective segment of audio data. This step may be performed continuously throughout the meeting. Words may also be identified from an online chat, other shared document, or the like. These words may be used in conjunction with the spoken words or alternatively to the spoken words.
[0037] At step 510, the spoken words or other identified words are used to match a respective speech element. For example, the speech element may be keywords that are words present in agenda items of an agenda for the meeting, may be indicators such as "next slide", "advance", "slide three", phrases such as "go to slide two" or "go to agenda item three", or the like. In an example, the speech element may be speech embeddings that are semantic representations of words or portions of words present in agenda items of an agenda for the meeting or other words. At step 512, a second position within the document is identified using the matched speech element(s). This may be performed using the descriptors accessed at step 502. For example, a speech element may be used to reference the descriptors to identify a descriptor pair that matches the speech element. The position value from the identified descriptor may then be used as the second position within the document.
At step 514, the document is displayed at the second position. For example, a slideshow may be advanced to a newly identified slide.
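Steps 502 through 514 can be sketched end to end as follows, under stated assumptions: the speech recognizer is stubbed out (a real system might feed audio segments to an ASR model such as Jasper), the descriptors are a hypothetical keyword table, and the "display" is just a recorded position:

```python
# Hypothetical descriptors accessed at step 502: <keyword, slide number>.
KEYWORD_DESCRIPTORS = {"budget": 2, "roadmap": 3}

class DocumentDisplay:
    def __init__(self):
        self.position = 1          # step 504: display at a first position

    def process_audio_segment(self, recognized_words):
        """Steps 508-514: match words already recognized from an audio
        segment against the descriptors and update the displayed position
        when a speech element matches. Returns True if updated."""
        for word in recognized_words:
            new_pos = KEYWORD_DESCRIPTORS.get(word.lower())
            if new_pos is not None:
                self.position = new_pos   # step 514: show second position
                return True
        return False
```

Calling `process_audio_segment` for each recognized segment throughout the meeting realizes the continuous monitoring described above.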
[0038] The method 500 may be executed by one or more applications running on the server 130, one or more of the computing devices 110, 111, 112, or 113, or any other computing device. While described for a single document, the above process may be executed for several documents. For example, if multiple documents will be shared during the meeting, descriptors may be accessed for the multiple documents.
If a matched descriptor indicates a position in a document not currently displayed, the indicated document may automatically be displayed at the identified position and the previous document may be automatically closed. This way, if a user wishes to share several documents during a presentation, the display of all documents may be achieved using the methods described herein.
[0039] FIG. 6 is a flowchart illustrating an example method 600 of controlling display of an electronic document based on identified agenda items during a meeting. At step 602, an agenda and an electronic document are received for an upcoming meeting. The agenda may include several agenda items. For example, an agenda item may include a time period and description text such as "9:00-9:05 —
Recap prior meeting minutes." The electronic document may be any type of document for display during the upcoming meeting. For example, the document may be a slideshow that includes several slides for display during the meeting.
[0040] At step 604, descriptors are generated to associate positions within the electronic document and the agenda items. This may be accomplished using the agenda association model 310 as described herein, may be accomplished by software using manual inputs from a participant of the meeting, or may be accomplished using any other method. In an example, the descriptors are metadata that include values indicative of positions within the electronic document and values indicative of a respective agenda item. For example, the agenda items may be numbered, and a descriptor may be a value pair such as: <slide number, agenda item number>. This way, several positions within the document may be associated with respective agenda items. Step 604 may be executed during or prior to a start of the meeting.
[0041] At step 606, while the meeting is in progress, the electronic document is displayed at a first position. This may be a first slide of a slideshow, for example.
The document may be displayed on a display device for the participants of the meeting. In an example, the meeting may be a live meeting, and the display device may be a projector, a liquid crystal display (LCD), light emitting diode (LED) display, or the like. In another example, the meeting may be an online meeting and the display device may be a local display associated with a computing device of the participant. For example, each computing device 110, 111, 112, and 113 of FIG. 1 may include an associated display device such as an LCD or LED display configured to output a GUI, such as the GUI described with respect to FIG. 2, that displays the electronic document to a respective participant.
[0042] At step 608, an audio stream is received for a participant of the meeting.
For live meetings, this may be accomplished using a microphone or other audio capture device positioned within the meeting room. For an online meeting, this may be accomplished using a microphone or other audio capture device associated with a respective computing device, such as the computing devices 110, 111, 112, and 113.
The audio stream may include speech data of one or more participants within the meeting. For example, a participant may be presenting information found in the displayed document.
[0043] At step 610, spoken words are identified from the received audio stream.
In an example, a speech analysis model may be used to identify spoken words from the speech data of the audio stream. In an example, the model may be the
NVIDIA® Jasper automatic speech recognition model, or any other speech analysis model. In an example, the audio stream may be directly input to the speech analysis model and the speech analysis model may output one or more identified words for a respective segment of audio data. This step may be performed continuously throughout the meeting. Words may also be identified from an online chat, other shared document, or the like. These words may be used in conjunction with the spoken words or alternatively to the spoken words.
[0044] At step 612, the spoken words or other identified words are used to match a respective agenda item. For example, if the agenda item is “recap prior meeting minutes”, and the spoken words are identified as “meeting minutes”, the agenda item may be identified as a current agenda item. This identification may be accomplished using a trained model, for example, that receives the spoken words and the agenda items as input and outputs an identified agenda item. In another example, specific phrases or words may be designated to associate with a specific agenda item. For example, a participant may say “go to slide two” or “go to agenda item three”. This may be recognized using the speech analysis and the respective agenda item may be identified accordingly.
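The agenda-item matching of step 612 can be illustrated with a simple keyword-overlap heuristic. This is an assumption for illustration only (the description also contemplates a trained model): the agenda text, ordinal table, and scoring rule below are all hypothetical.

```python
# Illustrative matcher: score each agenda item by how many identified
# words appear in its description text, and honor explicit navigation
# commands such as "go to agenda item three".

AGENDA = ["recap prior meeting minutes", "budget review", "open questions"]
ORDINALS = {"one": 0, "two": 1, "three": 2}

def match_agenda_item(spoken_words, agenda=AGENDA):
    words = [w.lower() for w in spoken_words]
    # An explicit command like "go to agenda item three" takes priority.
    for i, w in enumerate(words):
        if w == "item" and i + 1 < len(words) and words[i + 1] in ORDINALS:
            return ORDINALS[words[i + 1]]
    # Otherwise pick the agenda item with the highest word overlap.
    scores = [sum(w in item.split() for w in set(words)) for item in agenda]
    best = max(range(len(agenda)), key=lambda i: scores[i])
    return best if scores[best] > 0 else None

print(match_agenda_item(["meeting", "minutes"]))                   # 0
print(match_agenda_item(["go", "to", "agenda", "item", "three"]))  # 2
```

A trained model, as contemplated in the description, would replace the overlap score with a learned similarity between spoken words and agenda text.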
[0045] At step 614, a second position within the document is identified using the identified current agenda item. This may be performed using the descriptors generated at step 604. For example, an agenda item indicator may be used to reference the descriptors to identify a descriptor pair that matches the agenda item.
The position value from the identified descriptor may then be used as the second position within the document. At step 616, the document is displayed at the second position. For example, a slideshow may be advanced to a newly identified slide.
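Steps 614 and 616 can be sketched under the assumption that each descriptor is a (speech element, position) pair, for example mapping an agenda item to a slide index in a slideshow; the descriptor values and the display stand-in below are hypothetical.

```python
# Sketch of steps 614-616: reference the descriptors to find the pair
# matching the current agenda item, then display at that position.

descriptors = [
    ("recap prior meeting minutes", 1),
    ("budget review", 4),
    ("open questions", 7),
]

def position_for(agenda_item, descriptors):
    # Identify the descriptor pair that matches the agenda item.
    for speech_element, position in descriptors:
        if speech_element == agenda_item:
            return position
    return None

def display_at(document, position):
    # Stand-in for the display update, e.g. advancing a slideshow.
    return f"{document} @ slide {position}"

second_position = position_for("budget review", descriptors)
print(display_at("quarterly.pptx", second_position))  # quarterly.pptx @ slide 4
```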
[0046] The method 600 may be executed by one or more applications running on the server 130, one or more of the computing devices 110, 111, 112, or 113, or any other computing device. While described for a single document, the above process may be executed for several documents. For example, if multiple documents will be shared during the meeting, the multiple documents may be input to the model 310 to generate associations for each document and the agenda items 306. If a matched agenda item indicates a position in a document not currently displayed, the indicated document may automatically be displayed at the identified position and the previous document may be automatically closed. This way, if a user wishes to share several documents during a presentation, the display of all documents may be achieved using the methods described herein.
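The multi-document behavior described above can be sketched by letting each descriptor carry a (document, position) pair, so a matched agenda item can switch the displayed document as well as the position within it. The document names and descriptor layout here are hypothetical.

```python
# Hypothetical multi-document descriptors: agenda item -> (document, position).
descriptors = {
    "recap prior meeting minutes": ("minutes.docx", 1),
    "budget review": ("budget.pptx", 2),
}

def update_display(current_doc, agenda_item, descriptors):
    doc, pos = descriptors[agenda_item]
    if doc != current_doc:
        # Close the previously displayed document before opening the new one.
        print(f"closing {current_doc}")
    return doc, pos

state = update_display("minutes.docx", "budget review", descriptors)
print(state)  # ('budget.pptx', 2)
```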
[0047] FIG. 7 illustrates a block diagram of an example machine 700 upon which any one or more of the techniques (e.g., methodologies) discussed herein may perform. For example, the machine 700 can be any one or more of the server 130 and/or computing devices 110-113. Examples, as described herein, may include, or may operate by, logic or a number of components, or mechanisms in the machine 700. Circuitry (e.g., processing circuitry) is a collection of circuits implemented in tangible entities of the machine 700 that include hardware (e.g., simple circuits, gates, logic, etc.). Circuitry membership may be flexible over time. Circuitries include members that may, alone or in combination, perform specified operations when operating. In an example, hardware of the circuitry may be immutably designed to carry out a specific operation (e.g., hardwired). In an example, the hardware of the circuitry may include variably connected physical components (e.g., execution units, transistors, simple circuits, etc.) including a machine readable medium physically modified (e.g., magnetically, electrically, moveable placement of invariant massed particles, etc.) to encode instructions of the specific operation.
In connecting the physical components, the underlying electrical properties of a hardware constituent are changed, for example, from an insulator to a conductor or vice versa. The instructions enable embedded hardware (e.g., the execution units or a loading mechanism) to create members of the circuitry in hardware via the variable connections to carry out portions of the specific operation when in operation. Accordingly, in an example, the machine readable medium elements are part of the circuitry or are communicatively coupled to the other components of the circuitry when the device is operating. In an example, any of the physical components may be used in more than one member of more than one circuitry. For example, under operation, execution units may be used in a first circuit of a first circuitry at one point in time and reused by a second circuit in the first circuitry, or by a third circuit in a second circuitry at a different time. Additional examples of these components with respect to the machine 700 follow.
[0048] In alternative embodiments, the machine 700 may operate as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine 700 may operate in the capacity of a server machine, a client machine, or both in server-client network environments. In an example, the machine 700 may act as a peer machine in a peer-to-peer (P2P) (or other distributed) network environment. The machine 700 may be a personal computer (PC), a tablet
PC, a set-top box (STB), a personal digital assistant (PDA), a mobile telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein, such as cloud computing, software as a service (SaaS), or other computer cluster configurations.
[0049] The machine (e.g., computer system) 700 may include a hardware processor 702 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), a hardware processor core, or any combination thereof), a main memory 704, a static memory (e.g., memory or storage for firmware, microcode, a basic-input-output (BIOS), unified extensible firmware interface (UEFI), etc.) 706, and mass storage 708 (e.g., hard drive, tape drive, flash storage, or other block devices), some or all of which may communicate with each other via an interlink (e.g., bus) 730. The machine 700 may further include a display unit 710, an alphanumeric input device 712 (e.g., a keyboard), and a user interface (UI) navigation device 714 (e.g., a mouse). In an example, the display unit 710, input device 712 and UI navigation device 714 may be a touch screen display. The machine 700 may additionally include a storage device (e.g., drive unit) 708, a signal generation device 718 (e.g., a speaker), a network interface device 720, and one or more sensors 716, such as a global positioning system (GPS) sensor, compass, accelerometer, or other sensor. The machine 700 may include an output controller 728, such as a serial (e.g., universal serial bus (USB)), parallel, or other wired or wireless (e.g., infrared (IR), near field communication (NFC), etc.) connection to communicate with or control one or more peripheral devices (e.g., a printer, card reader, etc.).
[0050] Registers of the processor 702, the main memory 704, the static memory 706, or the mass storage 708 may be, or include, a machine readable medium 722 on which is stored one or more sets of data structures or instructions 724 (e.g., software) embodying or utilized by any one or more of the techniques or functions described herein. The instructions 724 may also reside, completely or at least partially, within any of registers of the processor 702, the main memory 704, the static memory 706,
or the mass storage 708 during execution thereof by the machine 700. In an example, one or any combination of the hardware processor 702, the main memory 704, the static memory 706, or the mass storage 708 may constitute the machine readable media 722. While the machine readable medium 722 is illustrated as a single medium, the term "machine readable medium" may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) configured to store the one or more instructions 724.
[0051] The term “machine readable medium” may include any medium that is capable of storing, encoding, or carrying instructions for execution by the machine 700 and that cause the machine 700 to perform any one or more of the techniques of the present disclosure, or that is capable of storing, encoding or carrying data structures used by or associated with such instructions. Non-limiting machine readable medium examples may include solid-state memories, optical media, magnetic media, and signals (e.g., radio frequency signals, other photon based signals, sound signals, etc.). In an example, a non-transitory machine readable medium comprises a machine readable medium with a plurality of particles having invariant (e.g., rest) mass, and thus are compositions of matter. Accordingly, non-transitory machine-readable media are machine readable media that do not include transitory propagating signals. Specific examples of non-transitory machine readable media may include: non-volatile memory, such as semiconductor memory devices (e.g., Electrically Programmable Read-Only Memory (EPROM),
Electrically Erasable Programmable Read-Only Memory (EEPROM)) and flash memory devices; magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
[0052] The instructions 724 may be further transmitted or received over a communications network 726 using a transmission medium via the network interface device 720 utilizing any one of a number of transfer protocols (e.g., frame relay, internet protocol (IP), transmission control protocol (TCP), user datagram protocol (UDP), hypertext transfer protocol (HTTP), etc.). Example communication networks may include a local area network (LAN), a wide area network (WAN), a packet data network (e.g., the Internet), mobile telephone networks (e.g., cellular networks), Plain Old Telephone (POTS) networks, and wireless data networks (e.g.,
Institute of Electrical and Electronics Engineers (IEEE) 802.11 family of standards known as Wi-Fi®), IEEE 802.15.4 family of standards, peer-to-peer (P2P) networks, among others. In an example, the network interface device 720 may include one or more physical jacks (e.g., Ethernet, coaxial, or phone jacks) or one or more antennas to connect to the communications network 726. In an example, the network interface device 720 may include a plurality of antennas to wirelessly communicate using at least one of single-input multiple-output (SIMO), multiple-input multiple-output (MIMO), or multiple-input single-output (MISO) techniques.
The term “transmission medium” shall be taken to include any intangible medium that is capable of storing, encoding or carrying instructions for execution by the machine 700, and includes digital or analog communications signals or other intangible medium to facilitate communication of such software. A transmission medium is a machine readable medium.
[0053] Non-Limiting Examples
[0054] Example 1 is a system for automatically controlling a displayed position of a document during a meeting, the system comprising: means for accessing descriptors that associate positions within an electronic document with one or more speech elements, the speech elements comprising a keyword or speech embedding; means for displaying the electronic document at a first position to a participant of the meeting on a display device; means for receiving an audio stream of the participant, the audio stream comprising speech data for the participant; means for identifying spoken words of the participant using the speech data for the participant; means for identifying a matched speech element of the one or more speech elements using the identified spoken words; means for identifying a second position within the electronic document using the matched speech element and the accessed descriptors; and means for displaying the electronic document at the second position to the participants on the display device in response to identifying the second position.
[0055] In Example 2, the subject matter of Example 1 includes, wherein the electronic document is a slideshow and wherein the positions within the electronic document are slides of the slideshow.
[0056] In Example 3, the subject matter of Examples 1-2 includes, wherein the means for accessing the descriptors comprises means for accessing metadata for the electronic document defining the associations between the electronic document and the one or more speech elements.
[0057] In Example 4, the subject matter of Example 3 includes, wherein the means for identifying the second position within the electronic document comprises means for obtaining the second position from the metadata using the matched speech element.
[0058] In Example 5, the subject matter of Examples 1-4 includes, wherein the means for identifying the spoken words of the participant comprises means for identifying the spoken words using a speech analysis model, wherein the speech data is input to the speech analysis model and the identified spoken words are output from the speech analysis model.
[0059] In Example 6, the subject matter of Examples 1-5 includes, wherein the one or more speech elements comprises a plurality of respective agenda items of an agenda of the meeting, and wherein the means for identifying the matched speech element comprises means for identifying a matched agenda item of the plurality of respective agenda items using the identified spoken words, and wherein identifying the second position comprises identifying the second position within the electronic document using the matched agenda item and the accessed descriptors.
[0060] In Example 7, the subject matter of Example 6 includes, means for generating the descriptors to associate the positions within the document with the plurality of agenda items prior to start of the meeting.
[0061] In Example 8, the subject matter of Examples 6-7 includes, wherein each item of the plurality of agenda items comprises description text, and wherein the means for identifying the matched agenda item comprises means for matching the spoken words with words or speech embeddings of the description text of the matched agenda item.
[0062] In Example 9, the subject matter of Examples 1-8 includes, means for generating the descriptors using a model that receives the electronic document as input and generates the descriptors to associate the positions within the electronic document with the one or more speech elements.
[0063] Example 10 is a method for automatically controlling a displayed position of a document during a meeting, the method comprising: accessing descriptors that associate positions within an electronic document with one or more speech elements, the speech elements comprising keywords or speech embeddings; displaying the electronic document at a first position to a participant of the meeting on a display device; receiving an audio stream of the participant, the audio stream comprising speech data for the participant; identifying spoken words of the participant using the speech data for the participant; identifying a matched speech element of the one or more speech elements using the identified spoken words; identifying a second position within the electronic document using the matched speech element and the accessed descriptors; and in response to identifying the second position, displaying the electronic document at the second position to the participants on the display device.
[0064] In Example 11, the subject matter of Example 10 includes, wherein the electronic document is a slideshow and wherein the positions within the electronic document are slides of the slideshow.
[0065] In Example 12, the subject matter of Examples 10-11 includes, wherein accessing the descriptors comprises accessing metadata for the electronic document defining the associations between the electronic document and the one or more speech elements.
[0066] In Example 13, the subject matter of Example 12 includes, wherein identifying the second position within the electronic document comprises obtaining the second position from the metadata using the matched speech element.
[0067] In Example 14, the subject matter of Examples 10-13 includes, wherein identifying the spoken words of the participant comprises identifying the spoken words using a speech analysis model, wherein the speech data is input to the speech analysis model and the identified spoken words are output from the speech analysis model.
[0068] In Example 15, the subject matter of Examples 10-14 includes, wherein the one or more speech elements comprises a plurality of respective agenda items of an agenda of the meeting, and wherein identifying the matched speech element comprises identifying a matched agenda item of the plurality of respective agenda items using the identified spoken words, and wherein identifying the second position comprises identifying the second position within the electronic document using the matched agenda item and the accessed descriptors.
[0069] In Example 16, the subject matter of Example 15 includes, generating the descriptors to associate the positions within the document with the plurality of agenda items prior to start of the meeting.
[0070] In Example 17, the subject matter of Examples 15-16 includes, wherein each item of the plurality of agenda items comprises description text, and wherein identifying the matched agenda item comprises matching the spoken words with words or speech embeddings of the description text of the matched agenda item.
[0071] In Example 18, the subject matter of Examples 10-17 includes, generating the descriptors using a model that receives the electronic document as input and generates the descriptors to associate the positions within the electronic document with the one or more speech elements.
[0072] Example 19 is a system for automatically controlling a displayed position of a document during a meeting, the system comprising: one or more hardware processors; one or more memory units storing instructions which, when executed, cause the one or more hardware processors to perform operations comprising: accessing descriptors that associate positions within an electronic document with one or more speech elements, the speech elements comprising a keyword or speech embedding; displaying the electronic document at a first position to a participant of the meeting on a display device; receiving an audio stream of the participant, the audio stream comprising speech data for the participant; identifying spoken words of the participant using the speech data for the participant; identifying a matched speech element of the one or more speech elements using the identified spoken words; identifying a second position within the electronic document using the matched speech element and the accessed descriptors; and displaying the electronic document at the second position to the participants on the display device in response to identifying the second position.
[0073] In Example 20, the subject matter of Example 19 includes, wherein the electronic document is a slideshow and wherein the positions within the electronic document are slides of the slideshow.
[0074] In Example 21, the subject matter of Examples 19-20 includes, wherein accessing the descriptors comprises accessing metadata for the electronic document defining the associations between the electronic document and the one or more speech elements.
[0075] In Example 22, the subject matter of Example 21 includes, wherein identifying the second position within the electronic document comprises obtaining the second position from the metadata using the matched speech element.
[0076] In Example 23, the subject matter of Examples 19-22 includes, wherein identifying the spoken words of the participant comprises identifying the spoken words using a speech analysis model, wherein the speech data is input to the speech analysis model and the identified spoken words are output from the speech analysis model.
[0077] In Example 24, the subject matter of Examples 19-23 includes, wherein the one or more speech elements comprises a plurality of respective agenda items of an agenda of the meeting, and wherein identifying the matched speech element comprises identifying a matched agenda item of the plurality of respective agenda items using the identified spoken words, and wherein identifying the second position comprises identifying the second position within the electronic document using the matched agenda item and the accessed descriptors.
[0078] In Example 25, the subject matter of Example 24 includes, generating the descriptors to associate the positions within the document with the plurality of agenda items prior to start of the meeting.
[0079] In Example 26, the subject matter of Examples 24-25 includes, wherein each item of the plurality of agenda items comprises description text, and wherein identifying the matched agenda item comprises matching the spoken words with words or speech embeddings of the description text of the matched agenda item.
[0080] In Example 27, the subject matter of Examples 19-26 includes, operations of generating the descriptors using a model that receives the electronic document as input and generates the descriptors to associate the positions within the electronic document with the one or more speech elements.
[0081] Example 28 is at least one machine-readable medium including instructions that, when executed by processing circuitry, cause the processing circuitry to perform operations to implement of any of Examples 1-27.
[0082] Example 29 is an apparatus comprising means to implement of any of
Examples 1-27.
[0083] Example 30 is a system to implement of any of Examples 1-27.
[0084] Example 31 is a method to implement of any of Examples 1-27.
[0085] The above description includes references to the accompanying drawings, which form a part of the detailed description. The drawings show, by way of illustration, specific embodiments in which the invention can be practiced. These embodiments are also referred to herein as “examples.” Such examples can include elements in addition to those shown or described. However, the present inventors also contemplate examples in which only those elements shown or described are provided. Moreover, the present inventors also contemplate examples using any combination or permutation of those elements shown or described (or one or more aspects thereof), either with respect to a particular example (or one or more aspects thereof), or with respect to other examples (or one or more aspects thereof) shown or described herein.
[0086] In this document, the terms “a” or “an” are used, as is common in patent documents, to include one or more than one, independent of any other instances or usages of “at least one” or “one or more.” In this document, the term “or” is used to refer to a nonexclusive or, such that “A or B” includes “A but not B,” “B but not A,” and “A and B,” unless otherwise indicated. In this document, the terms “including” and “in which” are used as the plain-English equivalents of the respective terms “comprising” and “wherein.” Also, in the following claims, the terms “including” and “comprising” are open-ended, that is, a system, device, article, composition, formulation, or process that includes elements in addition to those listed after such a term in a claim are still deemed to fall within the scope of that claim. Moreover, in the following claims, the terms “first,” “second,” and “third,” etc. are used merely as labels, and are not intended to impose numerical requirements on their objects.
[0087] The above description is intended to be illustrative, and not restrictive.
For example, the above-described examples (or one or more aspects thereof) may be used in combination with each other. Other embodiments can be used, such as by one of ordinary skill in the art upon reviewing the above description. The Abstract is provided to allow the reader to quickly ascertain the nature of the technical disclosure. It is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. Also, in the above Detailed
Description, various features may be grouped together to streamline the disclosure.
This should not be interpreted as intending that an unclaimed disclosed feature is essential to any claim. Rather, inventive subject matter may lie in less than all features of a particular disclosed embodiment. Thus, the following claims are hereby incorporated into the Detailed Description as examples or embodiments, with each claim standing on its own as a separate embodiment, and it is contemplated that such embodiments can be combined with each other in various combinations or permutations. The scope of the invention should be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
Claims (15)
1. A system for automatically controlling a displayed position of a document during a meeting, the system comprising: means for accessing descriptors that associate positions within an electronic document with one or more speech elements, the speech elements comprising a keyword or speech embedding; means for displaying the electronic document at a first position to a participant of the meeting on a display device; means for receiving an audio stream of the participant, the audio stream comprising speech data for the participant; means for identifying spoken words of the participant using the speech data for the participant; means for identifying a matched speech element of the one or more speech elements using the identified spoken words; means for identifying a second position within the electronic document using the matched speech element and the accessed descriptors; and means for displaying the electronic document at the second position to the participants on the display device in response to identifying the second position.
2. The system of claim 1, wherein the electronic document is a slideshow and wherein the positions within the electronic document are slides of the slideshow.
3. The system of claim 1 or claim 2, wherein the means for accessing the descriptors comprises means for accessing metadata for the electronic document defining the associations between the electronic document and the one or more speech elements.
4. The system of claim 3, wherein the means for identifying the second position within the electronic document comprises means for obtaining the second position from the metadata using the matched speech element.
5. The system of any preceding claim, wherein the means for identifying the spoken words of the participant comprises means for identifying the spoken words using a speech analysis model, wherein the speech data is input to the speech analysis model and the identified spoken words are output from the speech analysis model.
6. The system of any preceding claim, wherein the one or more speech elements comprises a plurality of respective agenda items of an agenda of the meeting, and wherein the means for identifying the matched speech element comprises means for identifying a matched agenda item of the plurality of respective agenda items using the identified spoken words, and wherein identifying the second position comprises identifying the second position within the electronic document using the matched agenda item and the accessed descriptors.
7. The system of claim 6, further comprising means for generating the descriptors to associate the positions within the document with the plurality of agenda items prior to start of the meeting.
8. The system of claim 6 or claim 7, wherein each item of the plurality of agenda items comprises description text, and wherein the means for identifying the matched agenda item comprises means for matching the spoken words with words or speech embeddings of the description text of the matched agenda item.
9. The system of any preceding claim, further comprising means for generating the descriptors using a model that receives the electronic document as input and generates the descriptors to associate the positions within the electronic document with the one or more speech elements.
10. A method for automatically controlling a displayed position of a document during a meeting, the method comprising: accessing descriptors that associate positions within an electronic document with one or more speech elements, the speech elements comprising keywords or speech embeddings; displaying the electronic document at a first position to a participant of the meeting on a display device; receiving an audio stream of the participant, the audio stream comprising speech data for the participant; identifying spoken words of the participant using the speech data for the participant; identifying a matched speech element of the one or more speech elements using the identified spoken words; identifying a second position within the electronic document using the matched speech element and the accessed descriptors; and in response to identifying the second position, displaying the electronic document at the second position to the participants on the display device.
11. The method of claim 10, wherein accessing the descriptors comprises accessing metadata for the electronic document defining the associations between the electronic document and the one or more speech elements.
12. The method of claim 11, wherein identifying the second position within the electronic document comprises obtaining the second position from the metadata using the matched speech element.
13. The method of any of claims 10 to 12, wherein identifying the spoken words of the participant comprises identifying the spoken words using a speech analysis model, wherein the speech data is input to the speech analysis model and the identified spoken words are output from the speech analysis model.
14. The method of any of claims 10 to 13, wherein the one or more speech elements comprises a plurality of respective agenda items of an agenda of the meeting, and wherein identifying the matched speech element comprises identifying a matched agenda item of the plurality of respective agenda items using the identified spoken words, and wherein identifying the second position comprises identifying the second position within the electronic document using the matched agenda item and the accessed descriptors, and wherein the method further comprises generating the descriptors to associate the positions within the document with the plurality of agenda items prior to start of the meeting.
15. The method of any of claims 10 to 14, further comprising generating the descriptors using a model that receives the electronic document as input and generates the descriptors to associate the positions within the electronic document with the one or more speech elements.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| LU102286A LU102286B1 (en) | 2020-12-15 | 2020-12-15 | Display control of electronic documents |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| LU102286B1 true LU102286B1 (en) | 2022-06-15 |
Family
ID=73835678
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| LU102286A LU102286B1 (en) | 2020-12-15 | 2020-12-15 | Display control of electronic documents |
Country Status (1)
| Country | Link |
|---|---|
| LU (1) | LU102286B1 (en) |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170287465A1 (en) * | 2016-03-31 | 2017-10-05 | Microsoft Technology Licensing, Llc | Speech Recognition and Text-to-Speech Learning System |
| US20180060028A1 (en) * | 2016-08-30 | 2018-03-01 | International Business Machines Corporation | Controlling navigation of a visual aid during a presentation |
- 2020-12-15: LU LU102286A patent/LU102286B1/en, active IP Right Grant
Patent Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20170287465A1 (en) * | 2016-03-31 | 2017-10-05 | Microsoft Technology Licensing, Llc | Speech Recognition and Text-to-Speech Learning System |
| US20180060028A1 (en) * | 2016-08-30 | 2018-03-01 | International Business Machines Corporation | Controlling navigation of a visual aid during a presentation |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11983637B2 (en) | Electronic meeting intelligence | |
| US20230402038A1 (en) | Computerized intelligent assistant for conferences | |
| US10268990B2 (en) | Electronic meeting intelligence | |
| US10915570B2 (en) | Personalized meeting summaries | |
| US10757148B2 (en) | Conducting electronic meetings over computer networks using interactive whiteboard appliances and mobile devices | |
| CN113170076B (en) | A method and system for communication sessions | |
| US10891436B2 (en) | Device and method for voice-driven ideation session management | |
| US10958457B1 (en) | Device control based on parsed meeting information | |
| US11228542B2 (en) | Systems and methods for communication channel recommendations using machine learning | |
| US12475887B2 (en) | Network-based communication session copilot | |
| US11012249B2 (en) | Content feature based video stream subscriptions | |
| CN116368785A (en) | Smart Query Cache Mechanism | |
| US12204810B2 (en) | Natural language markup for meetings | |
| CN116348895A (en) | Auto-registration and smart assignment of settings | |
| LU102286B1 (en) | Display control of electronic documents | |
| US20240171418A1 (en) | Information processing device and information processing method | |
| US20240428799A1 (en) | System and method for determining multi-party communication insights | |
| US20240340321A1 (en) | Web conferencing exit and post-exit content generation | |
| CN118098224A (en) | Screen sharing control method, device, equipment, medium and program product | |
| US12413436B2 (en) | Collaboration and cognitive analysis for hybrid work visual aid sessions | |
| US20240273155A1 (en) | Photo location destinations systems, methods, and computer readable media | |
| WO2024182147A1 (en) | Network-based communication session copilot | |
| CN120128674A (en) | Meeting minutes generation method, device, equipment, storage medium and program product | |
| KR20240156336A (en) | Method and system for automating checking of meeting agenda | |
| Dimmick et al. | A Probability Discounting Study of LDS Clients’ Preferences for Religious Matching with a Psychotherapist |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| FG | Patent granted |
Effective date: 20220615 |