WO2015172126A1 - Full duplex high quality audio/video communication over internet - Google Patents

Info

Publication number
WO2015172126A1
WO2015172126A1 PCT/US2015/030061 US2015030061W WO2015172126A1 WO 2015172126 A1 WO2015172126 A1 WO 2015172126A1 US 2015030061 W US2015030061 W US 2015030061W WO 2015172126 A1 WO2015172126 A1 WO 2015172126A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
location
video data
packets
server
Prior art date
Application number
PCT/US2015/030061
Other languages
French (fr)
Inventor
Michael J. MARESCA, Jr.
Original Assignee
High Speed Video
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by High Speed Video
Publication of WO2015172126A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21 Monitoring or handling of messages
    • H04L51/222 Monitoring or handling of messages using geographical location information, e.g. messages transmitted or received in proximity of a certain spot or area
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/75 Media network packet handling
    • H04L65/762 Media network packet handling at the source
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/75 Media network packet handling
    • H04L65/764 Media network packet handling at the destination

Definitions

  • This invention relates to the transmission of data over communication links and, in particular, to the transmission of live, full duplex, high frame rate, synchronized audio and video over the Internet, using PCs or the like as user terminals.
  • a high-quality, low cost video call or multi-party conferencing solution has not been available that is accessible to users utilizing their existing or inexpensively available broadband Internet connections, and equipment such as PCs and laptop computers that have connected or built-in webcams, microphones and speakers.
  • the Internet operates by transferring data packets from a source to one or more destinations.
  • the packets may be of equal or unequal length and contain information to indicate where they begin and end, how to assemble them at the destination, and source and destination information.
  • Packet based communications over shared resources like the internet can incur latency, that is, delays in transmission and processing, and can result in degraded quality of the received transmission. Such delays and degradation are undesirable in applications requiring a high data transmission rate such as full-motion, live, full duplex video transmission.
  • a system is needed to provide live, full-motion, full duplex transmission of high quality voice/video calls and multiparty conferences, in which users utilize their existing PCs and the like as communication terminals, and their existing broadband Internet and/or other existing computer networks such as Ethernet-based LANs and WANs as communication channels.
  • the herein disclosed systems and methods comprise facilitating peer to peer or managed multiparty communications between user terminals that have elements for capturing and transmitting, and for receiving, reassembling, and presenting, audio signals, video signals, or both (hereinafter "audio/video"), in real time, such as in audio/video calls or conferences.
  • audio/video: audio signals, video signals, or both
  • Embodiments facilitate simultaneous full duplex communication between two parties in a peer to peer mode, or between two or more user terminals as two-party calls, or as multiparty video conferences. Thereby, real-time audio/video communications and conferencing including interactivity is provided between two or more parties.
  • each user terminal is equipped to capture audio and video in real time, such as using a built-in or coupled webcam and microphone, and to convert the audio and video signals into a digital signal suitable for sending over the Internet.
  • the video from one or more of the user terminals may be or include that terminal's graphical user interface (GUI) "desktop".
  • GUI: graphical user interface
  • the digital signal is packetized, that is, divided into segments that are formed into data packets.
  • Each participating user terminal is also equipped to receive data packets and to convert the packets into a digital signal for processing into real-time audio and synchronized video, in which video is presented on a graphical display or the like, and audio is played on an integrated or connected speaker.
  • the transmitted and received audio/video signals are conveyed through the Internet in packets.
  • the packets may be of a predetermined length, which may be the same length for each packet.
  • the packets include source and destination information; information indicating the length of the packet; and information indicating how to decode the audio/video data.
  • the communications may be set up in a peer to peer mode between two user terminals, or may be set up in a managed communication mode between two or more user terminals, wherein the communications between all participating user terminals are handled by and pass through a server in a network operations center (NOC).
  • NOC: network operations center
  • the present invention comprises means for capturing video images and associated audio, packeting the same, transmitting them as a smooth continuous stream of video and audio data over the Internet, decoupling the data laden packets, and reassembling the data as video images with associated audio at the desired recipient location.
  • the present invention also permits a combination of the images from diverse locations, so that there is not only interactivity between a single sender and receiver, but, as well, with multiple parties.
  • the present invention is capable of functioning over a secure peer-to-peer backbone.
  • a user has at his or her location a means to capture video imagery and the associated audio and convert it to a digital signal, and a means to convert a received digital signal into a presentation of video imagery and the associated audio.
  • These means are generally well-known in the art and would include devices such as a microphone, a camera, a video/audio encoder/decoder, a monitor, and a speaker.
  • the means are either integral in or can be made available by employing a personal computer ("PC").
  • the transmitted or received video signal is transmitted to or received from the Internet in packets of predetermined, preferably equal length. Each packet is generally encoded with the following information: (1) information indicating the beginning and the end of the packet, (2) information indicating the length of the packet, (3) information indicating the algorithm used to encode the audio/video data, and (4) the encoded audio/video data itself.
  • the packet stream is routed to the recipient who has the apparatus, software or both designed in accordance with the present invention to capture the data and convert the packets to an audio/video signal which is in turn displayed on a video device with accompanying audio projected.
  • The transmission and reception of the packets is based on a first in/first out protocol: as packets are pulled out to be transmitted or received, they are immediately replaced with the next packets required to be transmitted or received. As a consequence, the sequence is maintained in a relatively uninterrupted manner. Although it is preferable to transmit and receive all data without any error, the system of the present invention need not transmit and receive 100% of the data, since an acceptable, indeed very high quality, video signal will be enabled even with a loss of some data from the stream.
  • the present invention thus provides for the transmission and reception of full motion, full duplex, live video data and accompanying audio data over the Internet or other computer network, with the concomitant benefit of permitting parties at remote locations to visually and audibly communicate with each other.
  • one of the parties to the connection is a television studio originating a video broadcast, the video communications received will be and can be transmitted and received between any two points served by the conventional telephone network, at a cost which does not deter widespread use.
  • the system creates a meeting room, a virtual location where the users of the system "meet.”
  • the NOC permits all users to have the same transmission rates of data and as more specifically described permits certain supplementary enhancements, such as featuring a larger image of the person then speaking.
  • a room system may also connect to the
  • a room system is a videoconferencing station that typically includes large monitors with a wide-angle camera and serves groups of people who meet in a room and conference with other groups at remote locations. If the room system has an Internet connection, it would connect to the videoconference system in a manner similar to that used by a personal computer. In such a case, if there was only one other user, the connection would be peer-to-peer, whereas if there were three or more total users, the connection would be made through a meeting room at the NOC. If the room system does not have an Internet connection, but instead has a private network, then it would connect to the NOC through a hardware bridge.
  • FIG. 1 is a schematic diagram showing full duplex peer to peer
  • FIG. 2 is a schematic diagram showing an embodiment of full duplex audio/video communications between a plurality of users wherein each user is communicatively coupled through the Internet to a network operations center which manages the data flow between the user terminals.
  • FIG. 3 is a schematic diagram showing an embodiment of the bidirectional transmission of video between two users, where each is using a personal computer via an Internet path.
  • FIG. 4 is a schematic diagram showing an embodiment of the bidirectional transmission of video between two users, where one is using a personal computer with an Internet connection and the other is using a room system with a private network.
  • FIG. 5 is a schematic diagram showing an embodiment of the bidirectional transmission of video between three or more users, where each user accesses a meeting room.
  • the present invention is a system for full duplex high quality audio/video communications using conventional PCs, laptops, or the like, as user terminals.
  • Each of the terminals has a broadband connection to the Internet.
  • the broadband connection may be through a private or public network to which a user terminal is connected, wherein that network is coupled to the Internet; or the broadband connection may be through a direct connection to the Internet using a conventional commonly available broadband data service, such as would be provided by a multisystem operator (MSO).
  • MSO: multisystem operator
  • FIG. 1 shows an implementation of a system 100 featuring bidirectional transmission of audio/video between two users, each using an Internet-connected personal computer (PC) as a user terminal, and the Internet as at least part of the communication path between them.
  • System 100 includes an Internet-connected server computer 120 disposed in a network operations center (NOC) 125, to which two user terminals 110, 115 are communicatively coupled.
  • the server may be or include a server computer or a plurality of interconnected server computers, preferably disposed at the NOC 125, each server containing at least tangible processor and storage devices and configured to perform one or more server functions that will be described.
  • the exemplary communication paths shown, including path 130 between server 120 and PC 110, path 140 between server 120 and PC 115, and path 150 directly between PC 110 and PC 115, are set up over network paths that include the Internet.
  • User terminals 110, 115 may each be a desktop computer, laptop, or the like (hereinafter "PC") with Internet access, a graphical display, an embedded or connected video camera such as a webcam, a microphone, and a speaker.
  • Each PC is running a web-enabled application such as a web browser, such as Google's Chrome browser (hereinafter “browser”).
  • the user (user "A") at PC 110 wants to establish a real time full duplex audio/video connection with the user (user "B") at PC 115. To do so, user A launches his browser and uses it to link to a web server function in server 120 (hereinafter, every server function may be referred to as a "function server 120", such as "web server 120") in NOC 125, for example by typing its name into the address bar of the browser.
  • the link may be a secure link, such as an SSL link.
  • User A logs onto the system, for example with a username and password.
  • other or additional logon security measures may be included in the logon process, in any well known manner.
  • Registration server 120 keeps track of which PCs are logged onto the system, and notes that A has logged on, and stores A's network address and any other information needed to include A in an audio/video communication session. Thereafter, when A attempts to call B, the system first checks to see whether B's PC is also currently online and logged in to the system by checking with registration server 120.
  • the system retrieves B's address from the registration server, and uses it to facilitate establishing a communication session between A and B over path 150, which may be secured by SSL or the like.
  • A and B can thereafter communicate directly with each other in a peer to peer configuration.
  • the system tries to locate B within A's local network, if any. If B is found there, the system obtains B's address directly from B, and makes the connection. A and B can thereafter engage in peer to peer communications.
  • In an embodiment, if B is not found on A's local network, the system then connects to one or more other network servers, such as non-local servers on a WAN that A's LAN is connected to. The system then looks for B on the extended network. If B is found, the system obtains B's address directly from B, or from a LAN or WAN server to which B is communicatively coupled. The system then facilitates a connection between A and B on the WAN, and again A and B can communicate in a peer to peer mode.
  • If B still cannot be found, it may be because a needed port is not open on a network to which B is communicatively coupled, for example a port on a gateway to that network.
  • the system can look for one or more additional servers that are not local to A, that is, not on A's LAN, but are accessible to A, for example on the WAN that A's LAN is part of, in search of servers that have an open port available to other networks to which B may be coupled. If B is located, the system then facilitates a connection between A and B, after which A and B can communicate in a peer to peer mode.
  • A and B may be associated with different port numbers to establish communications with each other with the help of the system. The system may facilitate the connection between A and B by telling A, or a gateway server to A's network, or both, to use the open port it found into B's network.
  • the system may connect A to a media server 120 of the system and prompt user A to explicitly invite B to join a call.
  • User A may then reach out to the desired remote party B, such as by telephone, text message, email, or the like, and request that the remote user B connect a user terminal to the media server.
  • the system can set up communications between A and B through media server 120 over network paths 130, 140.
  • the media server may be a multi-point media server (MPMS) 120 able to perform processing necessary for multi-party calls, such as by setting up a respective communication channel between the MPMS and each of a plurality of parties to a call.
  • MPMS: multi-point media server
  • parties A, B, C, D are all communicatively coupled to the MPMS server 120 through network paths 130, 140, 160, and 170, respectively, each of which paths includes at least a portion that is implemented over the Internet.
  • resources may include one or more of hardware, software, and bandwidth. In embodiments, some or all such additionally required resources may be, and/or be controlled by, interconnected equipment disposed in NOC 125.
  • the system requires four servers or server-based
  • the functions may be performed in a plurality of interconnected servers 120, or on a single server 120 providing multiple functions, or on various operatively coupled but not co-located computer resources such as processors and data storage devices.
  • the server(s) may reside on a back end disposed within an NOC 125.
  • server functionality is implemented using at least one tangible computer-readable storage device storing computer code, operatively coupled to at least one tangible computing processor that retrieves and executes the code to cause operatively connected system resources to cooperate to perform the desired function.
  • the four servers/functions include the following:
  • a registration server that monitors when each user terminal logs into or out of the system.
  • the registration server keeps track of which user terminals are logged onto the system in real time, and records the corresponding network address of each respective logged on user terminal. This function requires very little processing power and very little bandwidth. If both parties to a call are initially logged onto the system, a communication connection between them can be facilitated using only a modest allocation of system processing power and bandwidth.
  • a port scanning server which seeks and reaches out to target computers and may also look for open ports on one or more parties' respective network. This server then relays this information to the system software, telling it which ports to use to set up or facilitate a communication connection. If there are only two parties to a call, the communication data (Audio and Video Data) may then travel directly between the parties in a peer to peer mode.
  • the port scanning server function requires more processing power and network resources than the registration server function, and accordingly, if this function is used to attempt to set up the communication connection, additional processing power and bandwidth resources are allocated to the server function at the NOC.
  • a media server that acts as a bridge coupled to both the local user
  • the media server and corresponding connection method requires a still greater use of computing and bandwidth resources and accordingly, if this function is used to attempt to set up the communication connection, still further processing power and bandwidth resources are allocated to the server function at the NOC.
  • the Audio and Video Data is then routed through the media server, which may be disposed at the NOC.
  • a Multipoint Media Server similar in nature to the media server but containing or requiring even more computing resources, such as memory and processing power, and more network resources, such as bandwidth, to facilitate multi-party communications. Accordingly, the needed resources are allocated to the MPMS function at the NOC, and the Audio and Video Data for a multi-party audio/video conference is routed through the media server to and from each party.
  • the local user may also be shown on the local display, for example in a separate frame, or as a reduced size picture-in-picture (PiP) overlaid on a corner of the frame in which the remote user is shown. If so, in an embodiment, the PiP can be partially transparent so that the remote scene can be dimly seen through it.
  • one or more frames showing one or more remote users may be embedded within the user interface of an application running on the local user's terminal.
  • Such application may be a standard web browser, or it may be another web-enabled application.
  • Such a web-enabled application may execute on the local user terminal, or it may execute on a remote server whereon the user interface is generated, or code for the user interface such as HTML code is generated, and conveyed from the server to the local user terminal for presentation on its display.
  • one or more frames may be generated as individually configurable free floating windows overlaid on the local machine's graphical user interface (GUI) "desktop", or the like.
  • one or more of the frames may include real-time video of a remote user's desktop instead of or in addition to the remote user's webcam video; or, one frame may include the real-time video of the remote user's desktop, and another frame may include the real-time video of the same remote user's webcam video.
  • Beneficial and novel use cases of the herein disclosed systems and methods are many and varied.
  • television viewers could interact with the host of a program, as well as with other viewers.
  • New television broadcast programming formats may be implemented using the system.
  • news providers could create video chat rooms in which users could directly provide and disseminate real-time video of newsworthy events, such as from a smart phone or the like, and the news provider could report on and solicit immediate viewer reaction to the news as it occurs.
  • News organizations could also link their networks of affiliates together to cover news as it happens using their combined resources.
  • programs may provide advertising opportunities for marketing products or services that are enhanced by the ability of viewers to interact with an advertiser spokesperson and/or with other viewers.
  • Inexpensive video conferencing, particularly with the security available with a secure client peer-to-peer connection, is compatible with the needs of many industries.
  • the financial services industry, which is concerned about the security of transmitting client information over the Internet, can use the system to communicate securely with clients.
  • the present practice of business and financial community conference calls reporting to the investment community can be greatly enhanced by providing audio/video interactivity among participants.
  • Novel advantages in the medical field of readily accessible real time audio/video communication and interactivity for patient diagnosis, monitoring, and other services between care givers, patients, and/or support staff may also be realized using the system.
  • benefits can be realized in situations ranging from video tutoring to "wide area" distributed virtual classrooms, in which students at a plurality of locations are able to maintain a personal, individual audio/visual interaction with the instructor.
  • Beneficiaries of the herein disclosed systems and methods include individuals who can use their existing computers and broadband Internet connections to communicate with other persons or groups, in high-quality real time audio/video calls and conferences.
  • the present disclosure describes an advanced managed computer- based system or platform for real-time video communications.
  • the system can connect two or more user terminals on demand, to provide secure, full featured well synchronized audio/video calls, multiparty conferences, and communications services.
  • the user terminals may consist of or include inexpensive or legacy equipment that individual and business users of all sizes are likely already in possession of, such as PCs and laptops with integrated or operatively coupled webcams, microphones, and speakers.
  • legacy network communications including existing LANs, WANs, and the Internet, provided that the available point to point network bandwidth is sufficient to support the requirements of the system.
  • a common web browser application such as Google Chrome for example, can be utilized as a user interface with one or more frames disposed within the browser interface, and configured to present within each frame audio/video of a different party to the communication.
  • In general, commonly available residential-grade "broadband" Internet connections are easily able to support full motion, high frame rate (e.g., 30-60 frames per second), low latency, well-synchronized sound and video in a large, easily viewable frame presented in a standard web browser interface on a typical computer display.
  • sophisticated two-party audio/video calls and multi-user audio/video conferencing solutions can be provided using only inexpensive or legacy equipment, existing private and public networks, and broadband connections to the Internet, without requiring any new capital expenditures.
  • the present invention is a system for duplex enhanced quality bidirectional video transmission over an Internet backbone.
  • An improvement of the duplex enhanced quality video transmission system of the present invention is better performance while eliminating the need for specialized hardware.
  • FIG. 3 shows an implementation of a system 100 featuring bidirectional transmission of video between two users, with each using a personal computer via an Internet path.
  • the system 100 includes: personal computers 110 and 115; a network access server 130; a network operation center ("NOC") server 160; and an Internet network 150.
  • Personal computer 110 may be a desktop computer, laptop, workstation or router, that is capable of connecting into the network access server 130 to establish a session 120.
  • Personal computer 110 incorporates secure client desktop software containing a computer network authentication protocol employing strong encryption, preferably IPsec.
  • IPsec Internet security protocol
  • This Internet security protocol allows for cryptographic key establishment and authenticating and/or encrypting each IP packet in a data stream.
  • IPsec or other like protocol is preferable because it functions at the network layer, which gives it more flexibility than many other security protocols in common use, such as SSH,
  • the secure client connects to a website hosted by network access server 130, where the user's identity is confirmed.
  • the network access server 130 is a computer, or a group of hardware or software components or processes that execute in one or more computer systems.
  • the secure client and network access server 130 then create an encrypted network tunnel from the user's computer to a Network Operation Center ("NOC") server 160, via a private Internet backbone 150.
  • This preferential Internet backbone 150 is a preferential route supplied by arrangement with a data network provider, using fiberoptic lines that are reserved to carry specially routed traffic, thus providing for greater available bandwidth for the bidirectional transmission of video between two or more users of the system.
  • the network access server 130 controls remote access to the preferential
  • the NOC server 160 is a computer, or a group of hardware or software
  • the NOC server 160 performs authorization and authentication functions.
  • the NOC server 160 has a directory established for each user, containing identification and password information and a list of approved users to whom each user may connect.
  • the NOC server 160 utilizes the password forwarded by the network access server 130 to perform
  • the NOC server 160 presents the user with the list of approved users to whom he can connect.
  • the NOC server 160 also shows whether that user is online or not.
  • the NOC server locates the initiating user as well as the addressee.
  • the user at personal computer 110 wishes to communicate with the user at personal computer 115, but that user is not online.
  • the user at personal computer 110 needs to contact the user at personal computer 115, via telephone, e-mail, text message, etc., and request that he sign into the system.
  • the user at personal computer 115 goes through the identical process of connecting into the network access server 130 so as to establish his own session 125, and likewise being connected to the NOC server 160 via a preferential Internet backbone 150, along route 175.
  • the user at personal computer 110 will be able to request a bidirectional video session with personal computer 115.
  • the user at personal computer 115 must manually accept the bidirectional video session, whereas in a second embodiment, personal computer 115 may be set to an auto-answer mode, where the bidirectional video session will be established upon the request from personal computer 110.
  • the NOC server 160 will then communicate instructions to personal
  • a user may use a room system, which is a videoconferencing station that typically includes large monitors with a wide-angle camera and serves groups of people who meet in a room and conference with other groups at remote locations. If the room system has an Internet connection, it would interface to the videoconferencing system in the manner of Personal Computer 115.
  • FIG. 4 shows an implementation in which one of the users is using a room system 220 which instead of having an Internet connection has a private network 230.
  • the privately networked room system 220 connects to the NOC server 160 through a hardware bridge 210.
  • the room system 220 communicates with the NOC server 160 along route 240, allowing for the NOC server 160 to perform the authorization and authentication functions.
  • the bridge 210 also serves as a conduit for the video signals, as they travel along path 180 between personal computer 110 and the room system 220. As path 180 is routed partially on the preferential Internet backbone 150 and partially on the private network 230, a high bandwidth is available, resulting in excellent image quality and reduced latency.
  • FIG. 5 shows another implementation, in which three or more users are participating in a videoconference.
  • the users log into a meeting room 360 which is provided at the NOC server 160.
  • the meeting room 360 multiplexes each incoming video signal, and sends a copy to the other participants.
  • Each participant will see each other participant in the video conference call, providing for enhanced security, as no one may monitor a video conference unless they are a participant and their presence is seen by all other participants.
  • One or more of the participants in a videoconference with three or more participants may be employing a room system with a private network that interfaces to the invention through a hardware bridge located at the NOC server, as previously described and shown in FIG. 4.
  • the secure client can be tailored for a particular application or industry, such as having the video only take up part of the screen, with the remainder of the screen dedicated to another task, such as displaying a user-completed form.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

A system for facilitating full duplex real time well synchronized audio/video communications over the Internet using low cost or legacy computer equipment as user terminals, and low cost or legacy broadband Internet access. The herein disclosed systems and methods comprise facilitating peer to peer or managed multiparty communications between user terminals that have elements for capturing and transmitting, and for receiving, reassembling, and presenting, audio signals, video signals, or both (hereinafter "audio/video"), in real time, such as in audio/video calls or conferences.

Description

FULL DUPLEX HIGH QUALITY AUDIO/VIDEO COMMUNICATION OVER INTERNET
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] The present invention claims priority to U.S. Provisional Patent Application Serial No. 61/991,101, titled FULL DUPLEX HIGH QUALITY AUDIO/VIDEO COMMUNICATION OVER INTERNET, filed May 9, 2014, the entirety of which is incorporated by reference herein.
BACKGROUND
Field of the Invention
[0002] This invention relates to the transmission of data over communication links and, in particular, to the transmission of live, full duplex, high frame rate, synchronized audio and video over the Internet, using PCs or the like as user terminals.
Description of the Prior Art
[0003] The use of communications networks to transmit live video and audio is well known. However, prior art systems have typically required expensive specialized or dedicated equipment, or have had limitations such as annoyingly noticeable transmission delays, poor synchronization between live audio and corresponding live video, poor video resolution, slow frame rate resulting in video that appears choppy, or a combination of these. Smooth live full motion video call communications, such as two party or multiparty video conferences, require the delivery of a substantial amount of data in a relatively uninterrupted stream, which has proven difficult to accomplish using commonly available legacy or low-cost PCs and networks with commonly available broadband Internet access. Even recently upgraded data networks of multisystem operators (MSOs) have proven insufficient for prior art systems to deliver desirably smooth video calls and video conferencing.
[0004] A high-quality, low cost video call or multi-party conferencing solution has not been available that is accessible to users utilizing their existing or inexpensively available broadband Internet connections, and equipment such as PCs and laptop computers that have connected or built-in webcams, microphones and speakers. As is well known, the Internet operates by transferring data packets from a source to one or more destinations. The packets may be of equal or unequal length and contain information to indicate where they begin and end, how to assemble them at the destination, and source and destination information. Packet based communications over shared resources like the Internet can incur latency, that is, delays in transmission and processing, and can result in degraded quality of the received transmission. Such delays and degradation are undesirable in applications requiring a high data transmission rate such as full-motion, live, full duplex video transmission. In the prior art, bidirectional video communication using standard residential and small business broadband connections and legacy user terminals as described previously has suffered from one or more of severe latency, poor image quality, low frame rate, and inaccurate synchronization between audio and video.
[0005] A system is needed to provide live, full-motion, full duplex transmission of high quality voice/video calls and multiparty conferences, in which users utilize their existing PCs and the like as communication terminals, and their existing broadband Internet and/or other existing computer networks such as Ethernet-based LANs and WANs as communication channels. The herein disclosed systems and methods comprise facilitating peer to peer or managed multiparty communications between user terminals that have elements for capturing and transmitting, and for receiving, reassembling, and presenting, audio signals, video signals, or both (hereinafter "audio/video"), in real time, such as in audio/video calls or conferences. The communications are conveyed over the Internet from senders to receivers as streams of audio/video data packets, and the system provides live, real time, high quality, high frame rate, well synchronized audio/video communications. Embodiments facilitate simultaneous full duplex communication between two parties in a peer to peer mode, or between two or more user terminals as two-party calls, or as multiparty video conferences. Thereby, real-time audio/video communications and conferencing including interactivity is provided between two or more parties.
Preferably each user terminal is equipped to capture audio and video in real time, such as using a built-in or coupled webcam and microphone, and to convert the audio and video signals into a digital signal suitable for sending over the Internet. In an embodiment, the video from one or more of the user terminals may be or include that terminal's graphical user interface (GUI) "desktop". The digital signal is packetized, that is, divided into segments that are formed into data packets. Each participating user terminal is also equipped to receive data packets and to convert the packets into a digital signal for processing into real-time audio and synchronized video, in which video is presented on a graphical display or the like, and audio is played on an integrated or connected speaker. Thus, the transmitted and received audio/video signals are conveyed through the Internet in packets. The packets may be of a predetermined length, which may be the same length for each packet. In addition to the encoded audio/video data, the packets include source and destination information; information indicating the length of the packet; and information indicating how to decode the audio/video data. The communications may be set up in a peer to peer mode between two user terminals, or may be set up in a managed communication mode between two or more user terminals, wherein the communications between all participating user terminals are handled by and pass through a server in a network operations center (NOC).
[0008] The present invention comprises means for capturing video images and associated audio, packetizing the same, transmitting them as a smooth continuous stream of video and audio data over the Internet, decoupling the data laden packets, and reassembling the data as video images with associated audio at the desired recipient location. In an alternative embodiment, the present invention also permits a combination of the images from diverse locations, so that there is interactivity not only between a single sender and receiver but also among multiple parties.
[0009] To accomplish the duplex enhanced quality transmission of the video and audio data between source(s) and destination(s), the present invention is capable of functioning over a secure peer-to-peer backbone.
[0010] In its simplest form, a user has at his or her location a means to capture video imagery and the associated audio and convert it to a digital signal, and a means to convert a received digital signal into a presentation of video imagery and the associated audio. These means are generally well-known in the art and would include devices such as a microphone, a camera, a video/audio encoder/decoder, a monitor, and a speaker. In general the means are either integral in or can be made available by employing a personal computer ("PC"). In accordance with the present invention, the transmitted or received video signal is transmitted to or received from the Internet in packets of predetermined, preferably equal length. Each packet is generally encoded with the following information: (1) information indicating the beginning and the end of the packet, (2) information indicating the length of the packet, (3) information indicating the algorithm used to encode the audio/video data, and (4) the encoded audio/video data itself.
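For illustration only, the four packet fields enumerated above could be laid out as in the following sketch. The marker bytes, field widths, and the integer codec identifier are assumptions made for this example; the disclosure does not specify a wire format.

```python
import struct
from typing import Tuple

# Hypothetical framing values; the disclosure does not define a wire format.
START_MARKER = b"\xAA\x55"        # (1) marks the beginning of a packet
END_MARKER = b"\x55\xAA"          # (1) marks the end of a packet
HEADER = struct.Struct("!2sHB")   # start marker, (2) payload length, (3) codec id


def build_packet(codec_id: int, payload: bytes) -> bytes:
    """Wrap (4) the encoded audio/video data with markers, length, and codec id."""
    return HEADER.pack(START_MARKER, len(payload), codec_id) + payload + END_MARKER


def parse_packet(packet: bytes) -> Tuple[int, bytes]:
    """Return (codec_id, payload); raise ValueError if the packet is malformed."""
    if len(packet) < HEADER.size + len(END_MARKER):
        raise ValueError("packet too short")
    start, length, codec_id = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    trailer = packet[HEADER.size + length:]
    if start != START_MARKER or len(payload) != length or trailer != END_MARKER:
        raise ValueError("malformed packet")
    return codec_id, payload
```

Because the length field is carried in the header, a receiver in this sketch can validate a packet without scanning the payload for the end marker.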
[0011] Thus, even before the audio/video data leaves the location of the sender, it has been transformed into packets to be transmitted over the Internet or other computer network, and the audio/video data remains in packet form until it arrives at the location of the receiver.
[0012] The packet stream is routed to the recipient who has the apparatus, software or both designed in accordance with the present invention to capture the data and convert the packets to an audio/video signal which is in turn displayed on a video device with accompanying audio projected. This generally is a PC.
[0013] Since the transmission and reception of the packets is based on a first in/first out protocol, as packets are pulled out to be transmitted or received, they are immediately replaced with the next packets required to be transmitted or received. As a consequence, the sequence is maintained in a relatively uninterrupted manner. Although it is indeed preferable to transmit and receive all data without any error, the system of the present invention need not transmit and receive 100% of the data since an acceptable, indeed very high quality video signal will be enabled even with a loss of some data from the stream of data.
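A minimal sketch of such a first in/first out buffer follows. The class name, the fixed capacity, and the policy of discarding the oldest packet when the buffer is full are assumptions for illustration; the disclosure states only that the stream is FIFO and that some data loss is tolerable.

```python
from collections import deque
from typing import Optional


class FifoBuffer:
    """Bounded first-in/first-out packet buffer (capacity is an assumed parameter)."""

    def __init__(self, capacity: int) -> None:
        self._queue = deque()
        self._capacity = capacity
        self.dropped = 0  # count of packets discarded to keep the stream live

    def push(self, packet: bytes) -> None:
        # Assumed policy: when full, discard the oldest packet rather than stall.
        if len(self._queue) >= self._capacity:
            self._queue.popleft()
            self.dropped += 1
        self._queue.append(packet)

    def pull(self) -> Optional[bytes]:
        """Return the oldest buffered packet, or None if the buffer is empty."""
        return self._queue.popleft() if self._queue else None
```

Packets leave the buffer in the order they entered, so sequence is preserved even when some packets are lost.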
[0014] The present invention thus provides for the transmission and reception of full motion, full duplex, live video data and accompanying audio data over the Internet or other computer network, with the concomitant benefit of permitting parties at remote locations to visually and audibly communicate with each other. As well, if one of the parties to the connection is a television studio originating a video broadcast, the video communications received will be and can be transmitted and received between any two points served by the conventional telephone network, at a cost which does not deter widespread use.
[0015] This is accomplished through the use of a preferential Internet backbone, a preferential route supplied by arrangement with a data network provider, using fiberoptic lines that are carrying reduced data traffic. This provides for great bandwidth for the bidirectional transmission of video between two or more users.
[0016] When an authorized user logs on, a Network Operation Center ("NOC") verifies both the user and the other users authorized to receive video from and send video to each other. The NOC provides applicable addresses of other users to which each user may connect. If there are only two users, the system will place them into a peer-to-peer connection, for enhanced speed, reliability, and security.
[0017] In an additional embodiment, where more than two users are to be connected, the system creates a meeting room, a virtual location where the users of the system "meet." By use of a multiplexing system the NOC permits all users to have the same transmission rates of data and, as more specifically described, permits certain supplementary enhancements, such as featuring a larger image of the person then speaking.
[0018] In an additional embodiment, a room system may also connect to the videoconference system. A room system is a videoconferencing station that typically includes large monitors with a wide-angle camera and serves groups of people who meet in a room and conference with other groups at remote locations. If the room system has an Internet connection, it would connect to the videoconference system in a manner similar to that used by a personal computer. In such a case, if there was only one other user, the connection would be peer-to-peer, whereas if there were three or more total users, the connection would be made through a meeting room at the NOC. If the room system does not have an Internet connection, but instead has a private network, then it would connect to the NOC through a hardware bridge.
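The choice among peer-to-peer, meeting room, and hardware bridge described in the preceding paragraphs can be summarized in a short sketch. The function name, parameter names, and returned labels are hypothetical and serve only to make the decision rule explicit.

```python
def choose_route(total_users: int, terminal_has_internet: bool = True) -> str:
    """Pick a connection route for one terminal joining a session.

    Labels are illustrative; they are not terms defined by the disclosure.
    """
    if total_users < 2:
        raise ValueError("a session needs at least two users")
    if not terminal_has_internet:
        # A privately networked room system reaches the NOC through a bridge.
        return "hardware-bridge"
    # Two users connect directly; three or more meet in a room at the NOC.
    return "peer-to-peer" if total_users == 2 else "meeting-room"
```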
[0019] These and other objects and advantages of the present invention will become more apparent to those of ordinary skill in the art upon consideration of the attached drawings and the following description of the preferred embodiments which are meant by way of illustration and example only, but are not to be construed as in any way limiting the invention disclosed and claimed herein.
[0020] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory, and are intended to provide further explanation of the invention as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] The drawings illustrate disclosed embodiments and/or aspects and, together with the description, serve to explain the principles of the invention, the scope of which is determined by the claims.
[0022] FIG. 1 is a schematic diagram showing full duplex peer to peer communication of synchronized real time audio and video between two users communicating in a peer to peer mode, wherein each is using a conventional personal computer coupled to the Internet.
[0023] FIG. 2 is a schematic diagram showing an embodiment of full duplex audio/video communications between a plurality of users, wherein each user is communicatively coupled through the Internet to a network operations center which manages the data flow between the user terminals.
[0024] FIG. 3 is a schematic diagram showing an embodiment of the bidirectional transmission of video between two users, where each is using a personal computer via an Internet path.
[0025] FIG. 4 is a schematic diagram showing an embodiment of the bidirectional transmission of video between two users, where one is using a personal computer with an Internet connection and the other is using a room system with a private network.
[0026] FIG. 5 is a schematic diagram showing an embodiment of the bidirectional transmission of video between three or more users, where each user accesses a meeting room.
DETAILED DESCRIPTION
[0027] The present invention is a system for full duplex high quality audio/video communications using conventional PCs, laptops, or the like, as user terminals. Each of the terminals has a broadband connection to the Internet. The broadband connection may be through a private or public network to which a user terminal is connected, wherein that network is coupled to the Internet; or the broadband connection may be through a direct connection to the Internet using a conventional commonly available broadband data service, such as would be provided by a multisystem operator (MSO).
[0028] Figure 1 shows an implementation of a system 100 featuring bidirectional transmission of audio/video between two users, each using an Internet-connected personal computer (PC) as a user terminal, and the Internet as at least part of the communication path between them. System 100 includes an Internet-connected server computer 120 disposed in a network operations center (NOC) 125, to which two user terminals 110, 115 are communicatively coupled. The server may be or include a server computer or a plurality of interconnected server computers, preferably disposed at the NOC 125, each server containing at least tangible processor and storage devices and configured to perform one or more server functions that will be described. The exemplary communication paths shown, including path 130 between server 120 and PC 110, path 140 between server 120 and PC 115, and path 150 directly between PC 110 and PC 115, are set up over network paths that include the Internet.
[0029] User terminals 110, 115 may each be a desktop computer, laptop or the like (hereinafter "PC"), with Internet access, a graphical display, an embedded or connected video camera such as a webcam, a microphone, and a speaker. Each PC is running a web-enabled application such as a web browser, for example Google's Chrome browser (hereinafter "browser"). The user (user "A") at PC 110 wants to establish a real time full duplex audio/video connection with the user (user "B") at PC 115. To do so, user A launches his browser and uses it to link to a web server function in server 120 (hereinafter, every server function may be referred to as a "function server 120", such as "web server 120") in NOC 125, for example by typing its name into the address bar of the browser. The link may be a secure link, such as an SSL link. User A logs onto the system, for example with a username and password. In an embodiment, other or additional logon security measures may be included in the logon process, in any well known manner. Registration server 120 keeps track of which PCs are logged onto the system, notes that A has logged on, and stores A's network address and any other information needed to include A in an audio/video communication session. Thereafter, when A attempts to call B, the system first checks to see whether B's PC is also currently online and logged in to the system by checking with registration server 120. If B is online and is logged in to the system, the system retrieves B's address from the registration server, and uses it to facilitate establishing a communication session between A and B over path 150, which may be secured by SSL or the like. A and B can thereafter communicate directly with each other in a peer to peer configuration.
[0030] In an embodiment, if B is not logged in to the system but may nevertheless be online, the system tries to locate B within A's local network, if any. If B is found there, the system obtains B's address directly from B, and makes the connection. A and B can thereafter engage in peer to peer communications.
[0031] In an embodiment, if B is not found on A's local network, the system then connects to one or more other network servers, such as non-local servers on a WAN that A's LAN is connected to. The system then looks for B on the extended network. If B is found, the system obtains B's address directly from B, or from a LAN or WAN server to which B is communicatively coupled. The system then facilitates a connection between A and B on the WAN, and again A and B can communicate in a peer to peer mode.
[0032] If B still cannot be found, it may be because a needed port is not open on a network to which B is communicatively coupled, for example a port on a gateway to that network. In an embodiment, the system can look for one or more additional servers that are not local to A, that is, not on A's LAN, but are accessible to A, for example on the WAN that A's LAN is part of, in search of servers that have an open port available to other networks to which B may be coupled. If B is located, the system then facilitates a connection between A and B, after which A and B can communicate in a peer to peer mode. In an embodiment, A and B may be associated with different port numbers to establish communications with each other with the help of the system. The system may facilitate the connection between A and B by telling A, or a gateway server to A's network, or both, to use the open port it found into B's network.
[0033] If all of the foregoing procedures fail, the system may connect A to a media server 120 of the system and prompt user A to explicitly invite B to join a call. User A may then reach out to the desired remote party B, such as by telephone, text message, email, or the like, and request that the remote user B connect a user terminal to the media server. Thereafter, the system can set up communications between A and B through media server 120 over network paths 130, 140.
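The escalating search for B described in the foregoing paragraphs can be sketched as an ordered list of lookup stages with the media server as the last resort. The function name, stage labels, and the idea of passing each stage as a callable are assumptions for illustration, not part of the disclosure.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each locator takes the callee's identifier and returns an address or None.
Locator = Callable[[str], Optional[str]]


def locate_peer(
    callee: str, locators: Sequence[Tuple[str, Locator]]
) -> Tuple[str, Optional[str]]:
    """Try each stage in order; fall back to relaying through the media server."""
    for stage, locate in locators:
        address = locate(callee)
        if address is not None:
            return stage, address      # a peer to peer connection can be set up
    return "media-server", None        # relay the call; invite the callee out of band
```

A caller would supply the stages in the order given in the description, for example registration lookup, local network search, extended network search, and port scanning.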
[0034] Referring now to figure 2, in an embodiment the media server may be a multi-point media server (MPMS) 120 able to perform processing necessary for multi-party calls, such as by setting up a respective communication channel between the MPMS and each of a plurality of parties to a call. Thereby, the system sets up an audio/video conference in which any party can communicate in real time with any other party. As shown, parties A, B, C, D are all communicatively coupled to the MPMS server 120 through network paths 130, 140, 160, and 170, respectively, each of which paths includes at least a portion that is implemented over the Internet.
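As a rough sketch of the multi-party relay, the following toy class keeps one outbound queue per party and forwards each incoming packet to every party other than its sender. The class and method names are hypothetical, and real channels are replaced here by in-memory lists.

```python
from typing import Dict, List


class MultipointRelay:
    """Toy stand-in for the MPMS: one outbound queue per connected party."""

    def __init__(self) -> None:
        self.outboxes: Dict[str, List[bytes]] = {}

    def join(self, party: str) -> None:
        # Set up a respective channel (here, a list) for the joining party.
        self.outboxes.setdefault(party, [])

    def relay(self, sender: str, packet: bytes) -> None:
        # Forward a copy of the packet to every party except the sender.
        if sender not in self.outboxes:
            raise KeyError(f"{sender} has not joined the call")
        for party, outbox in self.outboxes.items():
            if party != sender:
                outbox.append(packet)
```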
[0035] In the exemplary embodiments, as the system proceeds through each of the foregoing procedures, progressively more resources may be required to complete a call. The resources may include one or more of hardware, software, and bandwidth. In embodiments, some or all such additionally required resources may be, and/or be controlled by, interconnected equipment disposed in NOC 125.
[0036] In an embodiment, the system requires four servers or server-based functions. The functions may be performed in a plurality of interconnected servers 120, or on a single server 120 providing multiple functions, or on various operatively coupled but not co-located computer resources such as processors and data storage devices. The server(s) may reside on a back end disposed within an NOC 125. In general, server functionality is implemented using at least one tangible computer-readable storage device storing computer code, operatively coupled to at least one tangible computing processor that retrieves and executes the code to cause operatively connected system resources to cooperate to perform the desired function. The four servers/functions include the following:
[0037] 1) A registration server, that monitors when each user terminal logs into or out of the system. The registration server keeps track of which user terminals are logged onto the system in real time, and records the corresponding network address of each respective logged on user terminal. This function requires very little processing power and very little bandwidth. If both parties to a call are initially logged onto the system, a communication connection between them can be facilitated using only a modest allocation of system processing power and bandwidth.
[0038] 2) A port scanning server, which seeks and reaches out to target computers and may also look for open ports on one or more parties' respective network. This server then relays this information to the system software, telling it which ports to use to set up or facilitate a communication connection. If there are only two parties to a call, the communication data (Audio and Video Data) may then travel directly between the parties in a peer to peer mode. The port scanning server function requires more processing power and network resources than the registration server function, and accordingly, if this function is used to attempt to set up the communication connection, additional processing power and bandwidth resources are allocated to the server function at the NOC.
[0039] 3) A media server, that acts as a bridge coupled to both the local user terminal and the remote user terminal, and through which the call's Audio and Video Data passes. The media server and corresponding connection method require a still greater use of computing and bandwidth resources and accordingly, if this function is used to attempt to set up the communication connection, still further processing power and bandwidth resources are allocated to the server function at the NOC. The Audio and Video Data is then routed through the media server, which may be disposed at the NOC.
[0040] 4) A Multipoint Media Server (MPMS), similar in nature to the media server but containing or requiring even more computing resources, such as memory and processing power, and more network resources, such as bandwidth, to facilitate multi-party communications. Accordingly, the needed resources are allocated to the MPMS function at the NOC, and the Audio and Video Data for a multi-party audio/video conference is routed through the media server to and from each party.
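Of the four functions listed above, the registration server is the simplest to illustrate. The sketch below keeps an in-memory table of logged-on users and their network addresses; the class and method names are assumptions, and a real deployment would also need authentication and persistence, which are omitted here.

```python
from typing import Dict, Optional


class RegistrationServer:
    """In-memory sketch of the registration function: who is logged on, and where."""

    def __init__(self) -> None:
        self._online: Dict[str, str] = {}

    def log_on(self, user: str, address: str) -> None:
        # Record the network address of the terminal that just logged on.
        self._online[user] = address

    def log_off(self, user: str) -> None:
        # Forget the user; logging off twice is harmless.
        self._online.pop(user, None)

    def lookup(self, user: str) -> Optional[str]:
        """Return the user's recorded network address, or None if not logged on."""
        return self._online.get(user)
```

A lookup that returns None corresponds to the case in which the system must fall back to the more resource-intensive functions.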
[0041] In an embodiment, the resources necessary to complete the various functions are determined dynamically and made available as needed, depending for example on the call types, server functions, and procedures used to complete each call.
[0042] In an embodiment, in a two party call a video frame is generated and presented on each user's respective local display within which the remote user is shown. The local user may also be shown on the local display, for example in a separate frame, or as a reduced size picture-in-picture (PiP) overlaid on a corner of the frame in which the remote user is shown. If so, in an embodiment, the PiP can be partially transparent so that the remote scene can be dimly seen through it.
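One common way to render a partially transparent overlay is per-pixel alpha blending, sketched below. The disclosure does not name a blending method, so the formula, function name, and the `alpha` parameter are assumptions for illustration.

```python
from typing import Tuple

Pixel = Tuple[int, int, int]


def blend_pixel(local: Pixel, remote: Pixel, alpha: float) -> Pixel:
    """Blend one RGB pixel of the local PiP over the remote frame.

    alpha=1.0 shows only the local picture; alpha=0.0 shows only the remote scene.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    blended = tuple(round(alpha * l + (1.0 - alpha) * r) for l, r in zip(local, remote))
    return blended  # type: ignore[return-value]
```

Applying this to every pixel inside the PiP rectangle leaves the remote scene dimly visible through the local picture.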
[0043] In a multi-party embodiment with more than two simultaneous users, a separate frame can be generated and presented on every user's local display within which each respective remote user is shown. In an embodiment, a single frame can be divided into an array of small rectangular areas, and each remote user shown in a different rectangle. In an embodiment, one or more frames showing one or more remote users may be embedded within the user interface of an application running on the local user's terminal. Such application may be a standard web browser, or it may be another web-enabled application. Such a web-enabled application may execute on the local user terminal, or it may execute on a remote server whereon the user interface is generated, or code for the user interface such as HTML code is generated, and conveyed from the server to the local user terminal for presentation on its display. Alternatively, one or more frames may be generated as individually configurable free floating windows overlaid on the local machine's graphical user interface (GUI) "desktop", or the like. In an embodiment, one or more of the frames may include real-time video of a remote user's desktop instead of or in addition to the remote user's webcam video; or, one frame may include the real-time video of the remote user's desktop, and another frame may include the real-time video of the same remote user's webcam video.
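Dividing a single frame into an array of rectangles, one per remote user, can be sketched as follows. The near-square grid policy and the function name are assumptions; the disclosure does not prescribe how the array is dimensioned.

```python
import math
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def grid_layout(width: int, height: int, count: int) -> List[Rect]:
    """Return one rectangle per remote user, tiling a frame of the given size."""
    if count < 1:
        raise ValueError("need at least one remote user")
    cols = math.ceil(math.sqrt(count))   # near-square grid: an assumed layout policy
    rows = math.ceil(count / cols)
    cell_w, cell_h = width // cols, height // rows
    return [
        ((i % cols) * cell_w, (i // cols) * cell_h, cell_w, cell_h)
        for i in range(count)
    ]
```

With three remote users in a 1280 by 720 frame, this sketch produces a two by two grid with one cell left empty.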
[0044] Beneficial and novel use cases of the herein disclosed systems and methods are many and varied. In the field of entertainment, for example, television viewers could interact with the host of a program, as well as with other viewers. New television broadcast programming formats may be implemented using the system. For example, news providers could create video chat rooms in which users could directly provide and disseminate real-time video of newsworthy events, such as from a smart phone or the like, and the news provider could report on and solicit immediate viewer reaction to the news as it occurs. News organizations could also link their networks of affiliates together to cover news as it happens using their combined resources. Or, programs may provide advertising opportunities for marketing products or services that are enhanced by the ability of viewers to interact with an advertiser spokesperson and/or with other viewers.
[0045] Inexpensive video conferencing, particularly with the security available with a secure client peer-to-peer connection, is compatible with the needs of many industries. For example, the financial services industry, which is concerned about the security of transmitting client information over the Internet, can use the system to communicate securely with clients. Moreover, the present practice of business and financial community conference calls reporting to the investment community can be greatly enhanced by providing audio/video interactivity among participants.
[0046] In the hotel and travel industry, the availability of interactive real time full duplex audio/video communications at hotels enhances the services available to a business traveler, who is thereby enabled to maintain frequent interactive point-to-point contact with home, office, and other business associates.
[0047] Interactive communication between government agencies and their constituents also has novel advantages, particularly in emergency situations. Novel advantages in the medical field of readily accessible real time audio/video communication and interactivity for patient diagnosis, monitoring, and other services between care givers, patients, and/or support staff may also be realized using the system. In education, benefits can be realized in situations ranging from video tutoring to "wide area" distributed virtual classrooms, in which students at a plurality of locations are able to maintain a personal, individual audio/visual interaction with the instructor.
[0048] Beneficiaries of the herein disclosed systems and methods include individuals who can use their existing computers and broadband Internet connections to communicate with other persons or groups, in high-quality real time audio/video calls and conferences.
[0049] The foregoing examples are presented by way of example and not limitation, and many other applications and benefits will be apparent to those of ordinary skill in the art in view of the present disclosure.
[0050] Thus, the present disclosure describes an advanced managed computer- based system or platform for real-time video communications. The system can connect two or more user terminals on demand, to provide secure, full featured well synchronized audio/video calls, multiparty conferences, and communications services. The user terminals may consist of or include inexpensive or legacy equipment that individual and business users of all sizes are likely already in possession of, such as PCs and laptops with integrated or operatively coupled webcams, microphones, and speakers. Moreover, virtually any type of legacy network communications may be used, including existing LANs, WANs, and the Internet, provided that the available point to point network bandwidth is sufficient to support the requirements of the system. In an embodiment, a common web browser application, such as Google Chrome for example, can be utilized as a user interface with one or more frames disposed within the browser interface, and configured to present within each frame audio/video of a different party to the communication. In general, commonly available residential-grade "broadband" Internet connections are easily able to support full motion, high frame rate (e.g., 30-60 frames per second), low latency, well-synchronized sound and video in a large, easily viewable frame presented in a standard web browser interface on a typical computer display. Thereby, sophisticated two-party audio/video calls and multi-user audio/video conferencing solutions can be provided using only inexpensive or legacy equipment, existing private and public networks, and broadband connections to the Internet, without requiring any new capital expenditures.
[0051] The present invention is a system for duplex enhanced quality bidirectional video transmission over an Internet backbone. Among the improvements in the duplex enhanced quality video transmission system of the present invention is better performance while eliminating the need for specialized hardware.
[0052] FIG. 3 shows an implementation of a system 100 featuring bidirectional transmission of video between two users, with each using a personal computer via an Internet path. Generally, the system 100 includes: personal computers 110 and 115; a network access server 130; a network operation center ("NOC") server 160; and an Internet network 150.
[0053] Personal computer 110 may be a desktop computer, laptop, workstation or router that is capable of connecting to the network access server 130 to establish a session 120. Personal computer 110 incorporates secure client desktop software containing a computer network authentication protocol employing strong encryption, preferably IPsec. This Internet security protocol allows for cryptographic key establishment and for authenticating and/or encrypting each IP packet in a data stream. IPsec or another like protocol is preferable because it functions at the network layer, which gives it more flexibility than many other security protocols in common use, such as SSH, SSL/TLS and Kerberos, which operate at the transport layer or above. This difference allows IPsec to secure packet flows.
[0054] The secure client connects to a website hosted by network access server 130, where the user's identity is confirmed. The network access server 130 is a computer, or a group of hardware or software components or processes that execute in one or more computer systems.
[0055] The secure client and network access server 130 then create an encrypted network tunnel from the user's computer to a Network Operation Center ("NOC") server 160, via a private Internet backbone 150. This preferential Internet backbone 150 is a preferential route supplied by arrangement with a data network provider, using fiberoptic lines that are reserved to carry specially routed traffic, thus providing for greater available bandwidth for the bidirectional transmission of video between two or more users of the system.
[0056] The network access server 130 controls remote access to the preferential Internet backbone 150 and to the NOC server 160, along route 170, forwarding the password that has been supplied by the user at personal computer 110 to the network access server 130.
[0057] The NOC server 160 is a computer, or a group of hardware or software components or processes that execute in one or more computer systems. In part, the NOC server 160 performs authorization and authentication functions. The NOC server 160 has a directory established for each user, containing identification and password information and a list of approved users to whom each user may connect. The NOC server 160 utilizes the password forwarded by the network access server 130 to perform these functions. After authorizing and authenticating the user at personal computer 110, the NOC server 160 presents the user with the list of approved users to whom he can connect. For each approved user appearing in a contact list, the NOC server 160 also shows whether that user is online or not. In addition, given the mobility of computers, the NOC server 160 locates both the initiating user and the addressee.
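By way of illustration only, and not limitation, the directory functions of the NOC server 160 described in paragraph [0057] might be sketched as follows. The class name `NocDirectory`, its method names, the user identifiers, and the use of a salted PBKDF2 hash for stored passwords are all assumptions made for this sketch; the disclosure does not specify how credentials are stored or checked.

```python
import hashlib
import hmac
import os


class NocDirectory:
    """Hypothetical per-user directory kept by a NOC server (illustrative only)."""

    def __init__(self):
        self._users = {}      # user id -> (salt, password hash, approved contacts)
        self._online = set()  # user ids that have been authenticated

    @staticmethod
    def _hash(password, salt):
        # Assumed storage scheme: salted PBKDF2-HMAC-SHA256.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def add_user(self, user_id, password, approved):
        salt = os.urandom(16)
        self._users[user_id] = (salt, self._hash(password, salt), set(approved))

    def sign_in(self, user_id, password):
        """Check the forwarded password; mark the user online on success."""
        record = self._users.get(user_id)
        if record is None:
            return False
        salt, stored, _ = record
        if not hmac.compare_digest(stored, self._hash(password, salt)):
            return False
        self._online.add(user_id)
        return True

    def contact_list(self, user_id):
        """Return {approved contact: is_online} for an authenticated user."""
        if user_id not in self._online:
            raise PermissionError("user is not signed in")
        _, _, approved = self._users[user_id]
        return {contact: contact in self._online for contact in sorted(approved)}


directory = NocDirectory()
directory.add_user("pc110", "secret-a", approved=["pc115"])
directory.add_user("pc115", "secret-b", approved=["pc110"])
directory.sign_in("pc110", "secret-a")
print(directory.contact_list("pc110"))  # pc115 is listed but not yet online
```

In this sketch the contact list is only returned to a user who has already been authenticated, mirroring the order of operations described above.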
[0058] For example, the user at personal computer 110 wishes to communicate with the user at personal computer 115, but that user is not online. The user at personal computer 110 needs to contact the user at personal computer 115, via telephone, e-mail, text message, etc., and request that he sign into the system. The user at personal computer 115 goes through the identical process of connecting into the network access server 130 so as to establish his own session 125, and likewise being connected to the NOC server 160 via a preferential Internet backbone 150, along route 175. Once the personal computer 115 is online, the user at personal computer 110 will be able to request a bidirectional video session with personal computer 115. In one embodiment, the user at personal computer 115 must manually accept the bidirectional video session, whereas in a second embodiment, personal computer 115 may be set to an auto-answer mode, where the bidirectional video session will be established upon the request from personal computer 110.
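The manual-accept and auto-answer embodiments of paragraph [0058] can be illustrated with a minimal sketch. The function name `handle_session_request`, the `AnswerMode` enumeration, and the four outcome strings are hypothetical and are introduced here only for illustration.

```python
from enum import Enum


class AnswerMode(Enum):
    MANUAL = "manual"
    AUTO = "auto"


def handle_session_request(callee_online, mode, user_accepts=None):
    """Decide the outcome of a bidirectional video session request.

    Returns "unavailable", "pending", "established", or "declined".
    """
    if not callee_online:
        # The caller must reach the callee out of band (telephone, e-mail, etc.).
        return "unavailable"
    if mode is AnswerMode.AUTO:
        # Auto-answer: the session is established upon the request itself.
        return "established"
    if user_accepts is None:
        # Manual mode: still waiting for the callee to respond.
        return "pending"
    return "established" if user_accepts else "declined"


print(handle_session_request(True, AnswerMode.AUTO))
print(handle_session_request(True, AnswerMode.MANUAL, user_accepts=False))
```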
[0059] The NOC server 160 will then communicate instructions to personal computer 110 along route 170, and to personal computer 115 along route 175, providing each with a virtual address of the other. At this point, the secure clients in personal computers 110 and personal computers 115 will initiate a peer-to-peer connection over the preferential Internet backbone 150, along route 180. Personal computers 110 and 115 will exchange bidirectional video in this peer-to-peer mode along route 180. The personal computers 110 and 115 will remain in contact with the NOC server 160 along routes 170 and 175, respectively, so that the NOC server 160 may continue to provide control functions, but the video stream will not be sent to the NOC server 160. Eliminating the need to pass the video and audio data through the NOC server 160 provides for enhanced performance and security.
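The separation between control traffic (routes 170 and 175) and media traffic (route 180) described in paragraph [0059] may be sketched, purely as an in-memory model, as follows. The `Peer` and `NocSignaling` names and the example virtual addresses are assumptions; no real network transport is shown.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    virtual_address: str
    remote_address: str = ""
    received: list = field(default_factory=list)


class NocSignaling:
    """Hypothetical control-plane role of the NOC server; it carries no media."""

    def introduce(self, a, b):
        # Provide each peer with the virtual address of the other.
        a.remote_address = b.virtual_address
        b.remote_address = a.virtual_address


def send_media(sender, peers_by_address, packet):
    # Peer-to-peer delivery: the packet goes straight to the remote peer
    # and never passes through the NocSignaling object.
    peers_by_address[sender.remote_address].received.append(packet)


pc110 = Peer("pc110", "10.8.0.10")  # example addresses, chosen arbitrarily
pc115 = Peer("pc115", "10.8.0.15")
NocSignaling().introduce(pc110, pc115)

peers = {p.virtual_address: p for p in (pc110, pc115)}
send_media(pc110, peers, b"frame-from-110")
send_media(pc115, peers, b"frame-from-115")
```

After the introduction step, the signaling object plays no part in `send_media`, which is the property the paragraph attributes to the peer-to-peer mode.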
[0060] Instead of a personal computer, a user may use a room system, which is a videoconferencing station that typically includes large monitors with a wide-angle camera and serves groups of people who meet in a room and conference with other groups at remote locations. If the room system has an Internet connection, it would interface to the videoconferencing system in the manner of personal computer 115.
[0061] FIG. 4 shows an implementation in which one of the users is using a room system 220 which instead of having an Internet connection has a private network 230. The privately networked room system 220 connects to the NOC server 160 through a hardware bridge 210. The room system 220 communicates with the NOC server 160 along route 240, allowing for the NOC server 160 to perform the authorization and authentication functions. The bridge 210 also serves as a conduit for the video signals, as they travel along path 180 between personal computer 110 and the room system 220. As path 180 is routed partially on the preferential Internet backbone 150 and partially on the private network 230, a high bandwidth is available, resulting in excellent image quality and reduced latency. As well, the NOC can provide enhanced video and audio exchange capabilities by providing features such as automated or individually directed control room activities. For example, all users' images can be arranged around an enlarged central image provided for the speaker or a desired illustration.
[0062] FIG. 5 shows another implementation, in which three or more users are participating in a videoconference. In this situation, the users log into a meeting room 360 which is provided at the NOC server 160. The meeting room 360 multiplexes each incoming video signal, and sends a copy to the other participants. Each participant will see each other participant in the video conference call, providing for enhanced security, as no one may monitor a video conference unless they are a participant and their presence is seen by all other participants. One or more of the participants in a videoconference with three or more participants may be employing a room system with a private network that interfaces to the invention through a hardware bridge located at the NOC server, as previously described and shown in FIG. 4.
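As a non-limiting illustration of the meeting room 360, the following sketch copies each incoming packet to every other participant and refuses packets from anyone who has not joined. The `MeetingRoom` class and its `outbox` structure are hypothetical simplifications; an actual embodiment would operate on network streams rather than in-memory lists.

```python
from collections import defaultdict


class MeetingRoom:
    """Hypothetical in-memory model of a multi-party meeting room."""

    def __init__(self):
        self.participants = set()
        self.outbox = defaultdict(list)  # participant -> [(sender, packet), ...]

    def join(self, participant):
        self.participants.add(participant)

    def roster(self):
        # Every participant's presence is visible to all other participants.
        return sorted(self.participants)

    def relay(self, sender, packet):
        if sender not in self.participants:
            raise PermissionError("only participants may send to the room")
        # Send a copy of the incoming signal to each other participant.
        for other in self.participants:
            if other != sender:
                self.outbox[other].append((sender, packet))


room = MeetingRoom()
for name in ("alice", "bob", "carol"):
    room.join(name)
room.relay("alice", b"frame-1")
print(room.roster())
```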
[0063] In another embodiment, the secure client can be tailored for a particular application or industry, such as having the video only take up part of the screen, with the remainder of the screen dedicated to another task, such as displaying a user-completed form.
[0064] In the foregoing, the present invention has been described with reference to certain exemplary embodiments. It will, however, be evident that various modifications and changes can be made thereto without departing from the spirit and scope of the invention, the scope of which is defined by the appended claims.

Claims

What is claimed is:
1. A system for sending over the Internet audio/video data from one first location, and receiving over the Internet a reproduction of said first audio/video data at a second location, and sending over the Internet audio/video data from said second location, and receiving over the Internet a reproduction of said second audio/video data at said first location, said system comprising:
a. a means for converting said first audio/video data in said first location into a first stream of packets to be sent over the Internet;
b. a means for converting said second audio/video data in said second location into a second stream of packets to be sent over the Internet;
c. a means for converting said first stream of packets in said second location into said reproduction of said first audio/video data to be presented in said second location;
d. a means for converting said second stream of packets in said first location into said reproduction of said second audio/video data to be presented in said first location;
e. a means for allowing said first location and said second location to exchange said first stream of packets and said second stream of packets over a preferential Internet backbone, said system being capable of sending said first audio/video data and said second audio/video data and receiving said reproduction of said first audio/video data and said reproduction of said second audio/video data in a full duplex, full motion, and live manner.
2. The system of claim 1,
a. wherein said means for converting said first and second audio/video data comprises:
i. an audio/video encoder/decoder producing an audio/video encoder output data stream;
ii. a packetizer accepting a processed form of said audio/video encoder output data stream after processing of said audio/video encoder output data stream between said audio/video encoder/decoder and said packetizer, said packetizer producing a stream of unbuffered packets; and
iii. a modem accepting said first and second streams of packets and sending said first and second streams of packets over said Internet, said first and second streams of packets being produced by buffering of said stream of unbuffered packets, and
b. wherein said means for converting said first and second stream of packets comprises:
i. a modem receiving said first and second streams of packets from the Internet; and
ii. an audio/video encoder/decoder receiving an audio/video decoder input data stream from said modem after processing of said first and second streams of packets between said modem and said audio/video encoder/decoder.
3. The system of claim 1, wherein said first and second streams of packets are transmitted over said preferential Internet backbone in a peer-to-peer mode between said first location and said second location.
4. The system of claim 2, wherein said first and second streams of packets are transmitted over said preferential Internet backbone in a peer-to-peer mode between said first location and said second location.
5. The system of claim 3, wherein said means for allowing said first location and said second location to exchange said first stream of packets and said second stream of packets over a preferential Internet backbone, comprises a network access server which allows said first and second locations to access said preferential Internet backbone, and thereby access a NOC server, which is adapted to perform authorization and authentication functions upon being accessed by an authorized user at each location, and which contains a database for those functions, said database also containing a list of at least one other authorized user that a particular authorized user may contact for a videoconference, said NOC server passing virtual address information between said first and second locations, allowing said first and second locations to thereafter communicate in a peer-to-peer mode over said preferential Internet backbone, without said first and second streams of packets passing through said NOC server.
6. The system of claim 4, wherein said means for allowing said first location and said second location to exchange said first stream of packets and said second stream of packets over a preferential Internet backbone, comprises a network access server which allows said first and second locations to access said preferential Internet backbone, and thereby access a NOC server, which performs authorization and authentication functions upon being accessed by an authorized user at each location, and which contains a database for those functions, said database also containing a list of each user that a particular authorized user may contact for a videoconference, said NOC server passing virtual address information between said first and second locations, allowing said first and second locations to thereafter communicate in said peer-to-peer mode over said preferential Internet backbone, without said first and second streams of packets passing through said NOC server.
7. A system for sending over the Internet audio/video data from one first location, and receiving over a private network a reproduction of said first audio/video data at a second location, and sending over a private network audio/video data from said second location, and receiving over the Internet a reproduction of said second audio/video data at said first location, said system comprising:
a. a means for converting said first audio/video data in said first location into a first stream of packets to be sent over the Internet;
b. a means for converting said second audio/video data in said second location into a second stream of packets to be sent over the private network;
c. a means for converting said first stream of packets in said second location into said reproduction of said first audio/video data to be presented in said second location;
d. a means for converting said second stream of packets in said first location into said reproduction of said second audio/video data to be presented in said first location; and
e. a means for allowing said first location and said second location to exchange said first stream of packets and said second stream of packets via the Internet and the private network, wherein the exchange of the streams of packets via the Internet is conducted over a preferential Internet backbone, said system being capable of sending said first audio/video data and said second audio/video data and receiving said reproduction of said first audio/video data and said reproduction of said second audio/video data in a full duplex, full motion, and live manner.
8. The system of claim 7, wherein the second location is a room system with a private network, which communicates with a NOC server through a hardware bridge, wherein said NOC server is adapted to perform authorization and authentication functions upon being accessed by an authorized user at each location, and which contains a database for those functions, said database also containing a list of at least one other authorized user that a particular authorized user may contact for a videoconference, said NOC server functioning as said means for allowing said first location and said second location to exchange said first stream of packets and said second stream of packets.
9. A system for sending over the Internet audio/video data from three or more locations, and receiving over the Internet at each location a reproduction of audio/video data from each other location, said system comprising:
a. a means for converting said audio/video data in each location into a stream of packets to be sent over the Internet;
b. a means for allowing said three or more locations to transmit streams of packets over a preferential Internet backbone, wherein said means comprises a network access server which allows said three or more locations to access said preferential Internet backbone, and thereby access a NOC server, which said NOC server is adapted to perform authorization and authentication functions upon being accessed by an authorized user at each location, and which contains a database for those functions, said database also containing a list of at least one other authorized user that a particular authorized user may contact for a videoconference, said NOC server creating a virtual meeting room in which said three or more locations may conduct a videoconference, with the NOC server multiplexing audio/video data from each of the three or more locations, and preparing for each location a stream of packets comprising a presentation of the audio/video data from each other location;
d. a means for transmitting from the NOC to each location said stream of packets comprising a presentation of the audio/video data from each other location;
e. a means for converting at each location said stream of packets comprising a presentation of the audio/video data from each other location into a reproduction of said presentation of said audio/video data from each other location; and
f. said system being capable of sending said audio/video data from each location and receiving said reproduction of said presentation of said audio/video data from each other location in a full duplex, full motion, and live manner.
10. A system for sharing audio/video data from three or more locations, with one or more locations connected to the Internet, and with one or more locations connected to a private network, each sending its associated audio/video data to every other location, and receiving a reproduction of audio/video data from every other location, said system comprising:
a. a means for converting said audio/video data in each location into a stream of packets;
b. a means for transmitting said stream of packets from said one or more locations connected to the Internet via a preferential Internet backbone, wherein said means comprises a network access server which allows said one or more locations connected to the Internet to access said preferential Internet backbone, and thereby access a NOC server;
c. a means for transmitting said stream of packets from said one or more locations connected to a private network, wherein said private network communicates with the NOC server via a hardware bridge;
d. said NOC server being adapted to perform authorization and authentication functions upon being accessed by an authorized user at each location, and which contains a database for those functions, said database also containing a list of at least one other authorized user that a particular authorized user may contact for a videoconference, said NOC server creating a virtual meeting room in which said three or more locations may conduct a videoconference, with the NOC server multiplexing audio/video data from each of the three or more locations, and preparing for each location a stream of packets comprising a presentation of the audio/video data from each other location;
e. a means for transmitting from the NOC to each location said stream of packets comprising a presentation of the audio/video data from each other location;
f. a means for converting at each location said stream of packets comprising a presentation of the audio/video data from each other location into a reproduction of the presentation of the audio/video data from each other location; and
g. said system being capable of sending said audio/video data from each location and receiving said reproduction of said audio/video data from each other location in a full duplex, full motion, and live manner.
11. The system of claim 9, wherein the presentation of the audio/video data from each other location includes an enlarged image of the video data from one of said each other locations.
12. The system of claim 10, wherein the presentation of the audio/video data from each other location includes an enlarged image of the video data from one of said each other locations.
13. The system of claim 11, wherein the system selects the video data for the enlarged image from one of said each other locations when a user at that location begins to speak.
14. The system of claim 12, wherein the system selects the video data for the enlarged image from one of said each other locations when a user at that location begins to speak.
15. A system for multiuser audio/video communications comprising:
a server computer comprising at least a tangible processor operatively coupled to a computer-readable data storage device storing instructions which, when executed on the processor, cause the server to implement:
a registration server, that monitors when each of a plurality of user terminals logs into or out of the system, keeps track of which user terminals are logged onto the system in real time, and records the corresponding network address of each respective logged on user terminal;
a port scanning server, that receives information from a first user terminal about at least one second user terminal invited to engage in audio/video communications with the first user terminal, that seeks the second user terminals on one or more networks accessible to the first user terminal, and that identifies at least one open port on a respective network to which one or more of the user terminals is communicatively coupled, the identified network port(s) for use in facilitating an audio/video communication connection between the first and the second user terminals;
wherein, if only the first user terminal and one second user terminal are engaging in audio/video communications, the audio/video communications between the first and second terminals occur in a peer to peer mode;
a media server, that acts as a bridge coupled to the first user terminal and at least one second user terminal, through which audio/video communications to and from each of the coupled user terminals passes; and
a resource monitor operatively coupled to the registration server, the port scanning server, and the media server, that monitors their demand for computing and network resources, including at least processing, storage, and bandwidth resources, and allocates available computing and network resources to the servers in accordance with their demand.
16. A method of facilitating multiuser audio/video communications comprising: by a registration server:
monitoring, when each of a plurality of user terminals logs into or out of a system for facilitating multi-party audio/video communications;
keeping track of which user terminals are logged onto the system in real time; and
recording the corresponding network address of each respective logged on user terminal;
by a port scanning server:
receiving information from a first user terminal about at least one second user terminal invited to engage in audio/video communications with the first user terminal;
seeking the second user terminal(s) on one or more networks accessible to the first user terminal;
identifying at least one open port on a respective network to which one or more of the user terminals is communicatively coupled, the identified network port(s) for use in facilitating an audio/video communication connection between the first and the second user terminals;
wherein, if only the first user terminal and one second user terminal are engaging in audio/video communications, the audio/video communications between the first and second terminals occur in a peer to peer mode; and
by a media server, bridging the first user terminal and the at least one second user terminal, through which audio/video communications to and from each of the bridged user terminals passes; and
by a resource monitor operatively coupled to the registration server, the port scanning server, and the media server:
monitoring the registration server, the port scanning server, and the media server demand for computing and network resources, including at least processing, storage, and bandwidth resources; and
allocating available computing and network resources to the registration server, the port scanning server, and the media server in accordance with their demand.
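By way of example and not limitation, one possible reading of "allocating available computing and network resources ... in accordance with their demand" is a proportional share: each server receives its full demand when capacity suffices, and a pro-rata share otherwise. The claims do not prescribe any particular allocation rule, so the rule, the function name `allocate`, and the sample figures below are assumptions made for illustration only.

```python
def allocate(capacity, demand):
    """Divide `capacity` among servers in proportion to their demand.

    `demand` maps a server name to the amount of a resource it requests
    (for example, bandwidth units). No server receives more than it asks for.
    """
    total = sum(demand.values())
    if total <= capacity:
        # Enough resources for everyone: grant each request in full.
        return dict(demand)
    scale = capacity / total
    # Oversubscribed: scale every request by the same factor.
    return {name: amount * scale for name, amount in demand.items()}


shares = allocate(100, {"registration": 10, "port_scanning": 30, "media": 160})
print(shares)
```

Other policies, such as priority classes or minimum guarantees for the media server, would be equally consistent with the claim language.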
PCT/US2015/030061 2014-05-09 2015-05-11 Full duplex high quality audio/video communication over internet WO2015172126A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201461991101P 2014-05-09 2014-05-09
US61/991,101 2014-05-09

Publications (1)

Publication Number Publication Date
WO2015172126A1 (en) 2015-11-12

Family

ID=54393081

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2015/030061 WO2015172126A1 (en) 2014-05-09 2015-05-11 Full duplex high quality audio/video communication over internet

Country Status (1)

Country Link
WO (1) WO2015172126A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105700846A (en) * 2016-03-18 2016-06-22 北京小米移动软件有限公司 File sharing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100005497A1 (en) * 2008-07-01 2010-01-07 Michael Maresca Duplex enhanced quality video transmission over internet
US20100064333A1 (en) * 2008-09-05 2010-03-11 Skype Limited Communication system and method
US20120194635A1 (en) * 2011-01-27 2012-08-02 Audio Properties, Inc. System and Method for Real-Time Media Data Review
US20130174231A1 (en) * 2011-12-28 2013-07-04 Arbitron, Inc. Media exposure data collection and security

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15789940

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15789940

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: FESTSTELLUNG EINES RECHTSVERLUSTS NACH REGEL 112(1) EPUE (EPA FORM 1205A VOM 19.04.2017)
