
HK1143874B - Voicemail filtering and transcription - Google Patents


Info

Publication number
HK1143874B
Authority
HK
Hong Kong
Prior art keywords
message
user
type
transcription
voicemail
Prior art date
Application number
HK10110280.6A
Other languages
Chinese (zh)
Other versions
HK1143874A1 (en
Inventor
Jens Ulrik Skakkebaek
Cary W. Fitzgerald
Original Assignee
Avaya Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US11/709,513 (US8107598B2)
Application filed by Avaya Inc.
Publication of HK1143874A1
Publication of HK1143874B

Description

Voicemail filtering and transcription method
Cross-referencing
This application relates to the following U.S. patent applications:
Voicemail Filtering and Transcription, U.S. application No. ____ [attorney docket No. 30519.716.202], inventors: Jens Ulrik Skakkebaek and Cary W. Fitzgerald, filed concurrently with the present application; and
Voicemail Filtering and Transcription, U.S. application No. ____ [attorney docket No. 30519.716.203], inventors: Jens Ulrik Skakkebaek and Cary W. Fitzgerald.
Technical Field
The present disclosure relates generally to integrated communication and messaging systems, and more particularly to voicemail transcription in such systems.
Background
Today, almost everyone uses more than one communication technology or medium for multiple communications each day. Communication media include electronic mail ("email") messaging, short message service ("SMS") messaging, voice messaging, and so forth. Users receive and send messages over wired and wireless networks via a variety of devices such as desktop computers, wired telephones, wireless devices (e.g., telephones and personal digital assistants ("PDAs")), and the like.
Currently, email can be received on a mobile telephone device. Notification of voicemail via email may also be received on any email-enabled device. In some systems, the email notification includes a playable audio file of the message (e.g., a WAV file) so that the user can hear the message without calling into the voicemail system. Such voicemail-to-email services are available to individual users from commercial providers, to whose systems voice callers are redirected from the user's "old" telephone number. Alternatively, some providers offer a different number to the user for voicemail-to-email processing. Additionally, a full integration of communication media within an enterprise is available from Adomo corporation. For example, the Adomo "unified communications" approach integrates tightly with existing enterprise communications and data management systems to provide employees with seamless access to all types of messages on all devices, regardless of the physical location of the employee.
With the proliferation of capable devices and systems, users increasingly demand that their messages be readily, if not immediately, available on all of their devices, regardless of message type or source. At the same time, the popularity of highly functional communication devices only seems to increase the expectations of message recipients that their messages will be received, understood, and appropriately responded to very quickly. Even though users have more and faster access to voicemail and email than ever before, there are still areas where the time for understanding and/or responding to messages may be relatively slow. For example, a user may receive an email notification that a voicemail has been received, but may not be able to immediately access and/or listen to the voicemail. If the user is in a meeting, it may be acceptable to glance at his or her device to see which message was received, but not to listen to the voicemail. In some cases, the identity of the voicemail sender is apparent from the notification, which may provide some indication of the urgency of the message. In many cases, there is no information other than the notification itself.
To eliminate the inconvenience and delay of having to listen to voicemail, methods have been developed for providing user devices with a voicemail transcript rather than an audio data file or a link to an audio data file. For example, one provider offers a voicemail transcription service that requires a mobile network provider to install specific voicemail software. The network provider sends the voicemail data to the transcriber, which plays and transcribes the voice message and then sends the transcribed text to the user's device. This approach has the disadvantage of lacking full integration with the user's email system. Part of the communication is the original voice call, which is disconnected from the later email that delivers the transcribed text. Thus, the entire communication history is not readily available, for example, for archival or auditing purposes. Another disadvantage is that every voicemail is processed in the same manner, regardless of whether the user gains any benefit from the transcription.
SimulScribe™ provides another conventional example of voicemail transcription. SimulScribe™ provides a service that includes redirecting a user's callers to an intermediate voicemail system that transcribes all voicemails and forwards the text results to the user's phone. This method has the same disadvantages as described above. Other disadvantages of many prior art methods include requiring the user to give callers a different number in order to receive the transcription, and the lack of any privacy or confidentiality assurance for callers who may not wish their voicemail to be transcribed, or who may wish to control the disposition of the transcript.
Incorporation by Reference
All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Drawings
FIG. 1 is a block diagram of a system including an integrated communication system ("ICS"), according to one embodiment.
FIG. 2 is a flow diagram of filtering a voicemail and generating a rough transcription, according to one embodiment.
FIG. 3 is a flow diagram of filtering a voicemail and generating a rough transcription, according to one embodiment.
FIG. 4 is a block diagram of a system including an integrated communication system ("ICS") and illustrates a flow of a process for obtaining a refined transcription, according to one embodiment.
FIG. 5 is a flow diagram of a process of obtaining a refined transcription, according to one embodiment.
FIG. 6 is a block diagram of a system including an integrated communication system ("ICS"), and illustrates a flow of a process for obtaining a refined transcription, according to one embodiment.
FIG. 7 is a block diagram of a web page for listening to and transcribing a voicemail, according to one embodiment.
FIG. 8 is a flow diagram of a process of obtaining a refined transcription according to one embodiment, such as the embodiments of FIGS. 6 and 7.
FIG. 9 is a block diagram of a system including an ICS, according to one embodiment.
FIG. 10 is a block diagram of a system showing further details of a communication server, according to one embodiment.
FIG. 11 is a block diagram of a system including a communication server and interface module and a messaging server, according to one embodiment.
FIG. 12 is a block diagram illustrating interactions between components of an interface module ("IM") and a messaging server ("MSERV") environment, according to one embodiment.
FIG. 13 is a block diagram of a system including an integrated communication system ("ICS") with a form-based user interface ("FBUI"), according to one embodiment.
FIG. 14 is an example FBUI displayed on a client device according to one embodiment.
In the drawings, like reference numbers identify identical or substantially similar elements or acts. To facilitate identification of the discussion of any particular element or act, one or more of the most significant digits in a reference number refer to the figure number in which that element is first introduced (e.g., element 110 is first introduced and discussed with respect to FIG. 1).
Detailed Description
Systems and methods for voicemail filtering and transcription are described herein. According to various embodiments, the integrated communication system filters and transcribes the voicemail and forwards the voicemail via email to the user's email-enabled device. For example, an email is sent to a system including an email server, from which the email is sent to the user's device. In one embodiment, the filter/transcription module filters voicemail received for the user, either automatically or as specified by the user. Filtering includes searching the voicemail for predetermined words. One result of filtering is a determination of the relative urgency of the voicemail message. The integrated communication system further performs a rough transcription of the voicemail, either automatically or as specified by the user. The rough transcription is not intended to be word-for-word, but rather provides sufficient message content to allow the user to review the rough transcription very quickly and determine the appropriate action to take in response to the voicemail. According to an embodiment, the rough transcription is entered as text in an email sent to the user. In various embodiments, the audio file of the original voicemail is an attachment to the email. Further, if the voicemail message is determined (by filtering) to be urgent, a priority flag indicating a high priority is attached to the email. In one embodiment, the user may request a refined transcription of the voicemail by pressing a button on the user device. The refined transcription is a highly accurate transcription of the voicemail. In one embodiment, the rough transcription is replaced with the refined transcription in the original email, and the original email is marked as "unread" in the email inbox on the user's device.
As used herein, an "integrated communication system" or "ICS" integrates different types of messaging such that a user of the ICS can access multiple types of messages (e.g., voicemail messages, email messages, instant messaging messages, SMS (Short Message Service) messages, MMS (Multimedia Messaging Service) messages, etc.) using a single message interface. The ICS of an embodiment reduces reliance on a voicemail system when providing integrated messaging functionality via a single message interface, for example, by providing users with access to voicemail messages and to the capabilities of the voicemail system through a local groupware application and an email messaging system. The systems and methods described herein are applicable to any ICS. In one embodiment, the ICS is part of an enterprise system and is integrated with an enterprise groupware application, although the claimed invention is not so limited. In other embodiments, the ICS is not part of an enterprise system, but is accessible to users, e.g., via the internet and/or a wireless communication network.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of, and enabling description for, the filtering and transcription embodiments. One skilled in the relevant art will recognize, however, that the embodiments can be practiced without one or more of the specific details, or with other components, systems, etc. In other instances, well-known structures or operations are not shown, or are not described in detail, to avoid obscuring aspects of the disclosed embodiments.
FIG. 1 is a block diagram of a system 100 including a networked environment 102. The networked environment 102 includes one or more networks of any kind over which data can be communicated, including a local area network ("LAN"), a wide area network ("WAN"), the internet, and any wired or wireless communication network, in any combination. System 100 also includes ICS 110 and messaging server/messaging store 124. ICS 110 communicates with a private branch exchange ("PBX") 120 to receive telephone calls, including voicemails, for users. ICS 110 further includes a filter/transcription module ("F/T module") 112. F/T module 112 accesses audio file 114 of the voicemail message as described further below. In various embodiments, the audio file is generated by any conventional method typically employed by a voicemail system, such as a voicemail system that is part of ICS 110. The audio file 114 may also be generated on other devices in the networked environment 102, such as a mobile device. A waveform audio format file ("WAV file") 114 is shown as an example, but embodiments are not so limited. For example, in alternative embodiments, the audio file may have any other electronic audio data format. In a further embodiment, the source of the audio data is not a voicemail, but any other audio data transmitted over a network, including, for example, audio files whose source is an internet website.
F/T module 112 also performs filtering of the voicemail messages, which includes searching for predetermined words in the voicemail. In one embodiment, the words searched for are in a word list that contains certain default words that imply urgency, such as "urgent," "important," "immediate," "ASAP," and the like. However, filtering may include searching for any word having any meaning, including words or names indicating that the message is "non-urgent". Additionally, in some embodiments, the user may add words to the word list, including the name of a person who may be the caller or the subject of the voicemail message. The user may specify that these added words are always included in the rough transcription if they are found in the search. Furthermore, the user may specify that if a particular word from the word list is found in the search, other words are included in the rough transcription. For example, if "company X" is found, then "highest priority" is included in the rough transcription. The rough transcription helps the user determine an appropriate response to the voice message. In yet another embodiment, ICS 110 is integrated with an enterprise groupware application, and users are members of the enterprise. In this case, all of the enterprise data is available to F/T module 112, including contact lists, user voicemail preferences, user email preferences, and the like. In yet another embodiment, the user may specify other sources, within the user's networked environment, of words used in the search. Web-based customer relationship management (CRM) applications, customer support systems, and internal accounting systems are just a few examples, but many other sources exist.
Any such information may be used to filter the voicemail as desired. If a predetermined word is found in the voicemail, a priority email flag is generated. The priority email flag is compatible with the user device 122 and is a visual cue of the urgency of an email message in the message list, although embodiments are not so limited. The priority flag may instead be an audio notification or alert rather than a visual flag, or it may include an audio notification or alert in addition to the visual flag.
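The word-list search described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the voicemail has already been converted to text by some speech recognizer, and the names `filter_transcript`, `FilterResult`, `user_words`, and `include_rules` are hypothetical. Only the default urgency words come from the description.

```python
import re
from dataclasses import dataclass, field

# Default urgency words named in the description; all other names are hypothetical.
DEFAULT_URGENT_WORDS = {"urgent", "important", "immediate", "asap"}


@dataclass
class FilterResult:
    urgent: bool                                     # True -> generate a priority flag
    matched: list = field(default_factory=list)      # word-list entries found
    extra_text: list = field(default_factory=list)   # text to add to the rough transcription


def filter_transcript(recognized_text, user_words=(), include_rules=None):
    """Search recognized text for word-list entries.

    user_words: words or names the user added to the word list.
    include_rules: maps a word-list entry to text that is added to the
        rough transcription when that entry is found.
    """
    include_rules = {k.lower(): v for k, v in (include_rules or {}).items()}
    text = recognized_text.lower()
    word_list = DEFAULT_URGENT_WORDS | {w.lower() for w in user_words} | set(include_rules)
    # Whole-word (or whole-phrase) matching, case-insensitive.
    matched = sorted(
        w for w in word_list if re.search(r"\b" + re.escape(w) + r"\b", text)
    )
    # Assumption: only the default urgency words raise the priority flag here.
    urgent = any(w in DEFAULT_URGENT_WORDS for w in matched)
    extra_text = [include_rules[w] for w in matched if w in include_rules]
    return FilterResult(urgent, matched, extra_text)
```

For example, a message mentioning "Company X" with the rule `{"company x": "highest priority"}` yields `extra_text == ["highest priority"]`, mirroring the example in the description.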
In one embodiment, F/T module 112 further includes an Intelligent Voicemail Handler (IVH). The IVH proactively requests a refined transcription, for example by applying rules to the information available to it. As an example, the information available to the IVH includes the user's calendar. This allows the IVH to automatically set a particular non-intrusive notification type for emails containing voicemail messages when the IVH knows that the user is in a meeting. Also, when the user is in a meeting, the IVH automatically requests a refined transcription. The IVH in some embodiments includes an adaptive rules engine that modifies its behavior based on history, including which words appear most often in emails for which the user requests a refined transcription, and so on.
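The calendar rule attributed to the IVH can be illustrated with a short sketch. The `Meeting` type, the `ivh_decide` function, and the notification labels `"silent"` and `"default"` are assumptions made for illustration; the description only states that a non-intrusive notification is chosen and a refined transcription is requested while the user is in a meeting.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Meeting:
    start: datetime
    end: datetime


def in_meeting(calendar, now):
    """True if `now` falls inside any calendar entry (end time exclusive)."""
    return any(m.start <= now < m.end for m in calendar)


def ivh_decide(calendar, now):
    """Return (notification_type, request_refined) for a new voicemail email.

    Hypothetical rule: during a meeting, notify non-intrusively and
    request a refined transcription automatically.
    """
    if in_meeting(calendar, now):
        return ("silent", True)
    return ("default", False)
```

An adaptive rules engine, as mentioned in the description, would adjust such rules from history; that part is not sketched here.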
F/T module 112 performs a rough transcription of the voicemail, either automatically or as specified by the user. The rough transcription is not intended to be word-for-word, but rather provides sufficient message content to allow the user to review the rough transcription very quickly and determine the appropriate action to take in response to the voicemail. According to one embodiment, the rough transcription is entered as text 118 in an email 116 sent to the user via a messaging server/messaging store 124 (as shown by arrow 1). In one embodiment, audio file 114 is also attached to email 116, and the user can listen to the voicemail by playing audio file 114 on user device 122.
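One way to assemble such an email, with the rough transcription as the body and the audio file as an attachment, is sketched below using Python's standard `email` package. The function name `build_voicemail_email`, the subject line, and the file name are hypothetical. The `Importance` and `X-Priority` headers are a common convention for marking priority and are an assumption here; the description does not say how the priority flag is encoded.

```python
from email.message import EmailMessage


def build_voicemail_email(to_addr, rough_text, wav_bytes, urgent=False):
    """Build an email whose body is the rough transcription and whose
    attachment is the original voicemail audio."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = "New voicemail"
    if urgent:
        # Assumed encoding of the priority flag; not specified in the description.
        msg["Importance"] = "high"
        msg["X-Priority"] = "1"
    msg.set_content(rough_text)  # rough transcription as the message body
    msg.add_attachment(
        wav_bytes, maintype="audio", subtype="wav", filename="voicemail.wav"
    )
    return msg
```

Keeping the transcription in the body rather than in a second attachment matches the stated preference that the user view the text as the email message body.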
FIG. 2 is a flow diagram of a process 200 of filtering a voicemail and generating a rough transcription, according to one embodiment. At 202, a voicemail is received from the PBX 120. At 204, F/T module 112 accesses audio file 114, and at 206 it filters and roughly transcribes the voicemail. If the filtering indicates at 208 that the voicemail is urgent (or that its degree of importance is high), a priority flag is generated at 210. At 212, the priority flag, audio file, and rough transcription are sent to the device via the messaging server/messaging store.
If the filtering does not indicate urgency, the audio file and rough transcription are sent to the device via the messaging server/messaging store at 214. In various embodiments, the actual order of events may differ from that shown in FIG. 2. Process 200 is merely one example of a claimed process. In other embodiments, the email may be stored in the messaging server/messaging store before filtering and/or transcription is performed. In yet another embodiment, filtering and storing may occur prior to transcription. Many other variations on the order of the acts described are within the scope of the claims.
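The control flow of process 200 can be summarized in a few lines. This is a sketch under stated assumptions: `recognize`, `is_urgent`, and `send` are hypothetical callables standing in for the speech recognizer, the word-list filter, and delivery through the messaging server, and the numbered comments refer to the steps of FIG. 2 as described above.

```python
def process_200(audio, recognize, is_urgent, send):
    """Filter and roughly transcribe one voicemail, then deliver it."""
    rough = recognize(audio)                      # 204/206: access audio, transcribe
    payload = {"audio": audio, "rough_transcription": rough}
    if is_urgent(rough):                          # 208: did filtering indicate urgency?
        payload["priority_flag"] = True           # 210: generate the priority flag
    send(payload)                                 # 212 (flagged) or 214 (unflagged)
    return payload
```

Because the steps are passed in as callables, the order variations mentioned above (for example, storing before transcribing) could be tried by composing the same pieces differently.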
FIG. 3 is a flow diagram of a process 300 for filtering voicemails and generating a rough transcription, according to one embodiment. At 302, ICS 110 determines whether the filter/transcription feature is turned on. In various embodiments, the feature may be turned off completely or configured to operate in various ways. If the filter/transcription feature is not on, the voicemail is not filtered or transcribed, as shown at 306. If the filter/transcription feature is on, the recipient of the voicemail, also referred to herein as the user, is identified at 304. According to one embodiment, the identity of the recipient is used to search for any data in the system related to the recipient. For example, the user may specify preferences that configure the behavior of the filter/transcription module, as described further herein. Additionally, in embodiments that include an enterprise ICS, user data from enterprise directory systems (e.g., contact lists) and other enterprise sources can be used to add to word lists and/or determine preferences.
At 310, filtering is performed using word recognition (against the word list) and recipient data. At 312, the email is transmitted to the user device via the messaging server/messaging store using the recipient data. For example, recipient data may include sending instructions (e.g., send all voicemails using the default procedure (rough transcription, WAV file, and flag); if a particular word is found, also send the voicemail for refined transcription at the same time; etc.). Refined transcription is explained further below.
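How per-recipient data might steer process 300 is sketched below. The `RecipientData` fields (`feature_on`, `word_list`, `refine_on`) and the returned dictionary are hypothetical; the description only says that the feature can be switched off, that the recipient is identified, and that recipient data supplies word lists and sending instructions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class RecipientData:
    feature_on: bool = True                        # 302: is filtering/transcription on?
    word_list: set = field(default_factory=set)    # words that raise the flag
    refine_on: set = field(default_factory=set)    # words that also trigger refined transcription


def process_300(recipient, rough_text):
    """Apply one recipient's preferences to a roughly transcribed voicemail."""
    if not recipient.feature_on:                   # 306: no filtering or transcription
        return {"filtered": False}
    words = set(re.findall(r"[a-z0-9']+", rough_text.lower()))
    found = words & {w.lower() for w in recipient.word_list}        # 310: word recognition
    refine = words & {w.lower() for w in recipient.refine_on}
    return {                                       # 312: instructions used when sending
        "filtered": True,
        "flag": bool(found),
        "request_refined": bool(refine),
        "found": sorted(found),
    }
```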
FIG. 4 is a block diagram of a system 400 including an integrated communication system ("ICS") 110, and illustrates a flow of a process to obtain a refined transcription, according to one embodiment. System 400 includes networked environment 102 and networked environment 402. Environments 102 and 402 may be the same networked environment, such as different areas of a LAN or WAN, although embodiments are not so limited. Alternatively, environments 102 and 402 are distinct networked environments. The networked environment 102 includes a messaging server/messaging store 124 that currently holds the email message 116. The email message 116 includes the audio file 114 as an attachment, and a body of text 118. In an alternative embodiment, text 118 may also be an attachment, but it is generally preferable for the user to view text 118 as the email message body.
ICS 110 includes F/T module 112. At least one network, represented herein as network 404, is coupled to environments 102 and 402. As used herein, "network" always means any one or more of the network types listed previously. Networked environment 402 includes a computer 406, which will be referred to herein as a transcriber computer. As used herein, transcriber computer 406 or transcriber 406 encompasses both a computer that performs transcription using speech recognition software and a human transcriber. In some embodiments, computer 406 is a device that performs refined transcription, while in other embodiments, computer 406 is a machine used by a human transcriber. In either case, computer 406 is the device at which a request for a refined transcription is received from F/T module 112, as described further herein. The file server 408 is coupled to the environments 102 and 402 via the network 404. In other embodiments, file server 408 is not a "web server" coupled as shown, but is a file server included in a networked environment, such as environment 102 or environment 402. In general, the file server 408 is accessible to both the environment 102 and the environment 402.
With reference to the numbered arrows in FIG. 4, the process for obtaining a refined transcription will now be described. As indicated by arrow 1, the email message 116 is displayed on the user device 122. The user may open and view the email message 116 in the email inbox of the device 122, along with the priority flag if applicable. When email 116 is open, text 118 is seen as the body of email 116. The user can quickly review the rough transcription provided by text 118 and decide whether a refined transcription is necessary. The user can determine at a glance from text 118 at least the following: no immediate action has to be taken in response to the voicemail message; a known action should be taken and the urgency is known; or the urgency or ambiguity of the text 118 calls for a refined transcription. As used herein, "refined transcription" refers to a transcription of the audio file of the original voicemail message that the user would judge to be more complete and more accurate than the rough transcription.
If the user decides that a refined transcription is required, the user sends a request to ICS 110, as shown by arrow 2. In one embodiment, the user simply presses a button on the device 122 to make this request. Alternatively, the request is made, for example, by voice command, or is always made automatically. The request is received by F/T module 112, and F/T module 112 responds by retrieving audio file 114 from messaging server/messaging store 124 and placing it on file server 408 via network 404, as indicated by arrow 3. In one embodiment, F/T module 112 generates the request for the refined transcription in the form of instant message 410. Instant message 410 is sent to computer 406 as shown by arrow 4. Computer 406 receives instant message 410. In the case of a human transcriber, the transcriber reads instant message 410, which includes instructions as to where on file server 408 to retrieve audio file 114. The transcriber fetches audio file 114, as indicated by arrow 5. The transcriber listens to audio file 114 while typing the refined transcription into a designated area of the transcriber's screen in the usual manner for instant messaging. As indicated by arrow 6, the completed refined transcription is sent back to ICS 110 via instant message 410. In an alternative embodiment, the transcriber sends the completed refined transcription via any electronic message, including but not limited to an email message. Instant messaging is just one example of an electronic message that may be used for this purpose.
F/T module 112 replaces the rough transcription in text 118 with the refined transcription and marks the original email as "unread" as indicated by arrow 7. F/T module 112 then sends a notification (arrow 8) to the user to indicate that the request for refined transcription has been satisfied. The user is now able to view the original e-mail marked unread and containing the refined transcription in the device 122 inbox.
FIG. 5 is a flow diagram of a process 500 of obtaining a refined transcription, according to one embodiment. At 502, F/T module 112 receives a request from a user to obtain a refined transcription. In various embodiments, the user may send the request by pushing a button on the device 122. In alternative embodiments, the request may be generated automatically based on user preferences, based on finding a particular word in the voicemail, or the like. Many alternatives are within the scope of the claimed invention. For example, a refined transcription may always be requested automatically, or a refined transcription may be requested automatically after filtering but without a rough transcription being performed, etc. At 504, F/T module 112 sends instant message 410 to computer or transcriber 406 with an indication of where to find audio file 114. The computer or transcriber 406 retrieves the audio file 114 and listens to it while typing the refined transcription into the transcriber's screen area; the refined transcription is then sent back to ICS 110 via instant message 410 at 506.
F/T module 112 receives the refined transcription via instant message 410, updates the original email message 116 by replacing the rough transcription in text 118 with the refined transcription, and marks email message 116 as "unread" at 508. At 510, F/T module 112 signals to user device 122 that the transcription request has been satisfied. In various embodiments, signaling may include one or more different forms of notification, including special email flags, audio alerts, and the like.
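The request-and-update cycle of process 500 can be sketched with in-memory stand-ins. Everything named here is hypothetical: `Email` models email message 116, a plain dictionary stands in for file server 408, `transcribe` stands in for the transcriber who receives the instant message and replies with the refined text, and `notify` stands in for signaling the user device.

```python
from dataclasses import dataclass


@dataclass
class Email:
    body: str          # text 118 (rough, later refined, transcription)
    audio: bytes       # audio file 114
    unread: bool = True


def request_refined(email, file_server, transcribe, notify, message_id):
    """Obtain a refined transcription and update the original email in place."""
    location = f"voicemail/{message_id}.wav"
    file_server[location] = email.audio            # place the audio where the transcriber can fetch it
    request = {"type": "instant_message", "audio_location": location}  # 504
    refined = transcribe(request)                  # 506: reply carries the refined text
    email.body = refined                           # 508: replace the rough transcription
    email.unread = True                            # 508: mark the original email unread
    notify("Refined transcription available")      # 510: signal the user device
    return email
```

Updating the same `Email` object, rather than creating a new one, reflects the single-message approach the description emphasizes for history tracking.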
FIG. 6 is a block diagram of a system 600 that includes ICS110, and illustrates a flow of a process for obtaining a refined transcription, according to one embodiment. System 600 includes networked environment 102 and networked environment 402. The networked environment 102 includes a messaging server/messaging store 124 that currently includes the email message 116. The email message 116 includes the audio file 114 as an attachment, and a body of text 118. In an alternative embodiment, text 118 may also be an attachment, but it is generally preferable for the user to view text 118 as an email message body.
ICS 110 includes F/T module 112. At least one network, indicated herein as network 404, is coupled to environments 102 and 402. Networked environment 402 includes a transcriber computer 406. Transcriber computer 406 or transcriber 406, as used herein, encompasses both a computer that performs transcription using speech recognition software and a human transcriber. In some embodiments, computer 406 is a device that performs refined transcription, while in other embodiments, computer 406 is a machine used by a human transcriber. In either case, computer 406 is the device at which the request for the refined transcription is received from F/T module 112, as described further herein. The file server 408 is coupled to the environments 102 and 402 via the network 404. As described above, file server 408 may be any file server in any location accessible by environment 102 and environment 402, and is not limited to the configuration shown.
With reference to the numbered arrows in FIG. 6, the process for obtaining a refined transcription will now be described. The email message 116 is displayed on the user device 122 as indicated by arrow 1. The user can open and view the email message 116 in the email inbox of the device 122, along with the priority flag if applicable. When email 116 is opened, text 118 is seen as the body of email 116. The user can quickly review the rough transcription provided by text 118 and decide whether a refined transcription is necessary. The user can determine at a glance from text 118 at least the following: no immediate action has to be taken in response to the voicemail message; a known action should be taken and the urgency is known; or the urgency or ambiguity of the text 118 calls for a refined transcription.
If the user decides that a refined transcription is required, the user sends a request to ICS 110, as shown by arrow 2. In one embodiment, the user simply presses a button on the device 122 to make this request. The request is received by F/T module 112, and F/T module 112 responds by retrieving audio file 114 from messaging server/messaging store 124 and placing it on file server 408 via network 404, as indicated by arrow 3. In one embodiment, F/T module 112 generates a request for a refined transcription in the form of notification 602. In various embodiments, notification 602 may be an instant message, an email, an SMS message, or a voice message, although embodiments are not so limited. As indicated by arrow 4, notification 602 is sent to computer 406. Computer 406 receives notification 602. In the case of a human transcriber, the transcriber reads the notification 602, which includes instructions as to where on the file server 408 to retrieve the web page that includes the audio file 114. The transcriber navigates to the web page and retrieves audio file 114, as indicated by arrow 5. The transcriber listens to the audio file 114 while typing the refined transcription into the web page (as further illustrated with reference to FIG. 7). When the refined transcription is complete, the transcriber clicks a button or link on the web page to send the refined transcription to F/T module 112, as shown by arrow 6.
F/T module 112 replaces the rough transcription in text 118 with the refined transcription and marks the original email as "unread" as indicated by arrow 7. F/T module 112 then sends a notification (arrow 8) to the user to indicate that the request for refined transcription has been satisfied. The user can now view the original email, marked unread and containing the refined transcription, in the device 122 inbox. The method shown and described with reference to FIG. 6 is only one example of an embodiment. Alternatively, for example, the email is not marked as "unread," but rather an alert is sent to the user. As another alternative, the refined transcription is appended to the rough transcription, rather than replacing it. As yet another alternative, a second email containing the refined transcription is sent to the user.
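The delivery alternatives listed above (replace, append, or send a second email, with or without marking the original unread) can be expressed as a small strategy switch. This is an illustrative sketch only; the `Delivery` enum, the `deliver_refined` function, and the dictionary-based inbox are hypothetical.

```python
from enum import Enum


class Delivery(Enum):
    REPLACE = "replace"            # default embodiment: overwrite the rough transcription
    APPEND = "append"              # keep the rough text, add the refined text after it
    SECOND_EMAIL = "second_email"  # leave the original untouched, send a new email


def deliver_refined(inbox, index, refined, mode=Delivery.REPLACE, mark_unread=True):
    """Apply a refined transcription to the email at inbox[index]."""
    original = inbox[index]
    if mode is Delivery.SECOND_EMAIL:
        inbox.append({"body": refined, "unread": True, "refers_to": index})
        return inbox
    if mode is Delivery.APPEND:
        original["body"] = original["body"] + "\n\n" + refined
    else:
        original["body"] = refined
    if mark_unread:                # set False when an alert is sent instead
        original["unread"] = True
    return inbox
```

Only the `REPLACE` and `APPEND` modes preserve the single-email history that the following paragraph of the description treats as an advantage.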
In the embodiments described herein, voicemail messages are transcribed for email users without using more than one email message. In embodiments in which the original email message containing the audio file is the same email used by all of the processes described herein, tracking the history of the message becomes much easier. It is easier for the user to track the history of the message thread. It is also easier for a user and/or business manager to archive message threads that include transcription processes. There is an increasing need for complete and accurate archiving of messages to comply with auditing procedures, legal discovery procedures, U.S. Securities and Exchange Commission (SEC) procedures, and the like.
FIG. 7 is a block diagram of a web page 700 on a computer 406 for listening to and transcribing a voicemail, according to one embodiment. Web page 700 is one embodiment of the web page mentioned above with reference to fig. 6. Web page 700 includes information about the voicemail, such as the identity of requester 708 and the time of request 710. There is a region 702 for knock-in of the refined transcription. The audio file of the voicemail is played by clicking the "PLAY" button 704. When the refined transcription is complete, the refined transcription is sent back to ICS110 by clicking the "SEND" button 706.
FIG. 8 is a flow diagram of a process 800 of obtaining a refined transcription according to one embodiment, such as the embodiments of FIGS. 6 and 7. At 802, F/T module 112 receives a request to obtain a refined transcription. The request may be sent by the user pushing a button on the user's mobile device. Alternatively, the request may be sent automatically based on user preferences such as the recognition of a particular word or name in the voicemail. In response to the request, F/T module 112 places the audio file of the voicemail on the file server at 804.
At 806, F/T module 112 sends a notification to the transcriber requesting transcription. The transcriber goes to the indicated web site at 808. For example, the notification may include a hyperlink to an appropriate web page. The transcriber listens to the audio file while typing the transcription into the area provided in the web page and then clicks "SEND".
At 810, the F/T module 112 retrieves the refined transcription from the file server 408. At 812, F/T module 112 updates the rough transcription in the original message with the refined transcription and marks the original email message as "unread". At 814, the F/T module 112 signals the user mobile device that the transcription request has been satisfied. This signal may include a special email flag on the original email that appears in the message list. The signaling may include an audio alert, etc., instead of or in addition to the flag.
In some embodiments, F/T module 112 is adaptive in order to increase the accuracy and usefulness of the transcription process as described herein. For example, the list of words used for filtering may automatically adapt to include or exclude words over time based on which words appear in the voicemails for which a refined transcription is requested.
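The steps of process 800 can be sketched in code as follows. This is only an illustrative sketch, not the implementation of the F/T module: the `Email` and `FileServer` classes, the field names, and the `transcribe` and `notify_user` callables are hypothetical stand-ins for the email message, file server 408, the transcriber working through web page 700, and the signal sent to the user mobile device.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    """Hypothetical model of the original email carrying the voicemail."""
    text: str            # rough transcription, later replaced
    audio: bytes         # attached audio file
    unread: bool = False
    flagged: bool = False


@dataclass
class FileServer:
    """Hypothetical in-memory stand-in for file server 408."""
    audio_files: dict = field(default_factory=dict)
    transcriptions: dict = field(default_factory=dict)


def request_refined_transcription(email, message_id, server, transcribe, notify_user):
    # 802/804: a request arrived; place the audio file on the file server.
    server.audio_files[message_id] = email.audio
    # 806/808: the transcriber is notified, listens, and submits the text.
    # Here the whole human step is modeled by the `transcribe` callable.
    server.transcriptions[message_id] = transcribe(server.audio_files[message_id])
    # 810: retrieve the refined transcription from the file server.
    refined = server.transcriptions[message_id]
    # 812: replace the rough transcription and mark the email unread.
    email.text = refined
    email.unread = True
    # 814: signal the user (flag on the email plus a notification).
    email.flagged = True
    notify_user(message_id)
    return email
```

Because the same `Email` object is updated in place, the sketch mirrors the single-message approach described above, in which no second email is created.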
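One possible way to adapt a filtering word list is sketched below. The counting scheme, the thresholds `add_after` and `drop_after`, and the class name are assumptions made for illustration; the disclosure does not specify how adaptation is performed. The sketch also ignores practical concerns such as excluding very common words.

```python
from collections import Counter


class AdaptiveWordList:
    """Hypothetical adaptive filter word list (thresholds are assumptions)."""

    def __init__(self, words, add_after=3, drop_after=3):
        self.words = set(words)
        self.hits = Counter()    # times a word appeared when refinement was requested
        self.misses = Counter()  # times a listed word appeared with no request
        self.add_after = add_after
        self.drop_after = drop_after

    def observe(self, rough_transcription, refined_requested):
        tokens = set(rough_transcription.lower().split())
        if refined_requested:
            for token in tokens:
                self.hits[token] += 1
                if self.hits[token] >= self.add_after:
                    self.words.add(token)
                    self.misses[token] = 0
        else:
            # Iterate over a copy (the intersection) so the set can be modified.
            for token in tokens & self.words:
                self.misses[token] += 1
                if self.misses[token] >= self.drop_after:
                    self.words.discard(token)
                    self.hits[token] = 0
```

For example, a word that repeatedly appears in voicemails for which the user requests a refined transcription is eventually added to the list, while a listed word that repeatedly appears without such a request is eventually removed.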
Fig. 9 is a block diagram of a system 900 that includes ICS110 and performs the processes shown and described above. System 900 includes a networked environment 902. Networked environment 902 includes one or more networks of any type over which data can be communicated, including a local area network ("LAN"), a wide area network ("WAN"), the Internet, and any wired or wireless communication network, in any combination.
According to one embodiment, system 900 further includes ICS 110. ICS110 includes, but is not limited to, a communication server 910, an interface module ("IM") 920, and a cache system 930 (also referred to as a "cache"). The communication server 910 couples to any number of components of the network 960 using any of a number of communication protocols. Network 960 and networked environment 902 may be of the same or different types. Network 960 and networked environment 902 allow for the transfer of information between multiple client devices 970 and 999, also referred to as user devices 970 and 999.
IM920 of ICS110 couples to communication server 910 to transfer information or data. In addition, IM920 couples to transfer information with one or more components of messaging server 940, where transferring information includes one or more of pull, receive, retrieve, poll, transmit, and push operations, to name a few. As an example of information transfer between IM920 and messaging server 940, IM920 pulls user information from messaging server 940 and makes the pulled user information available to other components of ICS110, where the user information includes information related to at least networked environment 902.
The components of the messaging server 940 may include, for example, one or more processors 942, which may also be referred to as "central processing units" or "CPUs," and one or more databases 944 coupled to the CPUs 942. In one embodiment, IM920 may be hosted on messaging server 940 or run under the control of messaging server 940, but is not limited to such a configuration. Further, messaging server 940 may be a component of networked environment 902 hosted by communication server 910, but is not so limited. For example, messaging server 940 may be a host of groupware applications (e.g., Microsoft Exchange, Lotus Notes, etc.) of networked environment 902.
Cache 930 is coupled to communication server 910 and communicates with one or more components of communication server 910, IM920, and messaging server 940 to transfer information, as described below. The cache 930 may also be coupled to other components (not shown) of the network 950.
As an example of information transfer between the cache 930 and the communication server 910, the cache 930 may receive caller information (e.g., voicemail messages, caller identity, etc.) from the client device 999 via the communication server 910. Examples of information transfers between cache 930 and messaging server 940 include transfers in which cache 930 receives user information from messaging server 940, where the user information may be routed from messaging server 940 via IM920 and/or communication server 910. Another example of information transfer between cache 930 and messaging server 940 includes transfer in which messaging server 940 receives information from cache 930 routed from cache 930 via communication server 910 and/or IM 920.
Examples of information transfers between cache 930 and IM920 include transfers of user information pulled from messaging server 940 by IM920 and directed to cache 930, and transfers in which IM920 utilizes the user information to direct messages from at least one of messaging server 940 and cache 930 to at least one device on network 960 and networked environment 902. In the above example, the cache 930 holds or temporarily stores the received information.
Network 960 and networked environment 902 include a plurality of network components (not shown) of one or more communication service providers or carriers, but are not so limited. Further, network 960 and networked environment 902, and corresponding network components, may be any number or combination of network types known in the art for providing communications between coupled devices 970 and 999, including, but not limited to, for example, a private network, a local area network ("LAN"), a metropolitan area network ("MAN"), a wide area network ("WAN"), a backend network, a public switched telephone network ("PSTN"), the Internet, and other public networks. Additionally, networks 950 and 960 may include, for example, a hybrid network that uses a private network for some portions of the communication route and one or more different public networks for other portions of the communication route.
Client devices 970 and 999 include communication devices such as telephones, cellular telephones, and radio telephones. Client devices 970 and 999 also include processor-based devices such as personal computers ("PCs"), portable computing devices, personal digital assistants ("PDAs"), communication devices, cellular telephones, portable communication devices, and user devices or units. Client devices can include so-called multi-modal devices, where a user can interact with the device and/or the ICS through any form of input and output, such as text input, speech recognition, text output, speech synthesis, graphics, recorded files, and video. In such a device, speech recognition and speech synthesis generation may be performed partly in the device and partly in the ICS. Sound and/or video may be generated by the ICS as a continuous stream of sound and/or video data sent to the device. Client devices may include all such devices and equivalents and are not limited to any particular type of communication and/or processor-based device. In one embodiment, client devices 970 are client devices operating in a private network environment, such as enterprise network 902, while client devices 999 are client devices operating in a different private network environment or under any number of public networks. As used herein, the term "client device" encompasses a user device, as described above, or a user mobile device.
Fig. 10 is a block diagram of a system 1000 showing further details of a communication server 910, according to one embodiment. Communication server 910 is coupled to at least one messaging server 940 via IM 920. IM920 runs under messaging server 940, but is not limited to running under this server. The messaging server is also coupled to one or more databases 944. In one embodiment, database 944 includes a messaging store as described above. In one embodiment, the networked environment is an enterprise network environment, although embodiments are not so limited. Messaging server 940 of one embodiment supports the messaging capabilities of networked environment 1001 using a groupware application (e.g., Microsoft Exchange) (not shown) along with other applications that suit the size and type of networked environment 1001.
Communications server 910 is coupled to any number of client devices 999 external to networked environment 1001 via one or more networks (not shown). Similarly, the communications server 910 is coupled to any number of client devices 970 that are local to the networked environment 1001.
The communication server 910 includes an operating system 1018 as well as a number of components or subsystems. These components include, but are not limited to, one or more F/T modules and voice applications 1012, an execution engine 1014 and any number of mobile application modules 1016, or any other type of application module.
FIG. 11 is a block diagram of a system 1100 including an ICS including a CS 1110, an interface module, and a messaging server, according to one embodiment. CS 1110 may be highly scalable. In accordance with one embodiment of the present invention, CS 1110 may be configured as a substantially self-contained modular "appliance" and may be packaged, for example, in a stackable "pizza box" style server. The ICS also includes an IM 1120 (also referred to herein as an "IM") and a management console 1160. The IM 1120, which in one embodiment operates under the control of a messaging server 1140 (also referred to herein as "MSERV 1140" or "MSERV"), couples to the components of the CS, MSERV, and database 1144 (also referred to herein as a "database") in a number of sequences as described herein and as appropriate to the enterprise network system 1100. The IM 1120 is also coupled to the CS management console 1160. The CS and MSERV are coupled to the LAN to communicate with other components (not shown) of the system 1100.
In one embodiment, CS 1110 includes an "operating system" along with an "execution engine," one or more F/T modules ("F/T") and "voice applications," and some number of "mobile applications." The operating system includes, for example, a Linux kernel with a journaling file system that provides integrity of file system tables and data structures. The storage on the CS may be configured in a RAID (redundant array of independent disks) configuration to provide highly reliable access to software and data. The operating system supports the operation of a number of other components of the CS.
With respect to the operating system, the CS includes a "telephony interface" that couples calls to/from the CS and connects callers and users to the CS. The telephony interface couples call information to/from, for example, a private branch exchange ("PBX") (not shown), which is a component of system 1100. The telephony interface is coupled to the PBX using a plurality of telephone integrations including one or more of analog, simplified message desk interface ("SMDI"), T1/E1, voice over internet protocol ("VOIP"), and digital set emulation ("DSE") signals, but may be coupled using other signal/signaling protocols. When receiving a call, for example, from a PBX, the CS receives data for the incoming call from the PBX, where the data includes called party information, the reason for the call transfer (e.g., called party line busy, called party not answered, called party using call transfer, etc.), and calling party information (caller ID, etc.).
The "driver" couples information received at the telephony interface to the "telephony services" component of the CS. The driver may perform appropriate low level signaling and/or data conversion on the received signal. The telephony service includes one or more components for processing received signals. These components include, for example, voice processing, switching/control, and PBX signaling, but are not limited to these components.
The CS of an embodiment includes at least one "voice browser" that receives voice information for a call when the CS receives the call. The voice browser controls the use of automatic speech recognition ("ASR") for speech recognition and DTMF recognition. The voice browser of one embodiment is coupled to a cache or other temporary storage that holds voice recordings and/or name grammars ("voice recordings/grammars") (in one embodiment, the name grammars are cached after being generated from the names in the user list). In one embodiment the memory also contains a list of words for filtering using the F/T module as described herein. In one embodiment, a default word list is applied unless a user-specific word list has been generated and stored for the user. In one embodiment ASR is used to perform rough transcription.
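The choice between a default word list and a user-specific word list can be illustrated with a short sketch. The word values, the function names, and the simple token matching are assumptions for illustration only; the actual filtering performed by the F/T module is not specified at this level of detail.

```python
# Hypothetical default word list; the real contents are not given in the text.
DEFAULT_WORDS = frozenset({"urgent", "asap", "deadline"})


def word_list_for(user, user_lists, default=DEFAULT_WORDS):
    """Return the user's stored word list, or the default if none exists."""
    return user_lists.get(user, default)


def matches(rough_transcription, words):
    """Return the filter words found in a rough transcription, sorted."""
    tokens = {t.strip(".,!?").lower() for t in rough_transcription.split()}
    return sorted(tokens & set(words))
```

A call such as `matches("Please call back ASAP.", word_list_for("bob", {}))` would apply the default list, because no user-specific list has been stored for that user.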
ASR may use information of the name grammars. In addition, the voice browser controls the use of text-to-speech ("TTS") synthesis and the playback of any number of pre-recorded prompts (e.g., WAV formatted files). The voice browser uses voice extensible markup language ("VXML") but is not limited to this protocol. Alternative embodiments of the CS may not include a voice browser. As an alternative to a voice browser, the CS may communicate directly, or use other software or processes, to communicate between the voice application and the telephony service and/or driver.
The virtual machine, voice applications, and execution engines form a hierarchical state machine framework in which the virtual machine runs many APIs and modules. Thus, the voice application may include one component that controls a user interface ("UI") to the CS, and another component that handles lower-level communications with the module. The loose coupling between the module and the voice browser provided by the state machine framework allows independence between the languages used in the different modules and the voice browser. The state machine framework can receive hypertext transfer protocol ("HTTP") requests, e.g., from a voice browser, and generate VXML or Speech Application Language Tags ("SALT") (SALT extends existing markup languages such as hypertext markup language ("HTML"), extensible hypertext markup language ("XHTML"), and extensible markup language ("XML"), and enables multimodal and telephony-capable access to information, applications, and web services from devices such as PCs, phones, and PDAs).
The voice applications of one embodiment include many components including an automated attendant, a caller interface, a user interface, and a system main menu, but may also include other types of voice applications. The automated attendant has voice capability, but may also have dual tone multi-frequency ("DTMF") capability. The automated attendant, which may be enabled or disabled, uses information of contact lists (e.g., user lists) in the cache.
The voice applications also include at least one voicemail application. Voicemail applications use cached information (e.g., user lists, global address lists, public folders, personal contact folders) in operations that include sending new voicemails and/or transferring received voicemails. In one embodiment, the F/T module accesses the cache information during filtering, such as searching for names or information in the voicemail that matches the cache information.
The voicemail application also uses the cache information to support voicemail networking where voicemails and corresponding information are exchanged with the groupware application of the system 1100.
The voicemail application is coupled to the CS state machine framework via one or more application programming interfaces ("APIs"). The API handles different data formats/types in use by the enterprise network system 1100 (e.g., greeting data, PIN (personal identification number) code data, voicemail message data, system parameters, etc.). Similarly, a cache is also coupled to the state machine framework, where the cache includes one or more local caches and a distributed cache. Thus, communication between the voicemail application, the cache, and the MSERV is via the state machine framework and APIs as appropriate for the state of the MSERV (e.g., offline, online).
In addition to voice applications, the modules running under the virtual machine of one embodiment include mobile applications. The mobile application provides access to the user's information via the mobile device, where the access may include transferring email and/or calendar information to the user's mobile client device via electronic messages (e.g., SMS, MMS, and/or pager).
The CS also includes a "management/configuration" manager. The management/configuration manager provides access to and control of the CS's unified configuration file. The management/configuration manager uses the information of the unified configuration file to provide separate configuration files to one or more components of the CS as appropriate. The unified configuration file may be copied from the CS and stored for backup purposes. In addition, a predetermined configuration file may be uploaded to the CS to provide the CS with an appropriate configuration. A browser interface to the management/configuration manager allows remote access to the CS.
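The idea of deriving separate component configurations from one unified configuration can be sketched as follows. The dictionary layout, the `shared` section, and the component names are hypothetical; the format of the unified configuration file is not described in the text.

```python
def component_configs(unified, components):
    """Derive one configuration per component from a unified configuration.

    Assumed layout: a "shared" section applies to every component, and a
    section named after a component overrides the shared values.
    """
    shared = unified.get("shared", {})
    return {name: {**shared, **unified.get(name, {})} for name in components}
```

With this layout, backing up or replacing the single unified dictionary is enough to reproduce every component's configuration.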
The CS also includes, for example, an "autonomous maintenance monitor" or reliability server that monitors CS components and restarts failed processes if necessary. In addition, the CS also includes "security restrictions" for controlling CS/port security.
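A single pass of such a maintenance monitor might look like the sketch below. The function name and the `is_alive` and `restart` callables are assumptions; how the CS actually detects and restarts failed processes is not described.

```python
def monitor_once(components, is_alive, restart):
    """Check each component once and restart any that is not alive.

    Returns the names of the components that were restarted.
    """
    restarted = []
    for name in components:
        if not is_alive(name):
            restart(name)
            restarted.append(name)
    return restarted
```

In practice a monitor of this kind would be run periodically, for example from a timer loop.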
As described above, the CS of one embodiment interfaces with the MSERV via the IM. The CS communicates with the IM via, for example, a groupware connector, but is not so limited. The groupware connector of one embodiment comprises a "Web server," but is not so limited. The MSERV acts as a messaging and collaboration server. In one embodiment, the IM is an interface running under the MSERV to provide communication and information transfer between components of the CS and components of the MSERV. In other embodiments, the IM may operate, for example, under the control of the CS. The IM includes and/or is coupled to a management console 1160 as well as a diagnostic component ("diagnostics component") and/or a runtime component ("RTC") (not shown).
The management console 1160 supports access to the CS by a system administrator of the system 1100 for purposes of managing user access. Thus, the management console 1160 allows a system administrator to enable new users to have the integrated messaging functionality of the ICS and to manage and monitor one or more CSs.
The diagnostic component of the IM supports dynamic diagnostic collection, calculation, and/or compilation of pre-specified diagnostic information or parameters from the MSERV. In this manner, the CS may provide diagnostic information and the user may provide dynamically updatable diagnostic information.
The RTC translates communications between components of the CS and components of the MSERV. By way of example, the RTC may be used to retrieve user information from a directory service (e.g., Active Directory) of the groupware application in response to a request from the CS, as described below. The communication between the RTC and the components of the CS uses XML and Web services, for example. The communication between the RTC and the MSERV may use one or more APIs of the MSERV (e.g., APIs, Collaboration Data Objects ("CDO"), Web-based Distributed Authoring and Versioning ("WebDAV"), etc.).
The MSERV of one embodiment represents a messaging and collaboration server. The messaging and collaboration server includes a groupware application that runs on one or more servers and enables users to send and/or receive electronic mail and other forms of interactive communication over a computer network via local client devices. The CS of one embodiment interactively operates with groupware applications including, but not limited to, Microsoft Exchange servers, although alternative embodiments may use other types of messaging and collaboration servers. Thus, the CS of one embodiment interactively operates with client device applications ("client applications") such as Microsoft Outlook, as well as with other email client applications (e.g., Microsoft Outlook Express).
The MSERV sends and receives email messages through what are commonly referred to as client devices, such as personal computers, workstations, and mobile devices including mobile phones or PDAs. The client devices are typically connected to a LAN, which may include any number and/or combination of servers or mainframe computers, in which email mailboxes and public folders are stored. The central server is connected to a number of other types of networks (e.g., private or proprietary, and the internet) to transmit and receive email messages to and from other email users. Thus, in one embodiment, the CS uses the MSERV to store and transfer email messages.
The MSERV is also coupled to a directory service (not shown), which is a database of information about each user account in the enterprise network system. Access to the directory service may use, for example, lightweight directory access protocol ("LDAP").
With respect to client device access functionality, the MSERV provides integrated collaborative messaging components, such as scheduling, contact, and task management capabilities. As an example MSERV configuration, when the MSERV is Microsoft Exchange, the MSERV runs on a version of the Microsoft Windows Server operating system. One version of Microsoft Office Outlook runs on a Windows-based local client device and communicates with the MSERV via the Messaging Application Programming Interface ("MAPI") protocol. The MSERV also allows access by other client devices by supporting one or more of the Post Office Protocol 3 ("POP3") and Internet Message Access Protocol 4 ("IMAP4") protocols, as well as the Simple Mail Transfer Protocol ("SMTP"). Using the same MSERV configuration example, the CS of one embodiment, along with Microsoft Outlook Web Access (a service in Microsoft Exchange), supports a Web browser-based access client, also referred to as a thin client.
The MSERV collaboration component supports information sharing between users. Collaboration scenarios include maintaining a shared address list that all users can view and edit, scheduling meetings that include personnel and meeting rooms by viewing an associated free or busy schedule, and allowing the ability of other personnel, such as administrators, to access user mailboxes on behalf of the users.
As described above, the IM serves as an interface for information transfer between components of the CS and components of the MSERV. Transferring information includes, for example, pulling, receiving, retrieving, polling, transmitting, and pushing operations, to name a few. As an example of information transfer between the CS and the MSERV, the IM pulls information from one or more components of the MSERV and makes the pulled information available to, for example, the CS cache. The IM also pushes information from one or more components of the CS to the MSERV.
When used as an interface between the CS and the MSERV, a component of the IM (e.g., the RTC) transfers communications between a component of the CS (e.g., a virtual machine, a cache, etc.) and a component of the MSERV environment. As an example, the IM retrieves user information from a component of a directory service (e.g., Active Directory) in response to a request from the CS/cache.
Embodiments of the IM may include one or more of the following components: an RTC, a management console, a desktop component, a messaging action control component, a diagnostic component, and/or a message waiting indication component. The desktop component allows the user to configure aspects of the user's integrated messaging account, such as voice message greetings, extended absence greetings, PIN code data, and presence information. In one embodiment, the desktop component allows a user to configure the behavior of the F/T module. For example, filtering and transcription may be turned off for all voicemail messages. As another example, a refined transcription is automatically requested for a voicemail message from a specified caller. Many other behaviors are possible based on all cache information available within the system 1100.
The messaging action control component receives and responds to user-generated requests from a form-based user interface ("FBUI") to take actions such as playing, replaying, and transferring voice messages, requesting refined transcription, calling the sender of voicemail messages, and the like. The message waiting indication component receives an event from the user's message inbox folder and requests a corresponding action from the PBX or other aspect of the telephone system, such as turning on a message waiting indicator on the user's device(s). The message waiting indication component may send the notification via SMS, MMS, and/or pager.
FIG. 12 is a block diagram illustrating interactions between an interface module ("IM") 1220 and components of a messaging server ("MSERV") environment 1240, according to one embodiment. The components of the MSERV environment 1240 include the MSERV and one or more databases as described above. The database for one embodiment includes a directory service 1242.
Directory service 1242 provides a location for storing information about network-based entities, such as applications, files, and printers, to name a few. Directory service 1242 also stores information about individuals, also referred to as users, and this information is referred to herein as "user information". In this manner, directory service 1242 provides a consistent way to name, describe, locate, access, manage, and secure information about individual resources in an enterprise network environment. Directory service 1242 uses the stored information to act as the main switchboard of the enterprise network operating system, and is thus the central authority that manages the identities and the relationships between the distributed resources of the enterprise network, enabling the resources to work together. Directory service 1242 of an embodiment may be Microsoft Active Directory ("AD"), but is not so limited.
In embodiments that include an AD, there is a user object stored in the AD database for each enterprise user. For example, a user object for enterprise user 2 (USER 2) is shown as USER 2 object 1202. The user object includes a number of fixed attributes such as the user's name, the user's phone number, the user's mailbox location, and the user's email address.
The user object further includes a number of "custom attributes". The number of custom attributes is small compared to the number of fixed attributes, e.g., fifteen. Custom attributes can be used to store information not provided in the predetermined fixed attributes. In one embodiment, the custom attributes store user-specific data used by the F/T module and the voice application. Examples of such user-specific data include user-specific word lists, and user preferences regarding the behavior of the F/T module. Further examples of user-specific data include a class of service ("COS") for the user, a voicemail extension for the user, whether voicemail is enabled for the user, and the like. The data is stored as a data stream in custom attributes with a maximum capacity of 2048 bytes. In an alternative embodiment, the user-specific data used by the F/T module and the voice application is stored as personal data entries in fixed attributes by extending the AD in a known manner.
The user mailbox location fixed attribute indicates where the user's email mailbox is stored in the enterprise. In some large enterprises, there may be many MSERVs, each including a database that stores many user mailboxes. As shown, the mailbox location fixed attribute points to USER 2 mailbox 1204 on the MSERV, referred to as MSERV 1.
User mailbox 1204 stores email messages sent to the user, as well as outgoing messages and other items, for a predetermined period of time. In one embodiment, the message may be of at least two types, one of which is a "normal" message that the user can routinely access. Another type of message is a "hidden" message that the user cannot access in a routine manner through a normal user email interface. In one embodiment, the hidden message is used to store data used by the F/T module and the voice application. However, in contrast to the data stored in the custom attributes, the data stored in the hidden message may be larger than the 2048 byte limit of the custom attributes. In one embodiment, among the data stored in the hidden message are audio files stored as attachments to the hidden message, such as a "busy" greeting for the user's voicemail mailbox, a "no answer" greeting for the user's voicemail mailbox, and a recorded name for the user's voicemail mailbox.
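The relationship between the two storage locations can be illustrated with a small sketch. The 2048-byte limit comes from the description above; routing data purely by size, the function name, and the plain dictionaries standing in for the custom attributes and the hidden message are assumptions made for illustration.

```python
CUSTOM_ATTRIBUTE_LIMIT = 2048  # bytes, the capacity stated in the description


def store_user_data(name, payload, custom_attributes, hidden_messages):
    """Store small items in a custom attribute and larger ones in a hidden message.

    Both stores are plain dicts here; routing by size alone is an assumption.
    Returns the name of the store that was used.
    """
    if len(payload) <= CUSTOM_ATTRIBUTE_LIMIT:
        custom_attributes[name] = payload
        return "custom_attribute"
    hidden_messages[name] = payload
    return "hidden_message"
```

Under this assumption, a short preference record fits in a custom attribute, while an audio greeting, which is typically far larger than 2048 bytes, is attached to a hidden message.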
An example of the CS accessing MSERV environment 1240 via IM 1220 is a telephone caller calling the voicemail mailbox of USER 2 when USER 2 is on another call. The CS transmits the action via the IM 1220 with a request to "play busy greeting". The transmission includes information that accesses the USER 2 object 1202 fixed attribute to determine the user's email mailbox location. In addition, the transmission includes information to access the USER 2 object 1202 custom attributes and transfer the contents of the custom attributes to the CS via the IM 1220. When the user's email mailbox is accessed, the hidden message is opened to transfer the appropriate audio file (in this case a "busy" greeting) to the CS for playing over the phone for the caller. In many cases, it may not be necessary to transfer custom attributes or audio files from the MSERV environment 1240 because the current custom attributes and audio files are cached on the CS.
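The cache-first behavior in this example can be sketched as follows. The function name, the cache key, and the `fetch_from_mserv` callable (standing in for the request made through the IM) are hypothetical, and the sketch does not address how stale cache entries would be refreshed.

```python
def get_greeting(user, kind, cache, fetch_from_mserv):
    """Return a greeting audio file, contacting the MSERV only on a cache miss."""
    key = (user, kind)
    if key not in cache:
        # Cache miss: retrieve the audio file via the IM and keep a copy on the CS.
        cache[key] = fetch_from_mserv(user, kind)
    return cache[key]
```

A second request for the same user's "busy" greeting is then served from the cache without any transfer from the MSERV environment.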
As described above, the operation of the voice application and virtual machine couples the cache and other components of the CS to the components of the MSERV via the IM. Thus, the CS and IM support information transfer between the cache and backend network components (e.g., MSERV and databases). This configuration provides transparency between the voice application and the data stored in the database when the information of the database is used to support the voicemail messaging function of the CS, as described below.
The transfer of information between the cache and the MSERV, along with the use of custom attributes and hidden messages as described above, allows the ICS to overcome the need for an external database for storing information stored by typical voicemail systems. This is because the information used by the CS is pulled by the CS from the MSERV via the IM while providing voicemail messaging capability integrated with the email messaging capability of the enterprise network. The pulling or retrieving may be performed periodically, continuously, on demand, and/or in response to certain events (e.g., an update of information in the MSERV), but is not so limited. The information pulled by the CS includes information of a "global address list" ("GAL"), information of one or more "public folders," information of "personal contacts," and a "user list."
The GAL includes information for all users who have access privileges in the enterprise network, including the use of email. The public folders include information (e.g., contacts, calendars, etc.) of the enterprise network that is shared with all users. The personal contacts include contact information for each user.
The user list includes user information for a subset of users in the GAL, each user having access privileges including using the ICS. The user list is thus a subset of the GAL and is retrieved and/or cached as a separate list or stream, thereby improving communication efficiency and minimizing delays associated with having the CS search the entire contents of the GAL for information used in performing user-requested actions on voicemail messages. The user list of one embodiment includes one or more of the following parameters corresponding to each user, but is not limited to these parameters: site identification, mailbox number, vocable name, office extension, COS, automated attendant status (e.g., enabled, disabled), voicemail status (e.g., enabled, disabled), voice user interface ("VUI") status (e.g., enabled, disabled), mobile access status (e.g., enabled, disabled), invalid login, deadlock, attendant destination, mandatory changes to PIN code, mobile gateway identification, full name, last name, first name, user name, home phone number, office phone number, cellular phone number, identification card, email address, department, active greeting status, time and date claims, voicemail notification status (e.g., enabled, disabled), mailbox status, PIN codes in encrypted or raw form, no answer greeting, busy greeting, extended absence greeting, recorded name, and system greeting.
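As a rough illustration of why a separately cached user list is cheaper to search than the full GAL, the following sketch derives the list once and indexes it by office extension. The record type `GalEntry`, its fields, and the flag `ics_enabled` are assumptions chosen for the example; the text names the parameters but does not define a data layout.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class GalEntry:
    """Hypothetical GAL record holding a few of the parameters listed above."""
    user_name: str
    email_address: str
    office_extension: str
    ics_enabled: bool  # assumption: marks users whose privileges include the ICS


def build_user_list(gal: List[GalEntry]) -> Dict[str, GalEntry]:
    """Derive the user list (a subset of the GAL), indexed by office extension."""
    return {entry.office_extension: entry for entry in gal if entry.ics_enabled}


def find_user(user_list: Dict[str, GalEntry], extension: str) -> Optional[GalEntry]:
    """Look a user up directly instead of scanning the whole GAL."""
    return user_list.get(extension)


gal = [
    GalEntry("alice", "alice@example.com", "1001", True),
    GalEntry("bob", "bob@example.com", "1002", False),
    GalEntry("carol", "carol@example.com", "1003", True),
]
user_list = build_user_list(gal)
```

Indexing by extension is only one possible choice; a mailbox number or user name would serve equally well as the key.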
Rather than storing the information pulled from the MSERV in a separate voicemail database, as would be done in a typical voicemail system, the pulled information is pushed by the IM to the CS and held in cache. The CS uses the pulled information in subsequent voicemail message manipulation operations as described below. This pulling and caching of information by the CS increases the speed and efficiency of voicemail message operations and avoids unnecessary burdens on the MSERV due to the nearly continuous stream of read request data to the MSERV database in typical messaging systems.
Pulling information from the MSERV by the CS includes pulling and caching information including the GAL, public folders, and user list. The pulled information is cached by the CS on a system or non-personal basis, as the information applies to the entire enterprise. This information is periodically pulled and cached, for example, at 24-hour intervals (e.g., at 2:00 am each morning), or may be loaded as needed, but is not so limited.
In contrast, the CS pulls and caches personal contacts on a per-user basis, as this information is different for each user. The personal contacts may be requested and cached by the CS periodically, or as needed (e.g., upon user login to the ICS, in response to modification of the personal contacts, etc.).
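The two caching policies can be contrasted in a short sketch: system-wide data is refreshed when a fixed interval has elapsed, while personal contacts are refreshed per user on an event such as login. The class `CsCache`, its methods, and the two pull callbacks are hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```python
from typing import Callable, Dict, List, Optional

SYSTEM_REFRESH_SECONDS = 24 * 60 * 60  # the 24-hour interval mentioned in the text


class CsCache:
    """Hypothetical CS cache with a system-wide part and a per-user part."""

    def __init__(
        self,
        pull_system: Callable[[], dict],
        pull_contacts: Callable[[str], List[str]],
    ) -> None:
        self._pull_system = pull_system
        self._pull_contacts = pull_contacts
        self._system: dict = {}
        self._system_loaded_at: Optional[float] = None
        self._contacts: Dict[str, List[str]] = {}

    def system_data(self, now: float) -> dict:
        """Return GAL / public folder / user list data, re-pulling when stale."""
        stale = (
            self._system_loaded_at is None
            or now - self._system_loaded_at >= SYSTEM_REFRESH_SECONDS
        )
        if stale:
            self._system = self._pull_system()
            self._system_loaded_at = now
        return self._system

    def refresh_contacts(self, user: str) -> None:
        """Call on login or when the user's personal contacts were modified."""
        self._contacts[user] = self._pull_contacts(user)

    def contacts(self, user: str) -> List[str]:
        """Return the cached personal contacts for one user (empty if not loaded)."""
        return self._contacts.get(user, [])
```

A caller would invoke `system_data(now)` before handling a request and `refresh_contacts(user)` from the login handler; both trigger points are assumptions made for the sketch.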
In operation to provide integrated messaging capability, the CS and IM function to forward caller-originated calls to the user and to receive and route voicemail messages left by the caller in the event that the user is not available. The CS and IM also function to provide users with access to voicemail messages using the messaging servers of the enterprise email system. Voicemail access supports both online and offline modes of a messaging server.
By way of example of CS call routing, and with further reference to FIG. 11, the CS receives and detects calls at the telephone interface. The call data (e.g., called party information, calling party information, reason for call forwarding, etc.) invokes the voice browser. The voice browser transfers the request to the voice application in response to the call data.
The dispatcher component of the voice application routes the call to one or more other voice application components based on the information of the user list. As an example, the dispatcher identifies a target user for the call and determines whether an automated attendant of the target user is enabled. If the automated attendant is enabled, the automated attendant receives the call request and provides the caller with one or more call routing options (e.g., the caller selects a call route by selecting and/or speaking an extension number, selecting and/or speaking a name, etc.) and forwards the call based on the caller's input.
As an example, one or more voice applications determine an active greeting currently specified by a user for use in responding to a call (e.g., a system greeting, a no answer greeting, a busy greeting, an extended absence greeting, etc.) and retrieve the specified active greeting from either the cache or the MSERV in a manner appropriate to the state of the MSERV. The respective application(s) play the greeting, initiate a "record mode" to record the caller's voicemail message, and provide the caller with other options available for call and/or message routing (e.g., message marking options, message transmission options, send the message, route the message to other users, etc.). Upon completion of the recording and/or selection of message routing options by the caller, the respective application(s) terminate the call (hang up) and transfer the recorded voicemail message to the F/T module and to one or more locations (e.g., mailboxes) in the cache and/or MSERV corresponding to the user. Alternatively, the voicemail message may be transferred before the application terminates the call.
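The dispatcher decision described above can be sketched as a small routing function. Only the automated attendant branch comes from the text; the record type `UserRecord`, the returned component names, and the handling of unknown targets or users without an enabled attendant are assumptions added to make the example complete.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class UserRecord:
    """Hypothetical user list entry with the fields the dispatcher needs."""
    name: str
    auto_attendant_enabled: bool
    voicemail_enabled: bool


def dispatch(called_extension: str, user_list: Dict[str, UserRecord]) -> str:
    """Return the name of the voice application component that should take the call."""
    user = user_list.get(called_extension)
    if user is None:
        return "unknown_target"      # assumption: no matching user in the user list
    if user.auto_attendant_enabled:
        return "auto_attendant"      # from the text: attendant offers routing options
    if user.voicemail_enabled:
        return "voicemail"           # assumption: fall through to voicemail handling
    return "reject"                  # assumption: nothing is enabled for this user


users = {
    "1001": UserRecord("alice", auto_attendant_enabled=True, voicemail_enabled=True),
    "1002": UserRecord("bob", auto_attendant_enabled=False, voicemail_enabled=True),
}
```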
FIG. 13 is a block diagram of a system 1300 including an integrated communication system ("ICS") 1310 having a form-based user interface ("FBUI"), according to one embodiment. As described above, the user's voicemail can be roughly transcribed, and the rough transcription, together with the voicemail's audio file, is sent to the user's email-capable device as a "normal" email with one or more attachments.
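One way to picture such a "normal" email is a plain-text body carrying the rough transcription plus an audio attachment. The sketch below uses Python's standard `email.message.EmailMessage`; the function name, subject line, addresses, and file name are illustrative assumptions and are not taken from the described system.

```python
from email.message import EmailMessage


def build_voicemail_email(
    sender: str,
    recipient: str,
    rough_transcription: str,
    wav_bytes: bytes,
    filename: str = "voicemail.wav",  # hypothetical attachment name
) -> EmailMessage:
    """Build an email whose body is the transcription and whose attachment is the audio."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Voicemail message"   # illustrative subject line
    msg.set_content(rough_transcription)   # plain-text body: the rough transcription
    msg.add_attachment(                    # audio file travels as an attachment
        wav_bytes, maintype="audio", subtype="wav", filename=filename
    )
    return msg


email_msg = build_voicemail_email(
    "cs@example.com",
    "userz@example.com",
    "Hi, call me back about the budget.",
    b"RIFF....WAVE",  # placeholder bytes, not a valid WAV file
)
```

Because the result is an ordinary multipart email, any email-capable device can show the transcription text even if it cannot play the attachment.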
As described further below, the FBUI is an optional mechanism for delivering the rough transcription and voicemail audio files via the email system. System 1300 includes a networked environment 1301 that provides integrated voicemail and email messaging through the use of ICS 1310. Networked environment 1301 includes a LAN coupled to components of ICS networked environment 1301 and messaging server environment 1340. ICS1310 includes CS1310, IM 1320, and FBUI1380, but is not so limited. FBUI1380 is further provided to a USER (e.g., USER Z) via one or more processor-based devices 1399, such as PDA 1399.
The messaging server environment 1340 includes the MSERV and a database 1344, but is not so limited. The LAN is coupled to any number of other networks 1350 and 1360 using any of a number of communication protocols, where the networks 1350 and 1360 may be of the same or different types. By way of example, the networks may include a public communication network 1350 and a private communication network 1360. Private communication network 1360 may be, for example, a PBX coupled to a LAN of an enterprise network. Networks 1350 and 1360 allow information transfer between client device 1370, which is local to networked environment 1301, and client device 1399, which is external to networked environment 1301. The client devices may alternatively be referred to as "user devices" 1370 and 1399.
In one embodiment where networked environment 1301 is an enterprise network, ICS1310 replaces the voicemail server typically found in an enterprise network with at least one CS1310, although embodiments are not so limited. CS1310 is coupled to a private communication network (e.g., PBX) of each network enterprise. Although only one CS is shown in this example system 1300, the enterprise network may include multiple CSs 1310 coupled to the enterprise network in an "N + 1" configuration, where "N" is any number 1, 2.
For security reasons, communication to and from the CS is restricted in one embodiment. The CS communicates with the IM server, the private communication network, other CSs, and selected client devices. According to one embodiment of the invention, communication with the CS may be limited to network components having a particular known address. Additionally or alternatively, communication with the CS may require authentication by password or other security measures for a particular type of access, such as administrator access. Security may also or alternatively be provided by encryption and/or by requiring a physical connection between the CS and other components, such as in the case of a connection between the CS and a private communications network through a direct cable connection. Limiting communications to and from the CS protects the confidentiality of voicemails and voicemail transcripts, as described herein.
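Limiting communication to components with known addresses amounts to an allowlist check, sketched here with Python's standard `ipaddress` module. The function names and the example addresses are assumptions; the text does not say how addresses are configured or checked.

```python
import ipaddress
from typing import Iterable, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def parse_allowlist(entries: Iterable[str]) -> List[Network]:
    """Turn configured addresses or CIDR ranges into network objects."""
    return [ipaddress.ip_network(entry) for entry in entries]


def is_peer_allowed(peer_address: str, allowlist: List[Network]) -> bool:
    """Accept a connection only if the peer address falls inside a known network."""
    peer = ipaddress.ip_address(peer_address)
    return any(peer in network for network in allowlist)


# Hypothetical configuration: one IM server address and one subnet of other CSs.
allowlist = parse_allowlist(["10.0.0.5", "10.0.1.0/24"])
```

An address check of this kind would normally be combined with the password authentication and encryption mentioned above rather than used on its own.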
The CS via the FBUI provides the form to the client device, typically from a first server (e.g., messaging server, MSERV, etc.) via a network connection. The form includes data or code that, when executed by the receiving client device, causes the FBUI to be presented on the display of the client device. The FBUI includes a number of buttons or icons that allow a user to select an action on an item via a second server (e.g., a communications server, CS, etc.), where the item is stored on a first and/or second server, and the first and second servers are different servers. The FBUI of an embodiment uses a web browser embedded in the form as a means of coupling and/or communicating with a corresponding browser control of the second server. The communication between the client device and the second server thus avoids security and/or other network policy issues that would prohibit the client device from communicating with the second server via the network coupling between the client device and the first server.
As described above, the FBUI operates as a form-based messaging interface to transfer a first message (e.g., a voicemail message) from a communication server (e.g., CS) to a messaging server (e.g., MSERV) via a first coupling (e.g., IM). The messaging server generates a second message (e.g., an email message) in response to the type of the first message and transfers the second message to the client device via a second coupling (e.g., a LAN). The type of the first message is specified by the communication server using a message property that identifies the message as a "voicemail type" ("VMT") message. The second message is of a different type and includes data of the first message, but is not so limited. The communication server also transfers form data corresponding to the first message to the client device. The client device uses the form data to establish a third coupling (e.g., a browser link) between the client device and the communication server. The user may use the form data to direct actions from the client device regarding the first message via the third coupling.
The ICS of one embodiment provides the FBUI1380 to the user via his/her local or external client device. The FBUI is provided to the client device using a FBUI form, where the structure of the FBUI form conforms to the message structure of the messaging server environment. For example, when the messaging server environment includes the use of Microsoft Exchange and Microsoft Outlook, the FBUI form is generated to conform to the Microsoft format suitable for Exchange and Outlook.
The information used to generate the FBUI form is provided by the CS to the messaging server environment via IM, and the code used for FBUI form generation is managed by the MSERV in one embodiment. The FBUI form of an embodiment includes code that generates information for the FBUI display as well as the button display. The FBUI form further includes embedded browser controls for establishing communication between, for example, a client device displaying the FBUI form and a web server (e.g., CS, IM, other server). The embedded browser control thus allows the host client device to couple and communicate with a server other than the MSERV via a communication channel outside the enterprise network LAN. Thus, the FBUI form allows a communication channel to be initiated between the local client device currently executing the form and a component such as the CS and/or IM despite network policy issues that may otherwise prohibit the client device from communicating outside the enterprise network messaging infrastructure.
Using the FBUI, a user is able to access/view and take a number of actions on his/her voicemail messages within the email framework of the host enterprise network system. By way of example, when the CS of an embodiment receives a voicemail message, it transfers the voicemail message to the MSERV, as described above. When transferring a voicemail message to the MSERV, the CS specifies a property of the message that identifies the message as a "voicemail type" ("VMT") message. Using the same storage and retrieval structures as used by other message types, such as email messages, the MSERV receives the message and stores it as a VMT message.
When, for example, a user wishes to access his/her messages via his/her client device, the active message browser of the client device receives the VMT message along with any other mail messages currently stored in his/her electronic mailbox. The message browser corresponds to the message structure of the messaging server environment (e.g., Outlook in the Microsoft environment). Upon receipt of the message, the message browser identifies the message as a VMT message. Because the code that implements the FBUI form is stored on the MSERV, the implementation of the functionality and/or features associated with the FBUI form uses communication between the user's client device and the MSERV via the LAN. For example, the client device message browser requests a FBUI form from the MSERV in response to identifying the message as a VMT message, as this is the form corresponding to the VMT message type. The MSERV transfers the FBUI form to the requesting client device, and the client device message browser launches the form in response to the user selecting the VMT message for viewing.
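The selection of a form by message type can be sketched as a lookup from a type property to a form identifier. The registry contents, the identifiers `fbui_form` and `standard_mail_form`, and the `StoredMessage` record are hypothetical; the actual property names and form format of a given messaging server are not specified here.

```python
from dataclasses import dataclass
from typing import Dict

DEFAULT_FORM = "standard_mail_form"  # hypothetical form for ordinary email

# Hypothetical registry held by the messaging server: message type -> form.
FORM_REGISTRY: Dict[str, str] = {
    "email": DEFAULT_FORM,
    "VMT": "fbui_form",  # the voicemail type maps to the FBUI form
}


@dataclass(frozen=True)
class StoredMessage:
    """Hypothetical stored message carrying the type property set by the CS."""
    subject: str
    message_type: str


def form_for(message: StoredMessage) -> str:
    """Return the form the message browser should request for this message."""
    return FORM_REGISTRY.get(message.message_type, DEFAULT_FORM)


voicemail = StoredMessage("Voicemail from 555-0100", "VMT")
ordinary = StoredMessage("Lunch?", "email")
```

Falling back to the default form for unrecognized types is a design choice of the sketch, made so that an unknown message still opens as ordinary mail.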
The message browser uses the data or code of the FBUI form to display the FBUI on the user's client device. FIG. 14 is a sample FBUI 1400 as displayed on a client device, according to one embodiment. The FBUI 1400 includes three areas 1402, 1404, and 1406 that present information to the user. The areas include a folder area 1402, a content area 1404, and a function/information area 1406, but are not limited to these areas, as the UI of alternative embodiments may present any number and/or type of areas. In alternative embodiments, all three areas 1402-1406 can be presented simultaneously, as shown in the FBUI 1400, or subsets of the three areas can be presented in various combinations.
Folder area 1402 presents one or more folders that a user may access via FBUI 1400 and the client device. The "INBOX" may contain voicemail messages in the same list as other messages, including email messages. Alternatively, the INBOX may include a subfolder ("VOICE MESSAGES") that contains voicemail messages, and selection of this folder results in the presentation of the voicemail messages of the user's mailbox in the content area 1404.
The content area 1404 generally presents the contents of the folder selected using the folder area 1402. By way of example, when the INBOX or VOICE MESSAGES folder is selected, the content area 1404 presents information corresponding to any number of voicemail messages in the user's mailbox. Content area 1404 allows the user to select a particular voicemail message, for example, by placing a cursor over "VOICE MESSAGE 1 INFORMATION". Function/information area 1406 may be displayed by double-clicking on a message in content area 1404 or otherwise instructing the message browser to display a voice message.
The function/information area 1406 of the FBUI 1400 presents a rough transcription as shown. Function/information area 1406 further presents one or more "voicemail action buttons" 1408 (also referred to herein as "buttons"), each of which represents an action that the user may select for a voicemail message. In this example, the VOICE MESSAGES folder is selected, and selecting a message in the content area 1404 allows the user to take action on the selected message using the buttons shown. Placing the cursor of the content area 1404 over a particular message and selecting an action with a button invokes an operation on that message via a component of the ICS (e.g., CS, cache, IM). The buttons of one embodiment include a "play on the phone" button, a "get refined transcription" button, a "call sender" button, a "reply by voicemail" button, and a "forward by voicemail" button, but the embodiment is not limited to the same number of buttons or to buttons providing the same functionality.
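The mapping from a selected button to an operation on the selected message can be sketched as a lookup table plus a small request builder. The button labels are those named above; the operation codes, the request dictionary, and the function `build_action_request` are assumptions, since the text does not describe the request format sent to the ICS components.

```python
from typing import Dict

# Button labels come from the text; the operation codes are hypothetical.
BUTTON_TO_OPERATION: Dict[str, str] = {
    "play on the phone": "PLAY_ON_PHONE",
    "get refined transcription": "REQUEST_REFINED_TRANSCRIPTION",
    "call sender": "CALL_SENDER",
    "reply by voicemail": "REPLY_BY_VOICEMAIL",
    "forward by voicemail": "FORWARD_BY_VOICEMAIL",
}


def build_action_request(button_label: str, message_id: str, user: str) -> Dict[str, str]:
    """Translate a button press on the selected message into a request for the ICS."""
    try:
        operation = BUTTON_TO_OPERATION[button_label.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown voicemail action button: {button_label!r}") from None
    return {"operation": operation, "message_id": message_id, "user": user}


request = build_action_request("Get refined transcription", "msg-42", "USER Z")
```

Keeping the table separate from the builder makes it straightforward to add or remove buttons, which fits the statement that embodiments are not limited to this set.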
In other embodiments, the presentation of the area or information of the FBUI can be changed in a variety of ways. For example, in one embodiment, the action button appears after user selection (e.g., by double-clicking a particular voice message from the content area 1404). The action button may also appear when the user right-clicks on a particular voice message in the content area 1404.
Folder area 1402 may also include a subfolder ("VOICE MESSAGE SYSTEM") below the public folder. The VOICE MESSAGE SYSTEM folder need not be an actual folder; it may instead be a uniform resource locator ("URL") that, when selected, sends an HTTP request to the web server and launches/displays the ICS browser inside the client device message browser. The web server may be, for example, a component of the CS and/or IM, but is not so limited. The ICS browser is an embedded or hidden browser that displays the function/information area 1406 in the area of the client device message browser where email would normally appear, while voicemail messages are displayed in the function/information area 1406.
By way of example, the function/information area 1406 is displayed in the content area 1404 of one embodiment. Function/information area 1406 may be served by the IM and may contain any information related to the user-specific voice messaging system. In one embodiment, function/information area 1406 displays a user login prompt where the user enters the user's name and PIN code. The system then displays the user's configuration data, such as PIN code, operator extension, greeting type, and other applicable information.
The hidden browser enables an HTTP link and communicates, for example, with an IM, which in turn relays communication (via HTTP) with the CS, for example, via a CS Web server. Thus, while typical messaging servers and LANs use security policies that restrict the use of "special" code in the form data, using a hidden browser embedded in the host system's local form structure can overcome this restriction because the browser is not detected or treated as special code. Thus, the use of a hidden browser supports communication with corresponding browser controls in the CS and/or IM, thereby allowing integration of voicemail messaging provided by the CS with email messaging systems of the enterprise network.
A "voicemail message" in ICS is generally any message that is generated using a client device that generates an audio stream. A "voicemail message" is also any voice-type message, such as those generated using the "reply by voice message" and "forward by voice message" buttons of the FBUI. An "email" is any message that is generated using a button of a host mail message system for generating a reply message or for forwarding a message in response to receipt of the message, even if a voicemail message is being replied to or forwarded. The ICS of one embodiment presents voicemail messages to users in an email message system using the FBUI as a presentation form.
The components of the ICS described above include any collection of computing components and devices working together. A component of an ICS may also be a component or subsystem within a larger computer system or network. The ICS components can also be coupled in any number of combinations between any number of components (not shown), such as other buses, controllers, memory devices, and data input/output (I/O) devices. Further, components of the ICS may be distributed among any number/combination of other processor-based components. Further details of a system including an ICS and FBUI and adapted to embody the invention claimed herein are described in U.S. patent application No. 11/053,271, entitled "Integrated Multi-Media Communication System," filed on February 7, 2005, which is incorporated herein by reference.
Aspects of the systems and methods described herein may be implemented as functionality programmed into any of a variety of circuitry, including Programmable Logic Devices (PLDs), such as Field Programmable Gate Arrays (FPGAs), Programmable Array Logic (PAL) devices, electrically programmable logic and memory devices and standard cell-based devices, as well as Application Specific Integrated Circuits (ASICs). Some other possibilities for implementing aspects of the system include: microcontrollers with memory (such as electrically erasable programmable read-only memory (EEPROM)), embedded microprocessors, firmware, software, and the like. Further, aspects of the system may be embodied in microprocessors having software-based circuit emulation, discrete logic (sequential and combinatorial), custom devices, fuzzy (neural) logic, quantum devices, and hybrids of any of the above device types. Of course, the underlying device technologies may be provided in a variety of component types, for example, Metal Oxide Semiconductor Field Effect Transistor (MOSFET) technologies like Complementary Metal Oxide Semiconductor (CMOS), bipolar technologies like Emitter Coupled Logic (ECL), polymer technologies (e.g., silicon-conjugated polymer and metal-conjugated polymer-metal structures), mixed analog and digital technologies, and so on.
It should be noted that the various functions or processes disclosed herein may be described as data and/or instructions embodied in various computer-readable media, in terms of their behavioral, register transfer, logic component, transistor, layout geometry, and/or other characteristics. Computer-readable media in which such formatted data and/or instructions may be embodied include, but are not limited to, non-volatile storage media in various forms (e.g., optical, magnetic or semiconductor storage media) and media that may be used to transmit such data and/or instructions wirelessly or optically.
Unless the context clearly requires otherwise, throughout the specification and claims, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, in the sense of "including but not limited to". Words using the singular or plural number also include the plural or singular number, respectively. Additionally, the words "herein," "hereinafter," "above," "below," and words of similar import refer to this application as a whole and not to any particular portions of this application. When the word "or" is used in reference to a list of two or more entries, the word covers all of the following interpretations: any entry in the list, all entries in the list, and any combination of entries in the list.
The above description of illustrated embodiments of the systems and methods is not intended to be exhaustive or to limit the systems and methods to the precise form disclosed. Although specific embodiments of, and examples for, the F/T module are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the systems and methods, as those skilled in the relevant art will recognize. The teachings of the systems and methods provided herein are not only for the above-described systems and methods, but are also applicable to other processing systems and methods.
The elements and acts of the various embodiments described above can be combined to provide further embodiments. These and other changes can be made to the systems and methods in light of the above-detailed description.
In general, in the following claims, the terms used should not be construed to limit the systems and methods to the specific embodiments disclosed in the specification and the claims, but should be construed to include all processing systems that operate in accordance with the claims. Accordingly, the system and method are not limited by the disclosure, but instead the scope of the system and method is to be determined entirely by the claims.
While certain aspects of the systems and methods are presented below in certain claim forms, the inventors contemplate the various aspects of the systems and methods in any number of claim forms. For example, while only one aspect of the systems and methods may be expressed as embodied in a machine-readable medium, other aspects may likewise be embodied in a machine-readable medium. Accordingly, the inventors reserve the right to add additional claims after filing the application to pursue such additional claim forms for other aspects of the systems and methods.

Claims (23)

1. An integrated messaging method, comprising:
receiving audio data via a first network, wherein the audio data comprises a first type of message sent by a caller to a user;
converting the audio data to a first format;
filtering the audio data, the filtering step including searching for a predetermined word in the audio data;
converting the audio data into a second format, wherein converting audio data comprises generating a rough transcription based on predetermined words in the audio data, wherein the rough transcription facilitates the user in determining an appropriate response to the first type of message;
generating a second type of message, wherein the second type of message includes the converted audio data having the second format and the audio data having the first format;
sending the message of the second type to the user via a second network;
receiving a request from the user to provide the user with a refined transcription of the first type of message;
placing audio data having the first format on a file server in response to the request; and
sending a notification to a transcriber, wherein the notification includes the request and an instruction to the file server.
2. The integrated messaging method of claim 1, further comprising:
generating a priority flag based on the results of the filtering step; and
including the priority flag in the second type of message.
3. The integrated messaging method of claim 1, wherein the receiving step comprises determining whether to filter the audio data based on previously set user preferences.
4. The integrated messaging method of claim 1, wherein the predetermined word is included in a word list.
5. The integrated messaging method of claim 4, wherein the word list includes a predetermined set of words and words added by the user.
6. The integrated messaging method of claim 1, wherein the filtering step further comprises accessing data about the user from a plurality of sources within an enterprise, the plurality of sources including a user list, a global address list, a public folder, and a personal contacts folder.
7. The integrated messaging method of claim 1, further comprising sending the message of the second type to one or more recipients other than the user based on previously set user preferences.
8. The integrated messaging method of claim 1, wherein the first type of message is a voicemail message, and wherein the second type of message is an email message.
9. The integrated messaging method of claim 8, wherein the first format is an electronic audio format, including a WAV format.
10. The integrated messaging method of claim 8, wherein the second format is text.
11. The integrated messaging method of claim 1, further comprising: in response to the request, sending a third type of message to a transcriber, wherein the third type of message includes the request and the audio data having the first format.
12. The integrated messaging method of claim 11, wherein the third type of message is an instant message, and wherein the transcriber is capable of listening to audio data in the first format and typing the refined transcription into the instant message.
13. The integrated messaging method of claim 12, further comprising:
receiving the instant message with the refined transcription from the transcriber; and
editing the second type of message to include the refined transcription in place of the rough transcription.
14. The integrated messaging method of claim 13, further comprising:
resetting the status of the second type of message to "unread".
15. The integrated messaging method of claim 14, further comprising:
sending a predetermined notification to the user with the message of the second type, wherein the predetermined notification notifies the user that the message of the second type now includes the refined transcription.
16. The integrated messaging method of claim 15, wherein the notification comprises one or more of a priority flag and an audio alert.
17. The integrated messaging method of claim 1, further comprising presenting a web page to the transcriber, wherein the web page includes a user interface that enables the transcriber to listen to the audio data and type the refined transcription into the web page.
18. The integrated messaging method of claim 17, wherein the user interface further enables the transcriber to send the refined transcription to the file server.
19. The integrated messaging method of claim 18, further comprising:
retrieving the refined transcription from the file server; and
editing the second type of message to include the refined transcription in place of the rough transcription.
20. The integrated messaging method of claim 19, further comprising:
resetting the status of the second type of message to "unread".
21. The integrated messaging method of claim 20, further comprising sending a predetermined notification to the user with the message of the second type, wherein the predetermined notification notifies the user that the message of the second type now includes the refined transcription.
22. The integrated messaging method of claim 21, wherein the notification comprises one or more of a priority flag and an audio alert.
23. The integrated messaging method of claim 6, wherein the first type of message is a voicemail message and the second type of message is an email message, the method further comprising accessing a voicemail privacy policy associated with the user and applying the policy to the second type of message.
HK10110280.6A 2007-02-21 2008-01-11 Voicemail filtering and transcription HK1143874B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US11/709,513 2007-02-21
US11/709,513 US8107598B2 (en) 2007-02-21 2007-02-21 Voicemail filtering and transcription
PCT/US2008/050842 WO2008103507A1 (en) 2007-02-21 2008-01-11 Voicemail filtering and transcription

Publications (2)

Publication Number Publication Date
HK1143874A1 HK1143874A1 (en) 2011-01-14
HK1143874B HK1143874B (en) 2014-07-25
