US6928407B2 - System and method for the automatic discovery of salient segments in speech transcripts - Google Patents
- Publication number: US6928407B2 (application US10/109,960)
- Authority: US (United States)
- Prior art keywords: segments, segmentation, features, speech, segment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime, expires
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1822—Parsing for meaning understanding
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99937—Sorting
Definitions
- Statistical Machine Learning literature refers to this task as text categorization, and partitions it into supervised and unsupervised methods.
- Supervised text categorization refers to the automatic assignment of topics to text collections when sample training data is available for each topic in a predefined topic set.
- Unsupervised text categorization methods do not use a predefined topic set with sample training data; instead, new documents are assigned topics following an unsupervised training phase.
- Query-driven topic identification, often referred to as Topic Distillation, has received considerable attention with the ubiquity of the Web.
- In the second phase, it is possible to modify the features (n-grams or technical terms) based on the genre of the input signal.
- The minimum number of features (technical terms) within a segment could be set to, for example, 15 to 20 n-grams for a video that is approximately 1 hour long.
- The minimum number of features in the segment is set equal to all the technical terms used plus a predefined number of top content-bearing words, for example 5 to 15 words. In this case, the source video could be shorter than one hour.
- The automatic speech transcript module 230 automatically produces a textual representation of the audio input.
- The automatic speech transcript module 230 can also be referred to as an automatic speech recognition (ASR) module.
- The pruning module 245 improves the feature set generated by the feature extractor 240 by eliminating less content-bearing words or less differentiating features.
- process 900 returns to step 902 as explained earlier to analyze the next dense segment. Otherwise, at step 944, process 900 selects the sub-segment with the best fit, for example S_{k-3}, and tries to extend the fit left and right (expanding the beginning and end of S_{k-3}) towards the adjacent sub-segments to define the new knots identified by this extension. These new knots divide the current segment into sub-segments of smaller size.
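The pruning step described in the definitions above (removing less content-bearing words and less differentiating features) can be illustrated with a minimal sketch. The stop-word list, the `max_segment_ratio` cutoff, and the function name `prune_features` are assumptions made for illustration; this excerpt does not state how the pruning module 245 actually decides which features to drop.

```python
from collections import Counter

# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it"}


def prune_features(segments, max_segment_ratio=0.8):
    """Drop stop words and features that occur in too many segments.

    segments: list of lists of candidate features (words or n-grams).
    max_segment_ratio: assumed cutoff; a feature found in more than this
    fraction of segments is treated as non-differentiating.
    """
    n = len(segments)
    # Count in how many segments each feature appears at least once.
    seg_freq = Counter(f for seg in segments for f in set(seg))
    pruned = []
    for seg in segments:
        kept = [
            f for f in seg
            if f not in STOP_WORDS and seg_freq[f] / n <= max_segment_ratio
        ]
        pruned.append(kept)
    return pruned
```

With three segments that all contain "speech", that feature is removed everywhere because it does not help tell the segments apart, while features unique to one segment are kept.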
Landscapes
- Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
- “Adj” is an adjective, but not a determiner;
- “Noun” is a noun, but not a pronoun;
- “Prep” is a preposition;
- “*” means zero or more occurrences; and
- “+” means at least one occurrence.
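The notation above (Adj, Noun, Prep, `*`, `+`) describes a part-of-speech pattern whose full expression is not reproduced in this excerpt. The sketch below shows how such a pattern could be matched against a tagged token sequence. The specific pattern `((Adj|Noun)+ | (Adj|Noun)* (Noun Prep)? (Adj|Noun)*) Noun`, the two-token minimum, and the names `TAG_CODES`, `TERM_PATTERN`, and `find_terms` are assumptions chosen for illustration, not the patent's stated expression.

```python
import re

# Map each part-of-speech tag to one character so that a regular
# expression over the code string lines up with token positions.
TAG_CODES = {"Adj": "A", "Noun": "N", "Prep": "P"}

# Assumed illustrative pattern: ((A|N)+ | (A|N)*(NP)?(A|N)*) N
TERM_PATTERN = re.compile(r"(?:[AN]+|[AN]*(?:NP)?[AN]*)N")


def find_terms(tagged):
    """Return candidate technical terms from (word, tag) pairs.

    Tags outside TAG_CODES are encoded as 'x' and never match.
    Single-word matches are skipped (an assumption of this sketch).
    """
    codes = "".join(TAG_CODES.get(tag, "x") for _, tag in tagged)
    terms = []
    for m in TERM_PATTERN.finditer(codes):
        if m.end() - m.start() >= 2:
            words = [w for w, _ in tagged[m.start():m.end()]]
            terms.append(" ".join(words))
    return terms
```

Encoding tags as single characters keeps the match offsets identical to token indices, so the matched words can be sliced directly from the input.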
- Si is the i-th speech segment;
- R(Si) is the rank of the i-th speech segment;
- Li is the length of the i-th speech segment; and
- fk is the k-th feature within segment Si.
The foregoing equation provides a score for each segment where the highest scoring segment corresponds to the most important topic discussed in the input source.
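The ranking equation itself is not reproduced in this excerpt, so the following is only a sketch of how segments might be ordered by a score built from their features and length. The scoring rule (sum of per-feature weights divided by segment length), the `weights` mapping, and the function name `rank_segments` are all assumptions for illustration and should not be read as the patent's formula.

```python
def rank_segments(segments, weights):
    """Order segments from highest to lowest score.

    segments: list of (segment_id, length, features) tuples, where
        length plays the role of Li and features the role of the fk.
    weights: hypothetical mapping from feature to a numeric weight.
    The score used here, sum(weights) / length, is an assumed stand-in
    for the ranking equation, which this excerpt does not include.
    """
    scored = []
    for seg_id, length, features in segments:
        total = sum(weights.get(f, 0.0) for f in features)
        score = total / length if length > 0 else 0.0
        scored.append((seg_id, score))
    # Highest score first: the top entry is the most salient segment.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Under this assumed rule, a short segment dense in highly weighted features outranks a longer segment with the same features spread more thinly.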
Claims (22)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/109,960 US6928407B2 (en) | 2002-03-29 | 2002-03-29 | System and method for the automatic discovery of salient segments in speech transcripts |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/109,960 US6928407B2 (en) | 2002-03-29 | 2002-03-29 | System and method for the automatic discovery of salient segments in speech transcripts |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20030187642A1 (en) | 2003-10-02 |
| US6928407B2 (en) | 2005-08-09 |
Family
ID=28453204
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/109,960 Expired - Lifetime US6928407B2 (en) | 2002-03-29 | 2002-03-29 | System and method for the automatic discovery of salient segments in speech transcripts |
Country Status (1)
| Country | Link |
|---|---|
| US (1) | US6928407B2 (en) |
Cited By (39)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020196679A1 (en) * | 2001-03-13 | 2002-12-26 | Ofer Lavi | Dynamic natural language understanding |
| US20050102135A1 (en) * | 2003-11-12 | 2005-05-12 | Silke Goronzy | Apparatus and method for automatic extraction of important events in audio signals |
| US20050160449A1 (en) * | 2003-11-12 | 2005-07-21 | Silke Goronzy | Apparatus and method for automatic dissection of segmented audio signals |
| US20060155706A1 (en) * | 2005-01-12 | 2006-07-13 | Kalinichenko Boris O | Context-adaptive content distribution to handheld devices |
| US20070083367A1 (en) * | 2005-10-11 | 2007-04-12 | Motorola, Inc. | Method and system for bandwidth efficient and enhanced concatenative synthesis based communication |
| US20070156392A1 (en) * | 2005-12-30 | 2007-07-05 | International Business Machines Corporation | Method and system for automatically building natural language understanding models |
| US20070185702A1 (en) * | 2006-02-09 | 2007-08-09 | John Harney | Language independent parsing in natural language systems |
| US20070239445A1 (en) * | 2006-04-11 | 2007-10-11 | International Business Machines Corporation | Method and system for automatic transcription prioritization |
| US20070250777A1 (en) * | 2006-04-25 | 2007-10-25 | Cyberlink Corp. | Systems and methods for classifying sports video |
| US20080154897A1 (en) * | 2006-11-20 | 2008-06-26 | Siemens Medical Solution Usa, Inc. | Automated Interpretation and Replacement of Date References in Unstructured Text |
| US20090063150A1 (en) * | 2007-08-27 | 2009-03-05 | International Business Machines Corporation | Method for automatically identifying sentence boundaries in noisy conversational data |
| US20090112588A1 (en) * | 2007-10-31 | 2009-04-30 | International Business Machines Corporation | Method for segmenting communication transcripts using unsupervsed and semi-supervised techniques |
| US20090157384A1 (en) * | 2007-12-12 | 2009-06-18 | Microsoft Corporation | Semi-supervised part-of-speech tagging |
| US20090313025A1 (en) * | 2002-03-29 | 2009-12-17 | At&T Corp. | Automatic Segmentation in Speech Synthesis |
| US20110013756A1 (en) * | 2009-07-15 | 2011-01-20 | Google Inc. | Highlighting of Voice Message Transcripts |
| US20110093263A1 (en) * | 2009-10-20 | 2011-04-21 | Mowzoon Shahin M | Automated Video Captioning |
| US8005676B2 (en) * | 2006-09-29 | 2011-08-23 | Verint Americas, Inc. | Speech analysis using statistical learning |
| US20110238416A1 (en) * | 2010-03-24 | 2011-09-29 | Microsoft Corporation | Acoustic Model Adaptation Using Splines |
| US20110246183A1 (en) * | 2008-12-15 | 2011-10-06 | Kentaro Nagatomo | Topic transition analysis system, method, and program |
| US20120011109A1 (en) * | 2010-07-09 | 2012-01-12 | Comcast Cable Communications, Llc | Automatic Segmentation of Video |
| US20120209605A1 (en) * | 2011-02-14 | 2012-08-16 | Nice Systems Ltd. | Method and apparatus for data exploration of interactions |
| US20120209606A1 (en) * | 2011-02-14 | 2012-08-16 | Nice Systems Ltd. | Method and apparatus for information extraction from interactions |
| US20130054612A1 (en) * | 2006-10-10 | 2013-02-28 | Abbyy Software Ltd. | Universal Document Similarity |
| US8417223B1 (en) | 2010-08-24 | 2013-04-09 | Google Inc. | Advanced voicemail features without carrier voicemail support |
| US20140025370A1 (en) * | 2008-11-10 | 2014-01-23 | Apple Inc. | Data detection |
| US20140101171A1 (en) * | 2012-10-10 | 2014-04-10 | Abbyy Infopoisk Llc | Similar Document Search |
| US8700404B1 (en) * | 2005-08-27 | 2014-04-15 | At&T Intellectual Property Ii, L.P. | System and method for using semantic and syntactic graphs for utterance classification |
| US20140129212A1 (en) * | 2006-10-10 | 2014-05-08 | Abbyy Infopoisk Llc | Universal Difference Measure |
| US8892447B1 (en) * | 2011-10-25 | 2014-11-18 | Nuance Communications, Inc. | Quality assessment of text derived from an audio signal |
| US9495358B2 (en) | 2006-10-10 | 2016-11-15 | Abbyy Infopoisk Llc | Cross-language text clustering |
| US9626353B2 (en) | 2014-01-15 | 2017-04-18 | Abbyy Infopoisk Llc | Arc filtering in a syntactic graph |
| US9626358B2 (en) | 2014-11-26 | 2017-04-18 | Abbyy Infopoisk Llc | Creating ontologies by analyzing natural language texts |
| US9633005B2 (en) | 2006-10-10 | 2017-04-25 | Abbyy Infopoisk Llc | Exhaustive automatic processing of textual information |
| US9734820B2 (en) | 2013-11-14 | 2017-08-15 | Nuance Communications, Inc. | System and method for translating real-time speech using segmentation based on conjunction locations |
| US9740682B2 (en) | 2013-12-19 | 2017-08-22 | Abbyy Infopoisk Llc | Semantic disambiguation using a statistical analysis |
| US9817818B2 (en) | 2006-10-10 | 2017-11-14 | Abbyy Production Llc | Method and system for translating sentence between languages based on semantic structure of the sentence |
| US10446135B2 (en) * | 2014-07-09 | 2019-10-15 | Genesys Telecommunications Laboratories, Inc. | System and method for semantically exploring concepts |
| US11138978B2 (en) | 2019-07-24 | 2021-10-05 | International Business Machines Corporation | Topic mining based on interactionally defined activity sequences |
| US11475668B2 (en) | 2020-10-09 | 2022-10-18 | Bank Of America Corporation | System and method for automatic video categorization |
Families Citing this family (59)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7253919B2 (en) | 2000-11-30 | 2007-08-07 | Ricoh Co., Ltd. | Printer with embedded retrieval and publishing interface |
| US7747655B2 (en) * | 2001-11-19 | 2010-06-29 | Ricoh Co. Ltd. | Printable representations for time-based media |
| US7424129B2 (en) | 2001-11-19 | 2008-09-09 | Ricoh Company, Ltd | Printing system with embedded audio/video content recognition and processing |
| US7861169B2 (en) | 2001-11-19 | 2010-12-28 | Ricoh Co. Ltd. | Multimedia print driver dialog interfaces |
| US20040148170A1 (en) * | 2003-01-23 | 2004-07-29 | Alejandro Acero | Statistical classifiers for spoken language understanding and command/control scenarios |
| US8335683B2 (en) * | 2003-01-23 | 2012-12-18 | Microsoft Corporation | System for using statistical classifiers for spoken language understanding |
| US20040177317A1 (en) * | 2003-03-07 | 2004-09-09 | John Bradstreet | Closed caption navigation |
| US6931374B2 (en) * | 2003-04-01 | 2005-08-16 | Microsoft Corporation | Method of speech recognition using variational inference with switching state space models |
| US7275159B2 (en) | 2003-08-11 | 2007-09-25 | Ricoh Company, Ltd. | Multimedia output device having embedded encryption functionality |
| US7505163B2 (en) | 2003-09-25 | 2009-03-17 | Ricoh Co., Ltd. | User interface for networked printer |
| US7508535B2 (en) | 2003-09-25 | 2009-03-24 | Ricoh Co., Ltd. | Stand alone multimedia printer with user interface for allocating processing |
| JP2005108230A (en) | 2003-09-25 | 2005-04-21 | Ricoh Co Ltd | Audio / video content recognition / processing function built-in printing system |
| US8077341B2 (en) | 2003-09-25 | 2011-12-13 | Ricoh Co., Ltd. | Printer with audio or video receiver, recorder, and real-time content-based processing logic |
| US7573593B2 (en) | 2003-09-25 | 2009-08-11 | Ricoh Company, Ltd. | Printer with hardware and software interfaces for media devices |
| US7511846B2 (en) | 2003-09-25 | 2009-03-31 | Ricoh Co., Ltd. | Printer having embedded functionality for printing time-based media |
| US7864352B2 (en) | 2003-09-25 | 2011-01-04 | Ricoh Co. Ltd. | Printer with multimedia server |
| US7528977B2 (en) | 2003-09-25 | 2009-05-05 | Ricoh Co., Ltd. | Printer with hardware and software interfaces for peripheral devices |
| US7528976B2 (en) | 2003-09-25 | 2009-05-05 | Ricoh Co., Ltd. | Stand alone printer with hardware/software interfaces for sharing multimedia processing |
| US7440126B2 (en) | 2003-09-25 | 2008-10-21 | Ricoh Co., Ltd | Printer with document-triggered processing |
| US7570380B2 (en) | 2003-09-25 | 2009-08-04 | Ricoh Company, Ltd. | Printer user interface |
| WO2005069158A2 (en) * | 2004-01-16 | 2005-07-28 | Nec Corp | Text-processing method, program, program recording medium, and device thereof |
| US8274666B2 (en) | 2004-03-30 | 2012-09-25 | Ricoh Co., Ltd. | Projector/printer for displaying or printing of documents |
| US7603615B2 (en) | 2004-03-30 | 2009-10-13 | Ricoh Co., Ltd. | Multimedia projector-printer |
| JP4220449B2 (en) * | 2004-09-16 | 2009-02-04 | 株式会社東芝 | Indexing device, indexing method, and indexing program |
| US7551312B1 (en) | 2005-03-17 | 2009-06-23 | Ricoh Co., Ltd. | Annotable document printer |
| US8433558B2 (en) | 2005-07-25 | 2013-04-30 | At&T Intellectual Property Ii, L.P. | Methods and systems for natural language understanding using human knowledge and collected data |
| US8036889B2 (en) * | 2006-02-27 | 2011-10-11 | Nuance Communications, Inc. | Systems and methods for filtering dictated and non-dictated sections of documents |
| US8631005B2 (en) | 2006-12-28 | 2014-01-14 | Ebay Inc. | Header-token driven automatic text segmentation |
| JP2008197229A (en) * | 2007-02-09 | 2008-08-28 | Konica Minolta Business Technologies Inc | Speech recognition dictionary construction device and program |
| US8428360B2 (en) * | 2007-11-01 | 2013-04-23 | International Business Machines Corporation | System and method for real-time new event detection on video streams |
| US8422787B2 (en) * | 2007-12-27 | 2013-04-16 | Nec Corporation | Apparatus, method and program for text segmentation |
| US20110136085A1 (en) * | 2009-12-09 | 2011-06-09 | Gondy Leroy | Computer based system and method for assisting an interviewee in remembering and recounting information about a prior event using a cognitive interview and natural language processing |
| US8494852B2 (en) | 2010-01-05 | 2013-07-23 | Google Inc. | Word-level correction of speech input |
| US8180778B1 (en) * | 2010-02-05 | 2012-05-15 | Google Inc. | Generating action trails from web history |
| US8788260B2 (en) * | 2010-05-11 | 2014-07-22 | Microsoft Corporation | Generating snippets based on content features |
| US9412372B2 (en) * | 2012-05-08 | 2016-08-09 | SpeakWrite, LLC | Method and system for audio-video integration |
| US9311914B2 (en) * | 2012-09-03 | 2016-04-12 | Nice-Systems Ltd | Method and apparatus for enhanced phonetic indexing and search |
| US20140214402A1 (en) * | 2013-01-25 | 2014-07-31 | Cisco Technology, Inc. | Implementation of unsupervised topic segmentation in a data communications environment |
| US9396256B2 (en) | 2013-12-13 | 2016-07-19 | International Business Machines Corporation | Pattern based audio searching method and system |
| IL230741B (en) * | 2014-01-30 | 2019-11-28 | Verint Systems Ltd | Systems and methods for keyword spotting using alternating search algorithms |
| US9892194B2 (en) * | 2014-04-04 | 2018-02-13 | Fujitsu Limited | Topic identification in lecture videos |
| FR3028969A1 (en) * | 2014-11-24 | 2016-05-27 | Orange | NAVIGATION METHOD IN SOUND CONTENT, CORRESPONDING COMPUTER DEVICE AND PROGRAM. |
| US10019514B2 (en) * | 2015-03-19 | 2018-07-10 | Nice Ltd. | System and method for phonetic search over speech recordings |
| US10353905B2 (en) * | 2015-04-24 | 2019-07-16 | Salesforce.Com, Inc. | Identifying entities in semi-structured content |
| EP3089159B1 (en) | 2015-04-28 | 2019-08-28 | Google LLC | Correcting voice recognition using selective re-speak |
| US20170092277A1 (en) * | 2015-09-30 | 2017-03-30 | Seagate Technology Llc | Search and Access System for Media Content Files |
| US20170294185A1 (en) * | 2016-04-08 | 2017-10-12 | Knuedge Incorporated | Segmentation using prior distributions |
| US11328159B2 (en) * | 2016-11-28 | 2022-05-10 | Microsoft Technology Licensing, Llc | Automatically detecting contents expressing emotions from a video and enriching an image index |
| US10372816B2 (en) * | 2016-12-13 | 2019-08-06 | International Business Machines Corporation | Preprocessing of string inputs in natural language processing |
| US11322148B2 (en) * | 2019-04-30 | 2022-05-03 | Microsoft Technology Licensing, Llc | Speaker attributed transcript generation |
| CN111918145B (en) * | 2019-05-07 | 2022-09-09 | 华为技术有限公司 | Video segmentation method and video segmentation device |
| CN110704633B (en) * | 2019-09-04 | 2023-07-21 | 平安科技(深圳)有限公司 | Named entity recognition method, named entity recognition device, named entity recognition computer equipment and named entity recognition storage medium |
| US11270061B2 (en) * | 2020-02-25 | 2022-03-08 | International Business Machines Corporation | Automatic generation of training data for scientific paper summarization using videos |
| CN111510765B (en) * | 2020-04-30 | 2021-10-22 | 浙江蓝鸽科技有限公司 | Audio label intelligent labeling method and device based on teaching video and storage medium |
| KR102314564B1 (en) * | 2021-03-08 | 2021-10-19 | 주식회사 덴컴 | Method for managing chart using speech recognition and apparatus using the same |
| US11682415B2 (en) * | 2021-03-19 | 2023-06-20 | International Business Machines Corporation | Automatic video tagging |
| KR102781076B1 (en) * | 2021-09-02 | 2025-03-18 | 아주대학교산학협력단 | Method and device of segmenting topics of contents |
| US12346361B2 (en) * | 2023-11-16 | 2025-07-01 | Adobe Inc. | Hierarchical segmentation of unstructured text using neural networks |
| US20250298835A1 (en) * | 2024-03-19 | 2025-09-25 | Tubertdata LLC | Methods And Systems For Personalized Transcript Searching And Indexing Of Online Multimedia |
Citations (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5371807A (en) | 1992-03-20 | 1994-12-06 | Digital Equipment Corporation | Method and apparatus for text classification |
| US5664227A (en) * | 1994-10-14 | 1997-09-02 | Carnegie Mellon University | System and method for skimming digital audio/video data |
| US5708767A (en) | 1995-02-03 | 1998-01-13 | The Trustees Of Princeton University | Method and apparatus for video browsing based on content and structure |
| US5732260A (en) | 1994-09-01 | 1998-03-24 | International Business Machines Corporation | Information retrieval system and method |
| US5835667A (en) * | 1994-10-14 | 1998-11-10 | Carnegie Mellon University | Method and apparatus for creating a searchable digital video library and a system and method of using such a library |
| US5990980A (en) | 1997-12-23 | 1999-11-23 | Sarnoff Corporation | Detection of transitions in video sequences |
| US6104989A (en) * | 1998-07-29 | 2000-08-15 | International Business Machines Corporation | Real time detection of topical changes and topic identification via likelihood based methods |
| US6314399B1 (en) * | 1998-06-12 | 2001-11-06 | Atr Interpreting Telecommunications Research | Apparatus for generating a statistical sequence model called class bi-multigram model with bigram dependencies assumed between adjacent sequences |
| US6424946B1 (en) * | 1999-04-09 | 2002-07-23 | International Business Machines Corporation | Methods and apparatus for unknown speaker labeling using concurrent speech recognition, segmentation, classification and clustering |
| US6529902B1 (en) * | 1999-11-08 | 2003-03-04 | International Business Machines Corporation | Method and system for off-line detection of textual topical changes and topic identification via likelihood based methods for improved language modeling |
| US6636238B1 (en) * | 1999-04-20 | 2003-10-21 | International Business Machines Corporation | System and method for linking an audio stream with accompanying text material |
| US6714909B1 (en) * | 1998-08-13 | 2004-03-30 | At&T Corp. | System and method for automated multimedia content indexing and retrieval |
- 2002-03-29: US application US10/109,960, patent US6928407B2; status: not active, Expired - Lifetime
Non-Patent Citations (5)
| Title |
|---|
| James Allan, "Topic Detection and Tracking Pilot Study Final Report," DARPA Broadcast News Transcription and Understanding Workshop, Feb. 1998. |
| S. Dharanipragada, M. Franz, J.S. McCarley, S. Roukos, T. Ward, "Story Segmentation and Topic Detection for Recognized Speech," Proceedings of Eurospeech, Budapest, Hungary, Sep. 1999, pp. 2435-2438. |
| Slaney et al., "Hierarchical segmentation using latent semantic indexing in scale space," 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, May 7-11, 2001, vol. 3, pp. 1437 to 1440. * |
| Takao et al., "Segmentation and classification of TV news articles based on speech dictation," TENCON 99. Proceedings of the IEEE Region 10 Conference. Sep. 15-17, 1999, vol. 1, pp. 92 to 95. * |
| Yamron et al., "Event tracking and text segmentation via hidden Markov models," 1997 IEEE Workshop on Automatic Speech Recognition and Understanding, Dec. 14-17, 1997, pp. 519 to 526. * |
Cited By (70)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20020196679A1 (en) * | 2001-03-13 | 2002-12-26 | Ofer Lavi | Dynamic natural language understanding |
| US20080154581A1 (en) * | 2001-03-13 | 2008-06-26 | Intelligate, Ltd. | Dynamic natural language understanding |
| US7840400B2 (en) | 2001-03-13 | 2010-11-23 | Intelligate, Ltd. | Dynamic natural language understanding |
| US7216073B2 (en) * | 2001-03-13 | 2007-05-08 | Intelligate, Ltd. | Dynamic natural language understanding |
| US20070112555A1 (en) * | 2001-03-13 | 2007-05-17 | Ofer Lavi | Dynamic Natural Language Understanding |
| US20070112556A1 (en) * | 2001-03-13 | 2007-05-17 | Ofer Lavi | Dynamic Natural Language Understanding |
| US20090313025A1 (en) * | 2002-03-29 | 2009-12-17 | At&T Corp. | Automatic Segmentation in Speech Synthesis |
| US8131547B2 (en) * | 2002-03-29 | 2012-03-06 | At&T Intellectual Property Ii, L.P. | Automatic segmentation in speech synthesis |
| US8635065B2 (en) * | 2003-11-12 | 2014-01-21 | Sony Deutschland Gmbh | Apparatus and method for automatic extraction of important events in audio signals |
| US7962330B2 (en) * | 2003-11-12 | 2011-06-14 | Sony Deutschland Gmbh | Apparatus and method for automatic dissection of segmented audio signals |
| US20050160449A1 (en) * | 2003-11-12 | 2005-07-21 | Silke Goronzy | Apparatus and method for automatic dissection of segmented audio signals |
| US20050102135A1 (en) * | 2003-11-12 | 2005-05-12 | Silke Goronzy | Apparatus and method for automatic extraction of important events in audio signals |
| US20060155706A1 (en) * | 2005-01-12 | 2006-07-13 | Kalinichenko Boris O | Context-adaptive content distribution to handheld devices |
| US7606799B2 (en) * | 2005-01-12 | 2009-10-20 | Fmr Llc | Context-adaptive content distribution to handheld devices |
| US9905223B2 (en) | 2005-08-27 | 2018-02-27 | Nuance Communications, Inc. | System and method for using semantic and syntactic graphs for utterance classification |
| US8700404B1 (en) * | 2005-08-27 | 2014-04-15 | At&T Intellectual Property Ii, L.P. | System and method for using semantic and syntactic graphs for utterance classification |
| US9218810B2 (en) | 2005-08-27 | 2015-12-22 | At&T Intellectual Property Ii, L.P. | System and method for using semantic and syntactic graphs for utterance classification |
| US20070083367A1 (en) * | 2005-10-11 | 2007-04-12 | Motorola, Inc. | Method and system for bandwidth efficient and enhanced concatenative synthesis based communication |
| US20070156392A1 (en) * | 2005-12-30 | 2007-07-05 | International Business Machines Corporation | Method and system for automatically building natural language understanding models |
| US7835911B2 (en) * | 2005-12-30 | 2010-11-16 | Nuance Communications, Inc. | Method and system for automatically building natural language understanding models |
| US20070185702A1 (en) * | 2006-02-09 | 2007-08-09 | John Harney | Language independent parsing in natural language systems |
| US8229733B2 (en) | 2006-02-09 | 2012-07-24 | John Harney | Method and apparatus for linguistic independent parsing in a natural language systems |
| US8121838B2 (en) | 2006-04-11 | 2012-02-21 | Nuance Communications, Inc. | Method and system for automatic transcription prioritization |
| US20070239445A1 (en) * | 2006-04-11 | 2007-10-11 | International Business Machines Corporation | Method and system for automatic transcription prioritization |
| US8407050B2 (en) | 2006-04-11 | 2013-03-26 | Nuance Communications, Inc. | Method and system for automatic transcription prioritization |
| US8682654B2 (en) * | 2006-04-25 | 2014-03-25 | Cyberlink Corp. | Systems and methods for classifying sports video |
| US20070250777A1 (en) * | 2006-04-25 | 2007-10-25 | Cyberlink Corp. | Systems and methods for classifying sports video |
| US8005676B2 (en) * | 2006-09-29 | 2011-08-23 | Verint Americas, Inc. | Speech analysis using statistical learning |
| US20130054612A1 (en) * | 2006-10-10 | 2013-02-28 | Abbyy Software Ltd. | Universal Document Similarity |
| US9892111B2 (en) * | 2006-10-10 | 2018-02-13 | Abbyy Production Llc | Method and device to estimate similarity between documents having multiple segments |
| US9235573B2 (en) * | 2006-10-10 | 2016-01-12 | Abbyy Infopoisk Llc | Universal difference measure |
| US9817818B2 (en) | 2006-10-10 | 2017-11-14 | Abbyy Production Llc | Method and system for translating sentence between languages based on semantic structure of the sentence |
| US9633005B2 (en) | 2006-10-10 | 2017-04-25 | Abbyy Infopoisk Llc | Exhaustive automatic processing of textual information |
| US9495358B2 (en) | 2006-10-10 | 2016-11-15 | Abbyy Infopoisk Llc | Cross-language text clustering |
| US20140129212A1 (en) * | 2006-10-10 | 2014-05-08 | Abbyy Infopoisk Llc | Universal Difference Measure |
| US20080154897A1 (en) * | 2006-11-20 | 2008-06-26 | Siemens Medical Solution Usa, Inc. | Automated Interpretation and Replacement of Date References in Unstructured Text |
| US20090063150A1 (en) * | 2007-08-27 | 2009-03-05 | International Business Machines Corporation | Method for automatically identifying sentence boundaries in noisy conversational data |
| US8364485B2 (en) * | 2007-08-27 | 2013-01-29 | International Business Machines Corporation | Method for automatically identifying sentence boundaries in noisy conversational data |
| US7912714B2 (en) | 2007-10-31 | 2011-03-22 | Nuance Communications, Inc. | Method for segmenting communication transcripts using unsupervised and semi-supervised techniques |
| US20090112571A1 (en) * | 2007-10-31 | 2009-04-30 | International Business Machines Corporation | Method for segmenting communication transcripts using unsupervised and semi-supervised techniques |
| US20090112588A1 (en) * | 2007-10-31 | 2009-04-30 | International Business Machines Corporation | Method for segmenting communication transcripts using unsupervsed and semi-supervised techniques |
| US8275607B2 (en) | 2007-12-12 | 2012-09-25 | Microsoft Corporation | Semi-supervised part-of-speech tagging |
| US20090157384A1 (en) * | 2007-12-12 | 2009-06-18 | Microsoft Corporation | Semi-supervised part-of-speech tagging |
| US9489371B2 (en) * | 2008-11-10 | 2016-11-08 | Apple Inc. | Detection of data in a sequence of characters |
| US20140025370A1 (en) * | 2008-11-10 | 2014-01-23 | Apple Inc. | Data detection |
| US20110246183A1 (en) * | 2008-12-15 | 2011-10-06 | Kentaro Nagatomo | Topic transition analysis system, method, and program |
| US8670978B2 (en) * | 2008-12-15 | 2014-03-11 | Nec Corporation | Topic transition analysis system, method, and program |
| US8300776B2 (en) | 2009-07-15 | 2012-10-30 | Google Inc. | Highlighting of voice message transcripts |
| US8588378B2 (en) | 2009-07-15 | 2013-11-19 | Google Inc. | Highlighting of voice message transcripts |
| US20110013756A1 (en) * | 2009-07-15 | 2011-01-20 | Google Inc. | Highlighting of Voice Message Transcripts |
| US20110093263A1 (en) * | 2009-10-20 | 2011-04-21 | Mowzoon Shahin M | Automated Video Captioning |
| US8700394B2 (en) | 2010-03-24 | 2014-04-15 | Microsoft Corporation | Acoustic model adaptation using splines |
| US20110238416A1 (en) * | 2010-03-24 | 2011-09-29 | Microsoft Corporation | Acoustic Model Adaptation Using Splines |
| US9177080B2 (en) | 2010-07-09 | 2015-11-03 | Comcast Cable Communications, Llc | Automatic segmentation of video |
| US20120011109A1 (en) * | 2010-07-09 | 2012-01-12 | Comcast Cable Communications, Llc | Automatic Segmentation of Video |
| US8423555B2 (en) * | 2010-07-09 | 2013-04-16 | Comcast Cable Communications, Llc | Automatic segmentation of video |
| US8417223B1 (en) | 2010-08-24 | 2013-04-09 | Google Inc. | Advanced voicemail features without carrier voicemail support |
| US8498625B2 (en) | 2010-08-24 | 2013-07-30 | Google Inc. | Advanced voicemail features without carrier voicemail support |
| US20120209605A1 (en) * | 2011-02-14 | 2012-08-16 | Nice Systems Ltd. | Method and apparatus for data exploration of interactions |
| US20120209606A1 (en) * | 2011-02-14 | 2012-08-16 | Nice Systems Ltd. | Method and apparatus for information extraction from interactions |
| US8892447B1 (en) * | 2011-10-25 | 2014-11-18 | Nuance Communications, Inc. | Quality assessment of text derived from an audio signal |
| US20140101171A1 (en) * | 2012-10-10 | 2014-04-10 | Abbyy Infopoisk Llc | Similar Document Search |
| US9189482B2 (en) * | 2012-10-10 | 2015-11-17 | Abbyy Infopoisk Llc | Similar document search |
| US9734820B2 (en) | 2013-11-14 | 2017-08-15 | Nuance Communications, Inc. | System and method for translating real-time speech using segmentation based on conjunction locations |
| US9740682B2 (en) | 2013-12-19 | 2017-08-22 | Abbyy Infopoisk Llc | Semantic disambiguation using a statistical analysis |
| US9626353B2 (en) | 2014-01-15 | 2017-04-18 | Abbyy Infopoisk Llc | Arc filtering in a syntactic graph |
| US10446135B2 (en) * | 2014-07-09 | 2019-10-15 | Genesys Telecommunications Laboratories, Inc. | System and method for semantically exploring concepts |
| US9626358B2 (en) | 2014-11-26 | 2017-04-18 | Abbyy Infopoisk Llc | Creating ontologies by analyzing natural language texts |
| US11138978B2 (en) | 2019-07-24 | 2021-10-05 | International Business Machines Corporation | Topic mining based on interactionally defined activity sequences |
| US11475668B2 (en) | 2020-10-09 | 2022-10-18 | Bank Of America Corporation | System and method for automatic video categorization |
Also Published As
| Publication number | Publication date |
|---|---|
| US20030187642A1 (en) | 2003-10-02 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US6928407B2 (en) | | System and method for the automatic discovery of salient segments in speech transcripts |
| Chelba et al. | | Retrieval and browsing of spoken content |
| US8775174B2 (en) | | Method for indexing multimedia information |
| US9542393B2 (en) | | Method and system for indexing and searching timed media information based upon relevance intervals |
| KR100388344B1 (en) | | Method and apparatus for retrieving audio information using content and speaker information |
| US8572088B2 (en) | | Automated rich presentation of a semantic topic |
| Allan et al. | | Topic detection and tracking pilot study final report |
| Purver | | Topic segmentation |
| Inkpen et al. | | Semantic similarity for detecting recognition errors in automatic speech transcripts |
| Soares et al. | | Automatic topic segmentation for video lectures using low and high-level audio features |
| Lin et al. | | Enhanced BERT-based ranking models for spoken document retrieval |
| Viswanathan et al. | | Retrieval from spoken documents using content and speaker information |
| Mamou et al. | | Combination of multiple speech transcription methods for vocabulary independent search |
| Ma et al. | | A detection-based approach to broadcast news video story segmentation |
| Guinaudeau et al. | | Accounting for prosodic information to improve ASR-based topic tracking for TV broadcast news |
| Zhu et al. | | Video browsing and retrieval based on multimodal integration |
| Ponceleon et al. | | Automatic discovery of salient segments in imperfect speech transcripts |
| Rigoll | | The ALERT system: Advanced broadcast speech recognition technology for selective dissemination of multimedia information |
| Wang et al. | | The SoVideo Mandarin Chinese broadcast news retrieval system |
| Chen et al. | | Automatic Identification of Textual Topic Structure-Topic Segmentation |
| Wu et al. | | A new passage ranking algorithm for video question answering |
| Jacobs et al. | | Automatic, context-of-capture-based categorization, structure detection and segmentation of news telecasts |
| Chaudhuri | | Structured Models for Audio Content Analysis |
| Chua et al. | | TREC 2003 video retrieval and story segmentation task at NUS PRIS |
| Zidouni et al. | | Semantic annotation of transcribed audio broadcast news using contextual features in graphical discriminative models |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PONCELEON, DULCE BEATRIZ;SRINIVASAN, SAVITHA;REEL/FRAME:012808/0134; Effective date: 20020327 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FPAY | Fee payment | Year of fee payment: 4 |
| | AS | Assignment | Owner name: NUANCE COMMUNICATIONS, INC., MASSACHUSETTS; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL BUSINESS MACHINES CORPORATION;REEL/FRAME:022354/0566; Effective date: 20081231 |
| | FPAY | Fee payment | Year of fee payment: 8 |
| | FPAY | Fee payment | Year of fee payment: 12 |
| | AS | Assignment | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NUANCE COMMUNICATIONS, INC.;REEL/FRAME:065552/0934; Effective date: 20230920 |