LaBB-CAT Code

A linguistic annotation store

Brought to you by: robertfromont

================================================================================
Version 20200615:
================================================================================

* Add API support for SDKs:
  - R nzilbb.labbcat package v. 0.5-1 (https://github.com/nzilbb/labbcat-R)
  - Python labbcat library v. 0.2.2 (https://github.com/nzilbb/labbcat-py)
  - JavaScript @nzilbb/labbcat package v. 1.1.2 (https://github.com/nzilbb/labbcat-js)
  - Java nzilbb.java package v. 20200608.1944 (https://github.com/nzilbb/labbcat-java)
* Add support for extraction of annotations by time only (i.e. ignoring participant)
* Filtering by participant attribute value from participant page
* EMU-webapp version 1.3.1
* Rename 'who' layer to 'participant' for clarity and consistency.
* Improvements to online help
* UX improvements, in particular the search matrix
* Performance improvements
* Bug fixes

================================================================================
Version 20200110:
================================================================================

* Add support for searching from R
  - https://cran.r-project.org/web/packages/nzilbb.labbcat/
* Change layer 'short description' to be called 'name' instead
* Add support for elicitation app reminder schedule configuration
* Migration to nzilbb.ag API export format modules, expansion of formats available
* Improvements to online help
* UX improvements, in particular the search matrix
* Performance improvements
* Bug fixes

================================================================================
Version 20191031:
================================================================================

* Selectable gender attribute for Praat processing
* Support for batch export to TextGrids, etc. on transcripts page
* Elicitation tasks: more configurable reminder schedule (for mobile app)
* Simplify user authentication configuration so that changes to server.xml
  are no longer required
* Database connection configuration now in META-INF/context.xml
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes

================================================================================
Version 20190724:
================================================================================

* Annotation and alignment correction with EMU-webApp
* Time-interval annotation from CSV file upload
* Personality Insights Layer Manager:
   * update to use API version 2017-10-13 and support API keys
   * group participants for aggregate analysis
   * configurable sampling order, limit to 250Kb (imposed by web service)
   * option for annotating selected corpora only
* Frequency Layer Manager: frequency lists available as registered dictionaries
* Topic Modeller: included in default distribution
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes
* Now requires at least Java 8

================================================================================
Version 20190425:
================================================================================

* Add support for nzilbb.labbcat R package:
  https://cran.r-project.org/web/packages/nzilbb.labbcat/index.html
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes

================================================================================
Version 20181217:
================================================================================

* Support for Unicode IPA dictionary export from search results
* Dictionaries page for selected-entry export
* Make participant attributes accessible to script on Process with Praat page
* Topic Modelling layer manager
* Elicitation tasks support for episode layers
* Improve search speed
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes

================================================================================
Version 20180809:
================================================================================

* Get installer working with latest versions of MySQL and Tomcat

================================================================================
Version 20180626:
================================================================================

* Configurable dashboard links and statistics
* Elicitation tasks support for:
   * copying tasks
   * two tags attributes instead of just one
   * specifying 'other' as an option for select attributes to allow participant
     to enter their own value
* Bulk export of original transcript files
* Batch TextGrid uploader is purely browser-based and no longer requires Java
* Add support for searching for the *absence* of an annotation, by searching for
   "NOT .+"
* CSV-based tagging of segments as well as words
* LabelMapper: for shortest edit path between representations of segments.
* LIWC:
   * hierarchical wordlist support
   * support for reference corpora
* Functionality to add single entries on dictionary page
* Make CELEX Cobuild frequency dictionaries available for keyword analysis by
   FrequencyLayerManager.
* Ensure Unisyn dictionaries are editable, and use utf8mb4 encoding for lexicon.
* Ensure unicode characters beyond the Basic Multilingual Plane (e.g. emoji) are
   supported correctly.
* Orthography conversion to handle smart quotes and m-dash
* Support custom Praat scripts for 'process with praat' page
* Improve Praat browser integration:
   * no longer requires entering the username/password again
   * native host doesn't need Java on OS X
* Improvements to online help (special thanks to proofreading by Andy Gibson)
* UX improvements
* Performance improvements
* Bug fixes

================================================================================
Version 20171220:
================================================================================

* Elicitation tasks support for:
  * digit span task
  * ability to delay next button
  * email and URL attribute elicitation
  * specifying a large batch of steps via CSV file
* Statistics layer manager:
  * add support for targeting freeform layers for summary
  * standard deviation calculation for labels and durations
  * save transcript/participant measures in attributes
* Frequency layer manager: support for computing 'keyness'
* Javascript and Python layer managers: support for saving/loading local script
* Partition manager: Add support for partitioning by time as well as token count
* Reaper layer manager: load F0 estimations into annotation store.
* LIWC (Linguistic Inquiry and Word Count) layer manager
* Filtering speakers to search by name list file
* More finely-grained filtering on transcripts and participant pages
* Export Media on participant export page
* Export Media or Attributes (CSV) on transcripts page
* Upload transcript data via CSV file
* Support for custom Praat scripts on process with praat page
* Media/Transcript censorship based on selected layer and pattern
* Support for skinning
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes

================================================================================
Version 20170613:
================================================================================

* Elicitation tasks support for:
  * pre-registered participants
  * suppressing the "Next" button during recording
  * transcript attribute entry in elicitation tasks
  * attribute entry throughout task, instead of only at the beginning
  * scriptable attribute entry validation
  * conditional step execution, based on entered attribute values
* Partition layer manager: by-speaker partitioning
* Frequency layer manager:
  * if filter layer is a meta layer, filter by speaker
  * allow filter layer to exclude, as well as include, tokens
  * support generating frequency list but not annotating tokens
* Reaper layer manager: don't run reaper on files that it's already been run on
* Add handling for known IPA diacritics to DISC/HTK conversion, so they're not
  taken as standalone phonemes during forced-alignment
* Add support for converting MP4 to WAV on demand
* Orthography transformations:
  * change 'smart' apostrophes to plain ones
  * strip all non alphanumerics except ~ - :
* Batch uploader is purely browser-based and no longer requires Java
* Transcript upload now uses nzilbb.ag API
* Add support for not keeping uploaded transcript file (to save disk space)
* Re-implement layered search for speed improvements in very large corpora
* Add by-corpus statistics page
* Support for MySQL 5.7 and Tomcat 8
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes
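
The two orthography transformations listed above can be illustrated with a
minimal sketch. This is not LaBB-CAT's implementation: the function names are
hypothetical, and treating the two transformations as independent options is
an assumption.

```python
def plain_apostrophes(text: str) -> str:
    """Change 'smart' (curly) apostrophes to plain ones."""
    return text.replace("\u2018", "'").replace("\u2019", "'")


def strip_non_alphanumerics(text: str) -> str:
    """Strip every character that is not alphanumeric, except ~ - :"""
    # str.isalnum() is Unicode-aware, so accented letters are kept
    # (an assumption about what 'alphanumeric' means here).
    return "".join(c for c in text if c.isalnum() or c in "~-:")
```

For example, `strip_non_alphanumerics("well~ - it's: fine!")` keeps only the
letters and the three exempt characters.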

================================================================================
Version 20161003:
================================================================================

* BAS web services layer manager
* Reaper layer manager
* Javascript layer manager: support label editing by language
* FlatFileDictionary: Add support for stripping out syllable/stress marking
* PartitionLayerManager: allow middle-n annotation for graph and freeform layers
* Elicitation tasks: Add support for an arbitrary-depth hierarchy of steps
* Plain text import: allow non-time-based transcripts to have participants (e.g. forum posts)
* Automatic conversion from WMV to MP4 when ffmpeg is installed
* Support UTF-8 encoded TextGrids
* Ensure all pages are encoded UTF-8
* Security: allow marking of users to reset their password at next login
* Deploy layer configuration without java applets for Stanford Parser layer manager
* Replace tree editing applet with Javascript tree visualisation (read-only for now)
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes

================================================================================
Version 20160707:
================================================================================

* Make orthography layer the default for searching
* Add Python layer manager that allows flexible automated annotation by specifying 
   scripts (separate download)
* CMU Dictionary layer manager: short-hesitation pronunciation tagging
* Context layer manager: sum since last mention
* Add centre of gravity option for batch processing with Praat
* Uploading multiple media files with the same suffix
* Remove media download links for view-only users
* Elicitation tasks support for:
  * video stimulus
  * randomised step presentation
  * downsample to 16kHz
  * rich text prompts
  * annotation of step transcripts
* Support by-series selection for batch export
* Standalone installer works:
  * for most recent version of Java
  * on OS X as well as Windows
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes

================================================================================
Version 20160329:
================================================================================

* Split CSV match text from its prior and following context
* Ensure 'all utterances' excludes utterances with no words
* Spanish Phonology layer manager
* Implement layer manager for the IBM Watson Personality Insights service
* Deploy layer configuration without java applets for (almost) all layer managers
* Implement series organiser for moving transcripts between corpora and series
* Re-implement Praat integration for Mozilla Firefox, by using browser extension
* Statistics layer manager - support for:
  * pause threshold
  * hapax legomena count
  * computation of vocabulary growth rate using hapax legomena
* Unisyn layer manager: Implement syllable recovery
* Frequency layer manager - support for:
  * meta layers
  * segment layers
  * filtering by another layer
* HTK layer manager: add support for saving normalised acoustic scores for phone alignments
* Partition Manager - support for:
  * 'graph' and 'who' bounded partitioning
  * transcripts to be excluded by type
  * copying word layer labels for first-n partitions
* Improve support for importing TEI files
* ELAN: Allow transcript convention annotation to be disabled
* Improve elicitation task participant form layout
* Implement application/installer for a standalone local installation
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes
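
As an illustration of the hapax legomena statistics mentioned above, here is a
minimal sketch. It is not the Statistics layer manager's code: the function
name is hypothetical, and estimating vocabulary growth rate as the number of
hapax legomena divided by the number of tokens is an assumption about the
formula used.

```python
from collections import Counter


def hapax_statistics(tokens):
    """Count hapax legomena and estimate vocabulary growth rate."""
    # Case-insensitive counting is an assumption.
    counts = Counter(t.lower() for t in tokens)
    n_tokens = sum(counts.values())
    # Hapax legomena are types that occur exactly once.
    hapaxes = sum(1 for c in counts.values() if c == 1)
    # Assumed estimate: growth rate = hapax count / token count.
    growth_rate = hapaxes / n_tokens if n_tokens else 0.0
    return {"tokens": n_tokens, "hapax_legomena": hapaxes,
            "growth_rate": growth_rate}
```

Calling `hapax_statistics("the cat sat on the mat".split())` counts six tokens,
of which four types ("cat", "sat", "on", "mat") occur exactly once.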

================================================================================
Version 20151207:
================================================================================

* Support in-situ transcript editing, and add a warning when downloading the 
   original file, about the possibility that LaBB-CAT's version of the transcript
   may have diverged from the original file
* Ensure that generated Transcriber, ELAN, and TextGrid files can be re-imported
   without loss of auxiliary annotations (lexical, pronouns, noise, and comment),
   so that originally-uploaded file is no longer required.
* Allow multiple participant records to be merged into a single record
* Support export of media files with converted transcripts
* Support download of binary annotations
* Add option to restrict results to only one per transcript
* Flat File Dictionary layer manager: export option
* Context layer manager: add support for pause detection configurably within the 
   same turn or between turns
* CELEX English layer manager: use syllable/stress-marked dictionary entries in 
   suggestions
* Statistics layer manager: allow annotation based on Freeform annotation bounds
* Statistics layer manager: add support for type/token ratio, and to target other 
   Meta layers
* HTK layer manager: allow selection of only one channel for training/alignment
* Add MATLAB layer manager that allows MATLAB scripts to be applied to utterance
   audio, and the results saved as annotations
* Add Partition layer manager that breaks transcripts into chunks of configurable size
* Elicit speech: add support for background upload
* Elicit speech: support for display/signing consent, and downloading as PDF
* Elicit speech: support interstep pause/countdown
* Elicit speech: display participant ID at the end of the task
* Elicit speech: internationalization
* Formal media track definitions
* Re-implement Praat integration for Google Chrome, by using browser extension
* Include OS X sendpraat library, so that sendpraat no longer needs to be 
   installed separately on Macs
* Support deletion of annotations (as well as addition) for CSV-based annotation
* Improve support for ingesting TEI texts
* Change licence from GPL to AGPL
* Make javascript references compatible with LibreJS browser plugin
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes

================================================================================
Version 20150906:
================================================================================

* Add Javascript layer manager that allows flexible automated annotation by specifying scripts
* Add finer-grained access control mechanism, allowing users to be granted access on the basis
   of selected transcript attribute values
* Add support for defining elicitation tasks and steps, and for speech elicitation, recording, 
   and upload directly from the browser (also support for forthcoming elicitation mobile apps)
* Implement TEI P5 import
* Implemented export converter for TeX files
* CELEX English layer manager: Split possible-linking-r ~R pronunciations into two pronunciations,
   without-r ~ and with-r ~r, so that vowel-final searches include possible-linking-r results
   without having to think about R, and no special handling is required for HTK forced alignment
   or syllable recovery
* Plain text upload determines sound file length in order to set anchor offset range even when
   the text doesn't include timestamps
* Ensure speakerless plain text transcripts are given a speaker named after the transcript series
* Allow empty transcripts
* Ensure Flat File dictionary lookups are case/accent sensitive
* Add ability to limit previous/next word data to the utterance (as well as the turn) in 
   'insert data' page
* Add images as a possible media type for associating to transcripts
* Improve Praat integration
* Improvements to online help
* UX improvements
* Performance improvements
* Bug fixes

================================================================================
Version 20150120:
================================================================================

* Move read-only upload operations to a read-only-user accessible place
* Implement generic "flat file" dictionary layer manager, for handling arbitrary
    dictionaries
* Use centred KWIC display for results
* Highlight clicked match with green in transcript, to differentiate from other
    matches (yellow) in the same transcript
* Allow plain-text export of speech-only (i.e. no speaker names, etc.), for
    easy import into corpus linguistics tools like AntConc
* Add support for inserting data about neighboring segment annotations into
    CSV results files
* When recovering syllables, ignore possible-linking-R
* Improve pronunciation suggestions by including syllable boundary
* Improve syllable/stress recovery by ordering monosyllabics with their
    stress-marked version first
* Improve HTK forced-alignment user interface
* Remove 8000 and 16000 p2fa models (which are not used by the HTK layer manager)
    to reduce installer size
* Improve HTK error handling
* Enable conversion by HTK layer manager from P2FA ARPAbet encoding to DISC,
    depending on layer type
* Improve support for very large CSV results server-side processing with Praat
* Allow statistics layer manager to filter target annotations by regular
    expression
* Improve transcript data export
* Language filtering for PatternMatcher
* Ensure CMUDict pronunciation suggestions assume ARPAbet instead of DISC
* Add start/end times to extractIntervals results
* Plain Text transcript upload and handling of non-time-synchronized transcripts
* Improve TEI conversion (inspired by Schmidt (2011))
* Use HTML5 for media playback (or Windows Media Player for Windows XP /
   Internet Explorer users)
* Use JSON (instead of XHTML) and JQuery for LaBB-CAT client communication,
   results page, and task tracking
* Only use Java applet on transcript page for Praat integration (no Java is
   necessary on this page if no Praat integration is required)
* Remove requirement for "Install components for sound to work"
* Integrate with ffmpeg for WAV to MP3 conversion (for Internet Explorer users)
* Improve installation procedure
* Replace Oster Miller CSV utils with Apache Commons CSV API
* Re-implement extractIntervals so it's task-threaded (better for very large
    CSV files)
* Fix security problem affecting LDAP users
* Include sendpraat library for Linux/64bit systems
* Use all-permissions for applets (for Linux / Iced Tea support)
* Bug fixes
* Updates to online help, in particular for setting up Praat integration on OS X
* Reduce size of install set
* Add CHANGELOG.txt

================================================================================
Version 20140328:
================================================================================

* Selectable sorting and filter by transcript count
* Improvements to handling of Buckeye/Switchboard files
* Resolve issues with Java applet security
* Improve sendpraat installation for OS X
* Bug fixes

================================================================================
Version 20140313:
================================================================================

* Update syllable count to use syllable boundary markers in CELEX PhonStrsDISC,
    so that custom words can be captured
* Lookup raw entries for dictionary editing, so that syllable/stress markers are
    included in custom entries by default
* TranscriberAG (TAG) handling
* Switchboard file handling
* Improve DISC Helper by adding classes VOWEL and CONSONANT, and making it a
    popup panel
* Improve support for very large CSV results download
* Improve guessing of default TextGrid structure and layer mappings
* Allow word alignments in TextGrids to be ignored (so that annotated TextGrids
    can be uploaded even if the word alignments are outdated)
* Implement inclusion of noise annotations in HTK forced-alignment
* Add P2FA models to HTK Layer Manager (so that they can be used instead of
    LaBB-CAT training its own models)
* Standardise annotation-by-convention handling for EAF/TextGrid files
* Stream and paginate search results for less waiting around on big searches
* Remember csv-exported layers from one results set to the next
* Add support for CSV filter file for word-pair frequency extraction in Frequency
    layer manager
* Make noise and comment layers easily exportable
* Add previous starting/ending meta-annotation function for CSV search results
    extraction
* Add previous word function to insert data page
* Add layer-manager filter to layers pages
* Implement n-gram frequency annotation to Frequency layer manager
* Ensure CELEX dictionary editors can edit syllable/stress-marked representations,
    even for layers whose configurations strip them out (so that custom entries
    can include them)
* Use .TextGrid instead of .textgrid for saved files
* Support comma, tab, and semicolon as selectable CSV file delimiters, and
    auto-detect delimiter used in uploaded CSV files
* Break down transcripts by utterance instead of turn when exporting to PDF/text,
    to ensure overlapping speech is more tightly synchronized, by line start time
    instead of turn start time
* Resolve issues with Java applet security
* Improve transcript change merging during upload
* Jython integration
* Add support for browser-based javascript scripting
* Add support for scripting searches
* Correct unlinked annotations
* Improved AGTK integration
* User interface improvements
* Bug fixes
* Update online help
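
The CSV delimiter auto-detection mentioned above could work along these lines.
This is only a sketch of one simple heuristic (counting candidate delimiters in
the header line), not LaBB-CAT's actual detection logic; the names are
hypothetical.

```python
CANDIDATE_DELIMITERS = (",", "\t", ";")


def detect_delimiter(csv_text: str) -> str:
    """Guess the delimiter by counting candidates in the header line."""
    lines = csv_text.splitlines()
    header = lines[0] if lines else ""
    # max() returns the first candidate on ties, so comma is the fallback.
    return max(CANDIDATE_DELIMITERS, key=header.count)
```

A header line such as `name;age` yields `;`, and an empty upload falls back to
the comma.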

================================================================================
Version 20130611:
================================================================================

* Implement re-load from original file
* Filter out blank-headered columns in participant CSV files
* Add support for upgrading directly from the zip file downloaded from the
    downloads site
* User interface improvements
* Bug fixes
* Update online help

================================================================================
Version 20130517:
================================================================================

* Improve support for Buckeye corpus uploading
* Make media upload optional for batch uploader
* Unisyn layer manager
* Enable re-upload of transcripts of a different type to original transcript
* Improve transcript change merging
* Improvements to HTK layer manager
* Add support for annotating spanning layers with type/token annotations
* Ensure pre-selected layers are used on search form
* Implement noise/comment layer search
* Extend Context layer manager to support token-ratio-in-context annotations
* Set sole speaker as main participant instead of asking
* Implement basic plain-text uploading
* Add function for querying list of frequencies by CSV file
* Add support for renaming generic speakers like 'Interviewer' to something more
   specific, by suffixing them with the series name
* Ignore no-speaker turns/lines
* Trim whitespace from speaker names on turn/utterance labels
* PDF transcript export
* Improve TextGrid converter
* SubRip converter for subtitles
* Exmaralda converter (prototype)
* Emu converter (prototype)
* SALT transcript converter (prototype)
* Make line start/end and turn end border conditions faster for searching
* Kirshenbaum/DISC conversion
* Move default language setting from system attributes to corpus
* Add non-word 'words' after events, to ensure that event ordering is preserved
   after conversion to annotation graph, if there's only punctuation between
   non-speech annotation
* Standardise layer selection UI
* Update mysql connector version
* Prevent applet loading from triggering a username/password request
* Keep search results in the database instead of memory, to prevent big searches
   using all available memory
* Upgrade/Install from same file
* User interface improvements
* Bug fixes
* Update online help

================================================================================
Version 20120816:
================================================================================