/ddc/trunk Commit Log


Commit Date  
[r1113] by mukau

* fixed ddc_split in presence of legacy anonymized tokens (those beginning with -#)
- workaround (hack): all types with 0 occurrences in source corpus are assumed to be anonymized,
and are added to the output vocabulary of *every* partition -- this is a bit bloated but should work
* added ddc_index option --anonymize (-a): enable legacy anonymized tokens (disabled by default)
- fixes mantis #31586
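
The workaround described in this entry can be illustrated with a minimal sketch. The type names `TypeCounts` and `Vocabulary` and the function `AddZeroCountTypes` are hypothetical; the log does not show the actual ddc_split data structures, so this only mirrors the stated rule (types with 0 occurrences go into every partition's vocabulary).

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical types: the real ddc_split structures are not shown in the log.
typedef std::map<std::string, std::size_t> TypeCounts;  // type -> occurrences in source corpus
typedef std::set<std::string> Vocabulary;               // output vocabulary of one partition

// Add every type with 0 occurrences to the vocabulary of *every* partition.
// Types that do occur in the corpus are left to the normal partitioning logic.
void AddZeroCountTypes(const TypeCounts& counts, std::vector<Vocabulary>& partitions) {
    for (TypeCounts::const_iterator it = counts.begin(); it != counts.end(); ++it) {
        if (it->second != 0)
            continue;  // occurs in the corpus: not treated as anonymized
        for (std::size_t p = 0; p < partitions.size(); ++p)
            partitions[p].insert(it->first);
    }
}
```

The cost is the bloat the entry mentions: each zero-count type is stored once per partition rather than once overall.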

2018-10-23 13:32:29
[r1112] by mukau

* trim *._con_prefix

2018-10-23 10:34:03
[r1109] by mukau

* added AC_DEFINE()s for __STDC_LIMIT_MACROS, __STDC_CONSTANT_MACROS
- workaround for old C++ compilers (e.g. debian wheezy)
- fixes compile error for ddcTime.cpp: 'SIZE_MAX' was not declared in this scope
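
For context, a sketch of what these defines do at the source level. Before C++11, C library headers exposed the `<stdint.h>` limit and constant macros to C++ code only if `__STDC_LIMIT_MACROS` and `__STDC_CONSTANT_MACROS` were defined before the header was first included; `AC_DEFINE()` places the equivalent definitions in the generated config header. The helper `FitsInSizeT` is hypothetical and is not taken from ddcTime.cpp.

```cpp
// Must precede the first inclusion of <stdint.h>; older toolchains otherwise
// hide SIZE_MAX, UINT32_MAX, UINT64_C() and friends from C++ translation units.
#ifndef __STDC_LIMIT_MACROS
# define __STDC_LIMIT_MACROS 1
#endif
#ifndef __STDC_CONSTANT_MACROS
# define __STDC_CONSTANT_MACROS 1
#endif
#include <stdint.h>
#include <stddef.h>

// Hypothetical helper: true if a 64-bit byte count is representable as size_t.
inline bool FitsInSizeT(uint64_t nbytes) {
    return nbytes <= static_cast<uint64_t>(SIZE_MAX);
}
```

On C++11 and later compilers the macros are available unconditionally, so the defines are harmless there.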

2018-10-19 07:25:31
[r1108] by mukau

- mantis #31439: added missing Trim() for m_CommonFilePrefix in CConcIndexator::LoadCorpusFiles()

2018-10-15 07:17:11
[r1107] by mukau

+ updated version

2018-10-12 12:16:18
[r1106] by mukau

+ GetHitStrings fix

2018-10-12 12:14:01
[r1103] by mukau

* v2.1.17 release

2018-09-26 10:09:24
[r1100] by mukau

* improved error messages in CConcSession::GetTokensFromStorageByBreak(), CConcSession::GetFileSnippets()
- was: concord_daemon_log("Error! Cannot read hit no %i \n", BreakNo); return false;
- now: throw CExpc(errReadSourceFile, "CConcSession::... cannot read $%s storage data for break #%i")
* improved error reporting in CDDCBranchServer::GetHitContexts()
- truncated *._storage_Token file was giving unhelpful legacy "\x{01}cannot read a source file" message
- new code trims \001 separator and prepends subcorpus identifier
* removed error return clauses in CConcSession::GetHits(): now just re-throw exceptions (they should now be caught higher up)
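
The change in this entry, from log-and-return-false to throwing, plus the message cleanup, might look roughly like the sketch below. `ReadError` is a hypothetical stand-in for the project's `CExpc` class, whose real constructor is not shown in the log; `ThrowReadError` and `DecorateError` are likewise illustrative names. It is also an assumption that the \001 separator appears at the start of the message, as in the quoted legacy text.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the project's exception class (CExpc in the log).
struct ReadError : std::runtime_error {
    explicit ReadError(const std::string& msg) : std::runtime_error(msg) {}
};

// Throw a descriptive error instead of logging and returning false.
void ThrowReadError(const std::string& caller, const std::string& storage, long breakNo) {
    std::ostringstream msg;
    msg << caller << ": cannot read $" << storage << " storage data for break #" << breakNo;
    throw ReadError(msg.str());
}

// Strip a leading \001 separator (assumed position) and prepend the subcorpus identifier.
std::string DecorateError(const std::string& subcorpus, std::string msg) {
    if (!msg.empty() && msg[0] == '\001')
        msg.erase(0, 1);
    return subcorpus + ": " + msg;
}
```

Because the lower-level functions now throw, intermediate callers need no error-return clauses; a single handler higher up can catch the exception and report it.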

2018-09-25 13:27:54
[r1098] by mukau

* added rml.ini key "Software\Dialing\Lemmatizer\German\ExtEncoding" = "UTF8"
- allow legacy latin-1 indices to be handled gracefully

2018-09-24 12:13:42
[r1097] by mukau

* removed old hard-coded 500MB limit on size of *._TOKATTR (string-buffer) in IndexSetForLoadingStage.cpp
- now use macro MAX_STRINGBUFFER_SIZE = (DWORD_MAX & (~AllFlags)) --> 2GB
- fixed limit check (previously the exception was only triggered on the 1st insertion after overflow)
* debugging for v1.x->v2.x index upgrade
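
A sketch of a limit check that fires on every over-limit insertion, not only the first one after overflow. The value of `kAllFlags` (a single high flag bit) is an assumption chosen to match the stated 2GB result, and `StringBufferSizer` is a hypothetical class, not the code in IndexSetForLoadingStage.cpp.

```cpp
#include <stdint.h>
#include <stdexcept>

// Assumed values: the log gives only the formula and the resulting 2GB limit.
static const uint32_t kDwordMax = 0xFFFFFFFFu;
static const uint32_t kAllFlags = 0x80000000u;  // assumption: one high flag bit
static const uint32_t kMaxStringBufferSize = kDwordMax & ~kAllFlags;  // 0x7FFFFFFF

// Hypothetical size tracker for a string buffer addressed by 32-bit offsets.
class StringBufferSizer {
public:
    explicit StringBufferSizer(uint32_t limit = kMaxStringBufferSize)
        : m_used(0), m_limit(limit) {}

    // Reserve len bytes plus a terminating NUL; returns the offset of the new string.
    // The check runs before the size is updated, in 64-bit arithmetic, so the
    // counter can never wrap and every over-limit request is rejected.
    uint32_t Reserve(uint32_t len) {
        const uint64_t needed = static_cast<uint64_t>(m_used) + len + 1;
        if (needed > m_limit)
            throw std::length_error("string buffer size limit exceeded");
        const uint32_t offset = m_used;
        m_used = static_cast<uint32_t>(needed);
        return offset;
    }

    uint32_t Used() const { return m_used; }

private:
    uint32_t m_used;
    uint32_t m_limit;
};
```

Checking before the update keeps the stored size valid after a failed insertion, so later insertions are judged against the true size.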

2018-09-24 11:04:20