
Commit [r1113]

* fixed ddc_split in presence of legacy anonymized tokens (those beginning with -#)

- workaround (hack): all types with 0 occurrences in source corpus are assumed to be anonymized,
and are added to the output vocabulary of *every* partition -- this is a bit bloated but should work
* added ddc_index option --anonymize (-a): enable legacy anonymized tokens (disabled by default)
- fixes mantis #31586
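
The workaround above can be illustrated with a rough sketch. This is not the code from ddc_split.cpp: the `Vocabulary` alias, the function name `buildPartitionVocabularies`, and the in-memory maps are hypothetical and only show the idea that every type with a source-corpus count of 0 is copied into the output vocabulary of each partition.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a partition's output vocabulary (not a ddc type).
using Vocabulary = std::set<std::string>;

// sourceCounts: type -> number of occurrences in the source corpus.
// observedPerPartition: types actually seen in each partition.
// Returns one vocabulary per partition, each extended by every
// zero-occurrence type (assumed to be a legacy anonymized token).
std::vector<Vocabulary> buildPartitionVocabularies(
    const std::map<std::string, std::size_t>& sourceCounts,
    const std::vector<Vocabulary>& observedPerPartition)
{
    // Collect all types that never occur in the source corpus.
    Vocabulary assumedAnonymized;
    for (const auto& entry : sourceCounts) {
        if (entry.second == 0) {
            assumedAnonymized.insert(entry.first);
        }
    }

    // Add them to *every* partition; bloated, but no partition can miss one.
    std::vector<Vocabulary> result = observedPerPartition;
    for (auto& vocab : result) {
        vocab.insert(assumedAnonymized.begin(), assumedAnonymized.end());
    }
    return result;
}
```

The cost of this approach is that each partition's vocabulary grows by the full set of zero-occurrence types, whether or not that partition contains them, which matches the "a bit bloated" caveat in the log entry.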

mukau 2018-10-23

changed /ddc/trunk/Changes
changed /ddc/trunk/configure.ac
changed /ddc/trunk/doc/ddc_cfg.5
changed /ddc/trunk/doc/ddc_files.5
changed /ddc/trunk/doc/ddc_opt.5
changed /ddc/trunk/doc/ddc_proto.5
changed /ddc/trunk/doc/ddc_query.5
changed /ddc/trunk/doc/ddc_server.opt.5
changed /ddc/trunk/doc/ddc_tabs.5
changed /ddc/trunk/src/ConcordLib/IndexSet.cpp
changed /ddc/trunk/src/ConcordLib/IndexSetForLoadingStage.cpp
changed /ddc/trunk/src/ConcordLib/IndexSetForLoadingStage.h
changed /ddc/trunk/src/common/ddcConfigNoAuto.h
changed /ddc/trunk/src/ddc_index/ddc_index.cpp
changed /ddc/trunk/src/ddc_split/ddc_split.cpp