Commit [r1433]

* fixed ddc_index Shlemiel-the-Painter bug caused by flat sorted list<CStringItem> CFreeBiblStringIndex::m_BuildStringItems

- symptoms: throughput rates reported by ddc_index decay (exponentially?) as indexing progresses,
dropping from an initial >130k tok/sec to <4k tok/sec after processing ca. 100M tokens in ca. 300k documents
- re-implemented using map<string,DWORD> CFreeBiblStringIndex::m_BuildStringItemsMap
- new code reports near-constant throughput on 5M-token test corpus
+ (150k +/- 9k) tok/sec if input corpus is in OS buffer cache
+ ( 32k +/- 8k) tok/sec otherwise
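
A minimal sketch of the difference between the two approaches, assuming the index assigns each distinct string a numeric ID at build time. This is not the actual ddc code: StringItem, internSortedList, internMap and nextId are hypothetical names, DWORD is assumed to be a 32-bit unsigned integer, and the "flat sorted list" is modelled here as a std::list that is scanned from the front on every insert.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <list>
#include <map>
#include <string>

// Assumption: DWORD is a 32-bit unsigned integer, as on Windows.
using DWORD = std::uint32_t;

// Hypothetical stand-in for CStringItem: a string plus its assigned ID.
struct StringItem {
    std::string str;
    DWORD id;
};

// Next free ID for a container that currently holds n entries.
inline DWORD nextId(std::size_t n) {
    assert(n <= std::numeric_limits<DWORD>::max());
    return static_cast<DWORD>(n);
}

// Sorted-list variant: every call walks the list from the front to find
// the insertion point, so one call is O(n) and n inserts are O(n^2).
// This is the "Shlemiel the painter" pattern.
DWORD internSortedList(std::list<StringItem>& items, const std::string& s) {
    auto it = items.begin();
    while (it != items.end() && it->str < s) ++it;  // linear scan
    if (it != items.end() && it->str == s) return it->id;
    const DWORD id = nextId(items.size());
    items.insert(it, StringItem{s, id});
    return id;
}

// Map variant: std::map is an ordered associative container with
// logarithmic lookup and insertion, so per-string cost grows only
// slowly with the number of distinct strings.
DWORD internMap(std::map<std::string, DWORD>& items, const std::string& s) {
    // insert() leaves an existing entry untouched and returns its iterator.
    auto result = items.insert({s, nextId(items.size())});
    return result.first->second;
}
```

Both functions return the same ID for the same sequence of input strings; only the cost per call differs. With the list variant the time per token grows with the number of distinct strings already seen, which is consistent with throughput falling as indexing progresses.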

mukau 2020-02-14

changed /ddc/trunk/Changes
changed /ddc/trunk/configure.ac
changed /ddc/trunk/doc/ddc_cfg.5
changed /ddc/trunk/doc/ddc_files.5
changed /ddc/trunk/doc/ddc_opt.5
changed /ddc/trunk/doc/ddc_proto.5
changed /ddc/trunk/doc/ddc_query.5
changed /ddc/trunk/doc/ddc_server.opt.5
changed /ddc/trunk/doc/ddc_tabs.5
changed /ddc/trunk/src/ConcordLib/FreeBiblIndex.cpp
changed /ddc/trunk/src/ConcordLib/FreeBiblIndex.h