| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| packaging upgrades, faster language id, bug fixes source code.tar.gz | 2023-04-02 | 410.7 kB | |
| packaging upgrades, faster language id, bug fixes source code.zip | 2023-04-02 | 510.3 kB | |
| README.md | 2023-04-02 | 1.5 kB | |
| Totals: 3 Items | | 922.5 kB | 0 |
Took a (longer than expected) break from NLP, so this release is mostly just maintenance and bug fixes — but in anticipation of more interesting updates to come.
- upgraded built-in language identification model (PR [#375])
  - replaced v2 thinc/cld3 model with v3 floret/fasttext model, which has much faster predictions and comparable but more consistent performance
- modernized and improved Python packaging for faster, simpler installation and testing (PR [#368] and [#369])
  - all package metadata and configuration moved into a single `pyproject.toml` file
  - code formatting and linting updated to use `ruff` plus newer versions of `mypy` and `black`, and their use in GitHub Actions CI has been consolidated
  - bumped supported Python versions range from 3.8–3.10 to 3.9–3.11 (PR [#369])
  - added full CI testing matrix for PY 3.9/3.10/3.11 x Linux/macOS/Windows, and removed extraneous AppVeyor integration
- updated and improved type hints throughout, reducing number of `mypy` complaints by ~80% (PR [#372])
Fixed
- fixed ReDoS bugs in regex patterns (PR [#371])
- fixed breaking API issues with newer networkx/scikit-learn versions (PR [#367])
- improved dev workflow documentation and code to better incorporate language data (PR [#363])
- updated caching code with a fix from upstream pysize library, which was preventing Russian-language spaCy model from loading properly (PR [#358])
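
For readers unfamiliar with the ReDoS item above, here is a rough sketch of the general problem class. The patterns are hypothetical, chosen only for illustration; they are not the ones changed in the PR. A nested quantifier can force a backtracking regex engine to try exponentially many ways of splitting the input before it reports a failed match, while an unambiguous rewrite accepts the same strings in roughly linear time.

```python
import re

# Hypothetical patterns, for illustration only (not taken from the library).
# Nested quantifier: a run of word characters can be divided between the
# inner `\w+` and the outer `(...)+` in exponentially many ways, and the
# engine tries them all when the overall match fails.
VULNERABLE = re.compile(r"(\w+\s?)+")

# Same language without the ambiguity: words separated by exactly one
# whitespace character, with an optional trailing whitespace character.
SAFER = re.compile(r"\w+(?:\s\w+)*\s?")


def same_verdict(text: str) -> bool:
    """Return True if both patterns agree on whether `text` fully matches."""
    return bool(VULNERABLE.fullmatch(text)) == bool(SAFER.fullmatch(text))


# Short inputs only: the vulnerable pattern is deliberately never run on a
# long non-matching string, since that is exactly the slow case.
samples = ["hello world", "hello world ", "hello  world", "", "hello!"]
agreement = [same_verdict(s) for s in samples]

# The rewritten pattern rejects a long adversarial input without blowing up.
long_input = "a" * 50_000 + "!"
safe_result = SAFER.fullmatch(long_input)
```

Removing the ambiguity in the pattern is one common remedy; limiting input length before matching is another, and which one suits a given pattern depends on what it needs to accept.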
Contributors
Big thanks to @jonwiggins, @Hironsan, and @kevinbackhouse for the fixes!