0.13.0

Took a (longer than expected) break from NLP, so this release is mostly just maintenance and bug fixes — but in anticipation of more interesting updates to come.

  • upgraded built-in language identification model (PR [#375])
      • replaced v2 thinc/cld3 model with v3 floret/fasttext model, which has much faster predictions and comparable but more consistent performance
  • modernized and improved Python packaging for faster, simpler installation and testing (PR [#368] and [#369])
      • all package metadata and configuration moved into a single pyproject.toml file
      • code formatting and linting updated to use ruff plus newer versions of mypy and black, and their use in GitHub Actions CI has been consolidated
  • bumped supported Python versions range from 3.8–3.10 to 3.9–3.11 (PR [#369])
      • added full CI testing matrix for Python 3.9/3.10/3.11 x Linux/macOS/Windows, and removed extraneous AppVeyor integration
  • updated and improved type hints throughout, reducing number of mypy complaints by ~80% (PR [#372])
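
To make the size of the CI testing matrix concrete, here is a minimal sketch that enumerates the Python version and operating system combinations listed above. The version and OS names come from the release note; the function name and the printed job-name format are hypothetical and are not taken from the project's actual workflow files.

```python
from itertools import product

# Values taken from the release note; everything else here is illustrative.
PYTHON_VERSIONS = ("3.9", "3.10", "3.11")
OPERATING_SYSTEMS = ("Linux", "macOS", "Windows")


def ci_matrix() -> list[tuple[str, str]]:
    """Return every (python_version, os_name) pair in the testing matrix."""
    return list(product(PYTHON_VERSIONS, OPERATING_SYSTEMS))


if __name__ == "__main__":
    # Hypothetical job-name format, used only for display.
    for py, os_name in ci_matrix():
        print(f"py{py}-{os_name}")
```

Three Python versions crossed with three operating systems gives nine jobs per CI run.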

Fixed

  • fixed ReDoS bugs in regex patterns (PR [#371])
  • fixed breaking API issues with newer networkx/scikit-learn versions (PR [#367])
  • improved dev workflow documentation and code to better incorporate language data (PR [#363])
  • updated caching code with a fix from the upstream pysize library; the bug was preventing Russian-language spaCy models from loading properly (PR [#358])
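
For readers unfamiliar with ReDoS (regular expression denial of service), the sketch below shows the general problem using only the standard library. Both patterns are hypothetical and chosen for illustration; they are not the patterns changed in PR #371. The first uses nested quantifiers, which lets a backtracking engine try exponentially many ways to split a run of characters before failing. The second accepts the same strings without that ambiguity.

```python
import re
import time

# Hypothetical patterns for illustration only; not the ones fixed in PR #371.
# Nested quantifiers: a run of "a"s can be split into groups in many ways,
# so a failing match forces the engine to backtrack through all of them.
VULNERABLE = re.compile(r"^(a+)+$")
# Accepts the same strings (one or more "a"s) with a single way to match.
SAFE = re.compile(r"^a+$")


def time_match(pattern: re.Pattern, text: str) -> float:
    """Return the seconds taken by one match attempt of pattern on text."""
    start = time.perf_counter()
    pattern.match(text)
    return time.perf_counter() - start


if __name__ == "__main__":
    # The trailing "!" makes every attempt fail, which triggers backtracking.
    # Input lengths are kept small so the vulnerable pattern still finishes.
    for n in (10, 14, 18):
        text = "a" * n + "!"
        slow = time_match(VULNERABLE, text)
        fast = time_match(SAFE, text)
        print(f"n={n}: vulnerable={slow:.6f}s safe={fast:.6f}s")
```

Running this should show the vulnerable pattern's time growing sharply with each increase in input length while the safe pattern stays roughly flat. Exact timings depend on the machine and Python version.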

Contributors

Big thanks to @jonwiggins, @Hironsan, and @kevinbackhouse for the fixes!

Source: README.md, updated 2023-04-02